Weird Strtok Issue



int search(char* keyw)
{
	char* 				tkn;
	int 				tokCnt = 0;
	unsigned char* 		result;
	char 				delims[] = " ,.:;_\n\r\t*-=()";

	/* Tokenizes the keyw parameter to search; strtok modifies keyw in place */
	tkn = strtok(keyw, delims);
	while (tkn != NULL) {					// tokenizes keyw and adds tokens to vector kWords
		printf("tkn %s: \n", tkn);
		kWords.push_back(tkn);
		tkn = strtok(NULL, delims);			// NULL continues scanning the same string
		tokCnt++;
	}

When I run "./client --search hello sick papa dont " in the terminal, it doesn't print any tokens, even though code I haven't shown does read "hello".
Output:

./client --search hello sick papa dont
tkn:


JLBorges (13770)

C++:

#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <iterator>

std::vector<std::string> tokenize( std::string keyw )
{
    const std::string delims = ",.:;_\n\r\t*-=()" ;

    // replace each delimiter with a space
    for( char& c : keyw ) if( delims.find(c) != std::string::npos ) c = ' ' ;

    // construct an input stream which reads from the string
    std::istringstream stm(keyw) ;

    // read whitespace separated tokens from the stream into a vector and return it
    return { std::istream_iterator<std::string>(stm),
             std::istream_iterator<std::string>() } ;
}

int main()
{
    for( const auto& s : tokenize( "./client --search hello sick papa dont " ) )
        std::cout << s << '\n' ;
}

http://ideone.com/05pee2

knowNothing (65)

Thank you.
Is there something less 'heavy'?

Your solution uses vectors.

JLBorges (13770)

> Your solution uses vectors.

What is wrong with using vectors?

And what was this? kWords.push_back(tkn); Isn't kWords a sequence container?

knowNothing (65)

Yeah, but that vector will be used for very important things. It is necessary to deal with a lot of information.

There is nothing wrong with them in general. But, I am looking for something less heavy resource wise for this "trivial" tokenization. I am looking for something light weight that would get the job done.

Thank you for the help so far, don't get me wrong.

JLBorges (13770)

> I am looking for something less heavy resource wise for this "trivial" tokenization.
> I am looking for something light weight that would get the job done.

If a one-time scan of the strings for tokens is all that is needed, and the tokens need not be extracted and stored in a sequence container for later use, it is hard to beat boost::tokenizer in either time or space.
http://www.boost.org/doc/libs/1_53_0/libs/tokenizer/char_separator.htm

std::vector<std::string> is moveable and does not incur an 'unnecessary-copy' performance penalty.
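
A small sketch of what "moveable" means here, assuming C++11 or later; move_keeps_buffer is a hypothetical name used only for illustration. Moving a vector hands over its heap buffer instead of copying the strings inside it.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reports whether a moved-to vector reuses the
// buffer of the vector it was moved from (no per-element copies).
bool move_keeps_buffer()
{
    std::vector<std::string> a{ "hello", "sick", "papa", "dont" };
    const std::string* before = a.data();        // address of the first element
    std::vector<std::string> b = std::move(a);   // move construction
    return b.data() == before && b.size() == 4;
}
```

Returning a local std::vector<std::string> from a function, as tokenize does above, is moved or elided in the same way, so the tokens are not copied a second time.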

If the parsed tokens need to be extracted and stored, using boost::split to parse the tokens into a vector of references (iterator_range) would be a high performance option.
(The references would be to positions in the original string; that string needs to be kept).
http://www.boost.org/doc/libs/1_53_0/doc/html/string_algo/usage.html#idp163440592


knowNothing (65)

Thank you. I don't know much about C++, but you have given me good advice.
I will use your first option and not boost.

Thank you again.
