int search(char* keyw)
{
    char* tkn;
    int tokCnt = 0;
    unsigned char* result;
    char delims[] = " ,.:;_\n\r\t*-=()";

    /* Tokenizes the keyw parameter to search */
    //tkn = strtok(keyw, ",");
    tkn = strtok(keyw, "' ' ,.:;_\n\r\t*-=()");
    while (tkn != NULL) { // tokenizes keyw and adds each token to the vector kWords
        printf("tkn %s: \n", tkn);
        kWords.push_back(tkn);
        //tkn = strtok(NULL, ",");
        tkn = strtok(NULL, "' ' ,.:;_\n\r\t*-=()");
        tokCnt++;
    }
When I run "./client --search hello sick papa dont " in the terminal, it doesn't print anything, even though "hello" is read by code I haven't shown.
Output: (empty)
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <iterator>

std::vector<std::string> tokenize( std::string keyw )
{
    const std::string delims = ",.:;_\n\r\t*-=()" ;

    // replace each delimiter with a space
    for( char& c : keyw ) if( delims.find(c) != std::string::npos ) c = ' ' ;

    // construct an input stream which reads from the string
    std::istringstream stm(keyw) ;

    // read whitespace-separated tokens from the stream into a vector and return it
    return { std::istream_iterator<std::string>(stm),
             std::istream_iterator<std::string>() } ;
}

int main()
{
    for( const auto& s : tokenize( "./client --search hello sick papa dont " ) )
        std::cout << s << '\n' ;
}
Yeah, but that vector will be used for very important things; it has to handle a lot of information.
There is nothing wrong with those approaches in general, but I am looking for something less resource-heavy for this "trivial" tokenization, something lightweight that would get the job done.
Thank you for the help so far, don't get me wrong.
> I am looking for something less resource-heavy for this "trivial" tokenization.
> I am looking for something lightweight that would get the job done.
If a one-time scan of the strings for tokens is all that is needed, and the tokens need not be extracted and stored in a sequence container for later use, it is hard to beat boost::tokenizer in either time or space. http://www.boost.org/doc/libs/1_53_0/libs/tokenizer/char_separator.htm
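A minimal sketch of what that could look like, assuming the same delimiter set as the strtok version above (the input string here is just the example from earlier):

#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>

int main()
{
    const std::string text = "./client --search hello sick papa dont " ;

    // treat the same characters as the strtok version as separators
    boost::char_separator<char> sep( " ,.:;_\n\r\t*-=()" ) ;
    boost::tokenizer< boost::char_separator<char> > tok( text, sep ) ;

    // tokens are produced on demand as the iterator advances;
    // only the current token is held at any time
    for( const auto& t : tok ) std::cout << t << '\n' ;
}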
std::vector<std::string> is moveable and does not incur an 'unnecessary-copy' performance penalty.
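For example, assigning the result of the tokenize() function above to a local vector moves it rather than copying it:

// the returned vector is moved into 'words' (or the move is elided entirely);
// the tokens themselves are not copied again
std::vector<std::string> words = tokenize( "./client --search hello sick papa dont " ) ;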
If the parsed tokens need to be extracted and stored, using boost::split to parse the tokens into a vector of references (iterator_range) would be a high-performance option.
(The references would point to positions in the original string; that string needs to be kept alive.) http://www.boost.org/doc/libs/1_53_0/doc/html/string_algo/usage.html#idp163440592
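A rough sketch of that idea, again assuming the delimiter set from the earlier code; the iterator_range elements point into text, so text must outlive them:

#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>
#include <boost/range/iterator_range.hpp>

int main()
{
    std::string text = "./client --search hello sick papa dont " ;

    // each element refers to a substring of 'text'; no token is copied
    std::vector< boost::iterator_range<std::string::iterator> > tokens ;
    boost::split( tokens, text, boost::is_any_of( " ,.:;_\n\r\t*-=()" ),
                  boost::token_compress_on ) ; // merge runs of adjacent separators

    for( const auto& r : tokens )
        std::cout << std::string( r.begin(), r.end() ) << '\n' ;
}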