Tokenizing a string
Mar 3, 2014 at 3:52am UTC
I have this problem (homework):
Write a program that receives a string, tokenizes it using ' ' as the delimiter, and shows the output, using the STL.
So I wrote:
#include <stack>
#include <iostream>
#include <vector>
#include <string>
using std::stack;
using std::cout;
using std::cin;
using std::vector;
using std::string;
using std::getline;
vector<string> split(string str)
{
    vector<string> ret;
    if (str.find(' ') == string::npos) ret.push_back(str);
    else
    {
        for (unsigned i = 0; i < str.size(); i++)
        {
            if (str[i] != ' ')
            {
                string word = str.substr(i, str.find(' ', i));
                ret.push_back(word);
            }
        }
    }
    return ret;
}
int main()
{
    string input;
    std::cout << "Type any string and I'll say its proprieties:\n";
    while (true)
    {
        cout << "> ";
        getline(cin, input);
        std::cout << "\nString length: " << input.size() << "\nWords number: " << split(input).size() << "\n";
        for (unsigned i = 0; i < split(input).size(); i++)
        {
            std::cout << "Word " << i + 1 << ": (split(input)[" << i << "])\n" << split(input)[i] << std::endl;
        }
    }
}
And the output is:
Type any string and I'll say its proprieties:
> A B C
String length: 5
Words number: 3
Word 1: (split(input)[0])
A
Word 2: (split(input)[1])
B C
Word 3: (split(input)[2])
C
>
What am I doing wrong? Please explain the problem, and if you are going to use "iterators" (I came across them while googling), please explain those too.
Mar 3, 2014 at 4:13am UTC
Using a string stream is perhaps easier.
#include <iostream>
#include <vector>
#include <string>
#include <sstream>
std::vector<std::string> split( std::string str, char sep = ' ' )
{
    std::vector<std::string> ret ;
    std::istringstream stm(str) ;
    std::string token ;
    while ( std::getline( stm, token, sep ) ) ret.push_back(token) ;
    return ret ;
}
Mar 3, 2014 at 9:29am UTC
What is token?
Mar 3, 2014 at 10:01am UTC
In this case, on each iteration of the loop, std::getline reads characters from the stream into 'token' up to (but not including) the next instance of 'sep', so 'token' holds each substring in turn.