Tokenizing a string

Mar 3, 2014 at 3:52am
I have this problem (homework):
Write a program that receives a string, tokenizes it on the ' ' delimiter, and shows the output, using the STL.

So I wrote:
#include <stack>
#include <iostream>
#include <vector>
#include <string>
using std::stack;
using std::cout;
using std::cin;
using std::vector;
using std::string;
using std::getline;

vector<string> split(string str)
{
	vector<string> ret;
	if (str.find(' ') == string::npos) ret.push_back(str);
	else
	{
		for (unsigned i = 0; i < str.size(); i++)
		{
			if (str[i] != ' ')
			{
				string word = str.substr(i, str.find(' ', i));
				ret.push_back(word);
			}
		}
	}

	return ret;
}

int main()
{
	string input;
	std::cout << "Type any string and I'll say its proprieties:\n";
	while (true)
	{
		cout << "> ";
		getline(cin, input);
		std::cout << "\nString length: " << input.size() << "\nWords number: " << split(input).size() << "\n";
		for (unsigned i = 0; i < split(input).size(); i++)
		{
			std::cout << "Word " << i + 1 << ": (split(input)[" << i << "])\n" << split(input)[i] << std::endl;

		}
	}
}


And...
Type any string and I'll say its proprieties:
> A B C

String length: 5
Words number: 3
Word 1: (split(input)[0])
A
Word 2: (split(input)[1])
B C
Word 3: (split(input)[2])
C
> 


What am I doing wrong? Please explain the problem, and (I saw this while googling) if you are going to use "iterators", please explain those too.
Mar 3, 2014 at 4:13am
Using a string stream is perhaps easier.

#include <iostream>
#include <vector>
#include <string>
#include <sstream>

std::vector<std::string> split( std::string str, char sep = ' ' )
{
	std::vector<std::string> ret ;

	std::istringstream stm(str) ;
	std::string token ;
	while( std::getline( stm, token, sep ) ) ret.push_back(token) ;

	return ret ;
}
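
As for what goes wrong in the original loop: the second argument of std::string::substr is a character count, not an end position, so substr(i, str.find(' ', i)) takes too many characters once i is greater than 0 (hence "B C"). The loop also advances one character at a time instead of jumping past the word it just stored, so longer words would be pushed more than once. Here is a minimal sketch of a corrected find/substr version. The name split_fixed is made up for illustration, and it assumes that runs of spaces should count as a single separator:

```cpp
#include <string>
#include <vector>

// Hypothetical corrected version of the original find/substr loop.
// Assumption: consecutive separators are skipped, so no empty words are stored.
std::vector<std::string> split_fixed(const std::string& str, char sep = ' ')
{
    std::vector<std::string> ret;

    // Position of the first character that is not a separator.
    std::string::size_type start = str.find_first_not_of(sep);
    while (start != std::string::npos)
    {
        // Position of the separator that ends this word (npos for the last word).
        std::string::size_type end = str.find(sep, start);

        // substr takes (position, count), so pass the word length, not 'end'.
        std::string::size_type count =
            (end == std::string::npos) ? std::string::npos : end - start;
        ret.push_back(str.substr(start, count));

        // Jump past the stored word to the start of the next one.
        start = str.find_first_not_of(sep, end);
    }

    return ret;
}
```

Calling split_fixed("A B C") should give the three words "A", "B" and "C". It is also worth calling split once in main and storing the result in a vector, rather than calling it again on every loop iteration.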
Mar 3, 2014 at 9:29am
What is token?
Mar 3, 2014 at 10:01am
In this case, on each iteration of the loop, std::getline reads characters from the stream into 'token' until it reaches the next instance of 'sep' (or the end of the string), so 'token' holds one substring at a time.
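
To see this, here is a small sketch that records what 'token' holds after each std::getline call. The helper name trace_tokens is made up for illustration; the brackets are only there to make empty tokens visible:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: stores the value of 'token' after each getline call,
// wrapped in brackets so that empty tokens can be seen.
std::vector<std::string> trace_tokens(const std::string& str, char sep = ' ')
{
    std::vector<std::string> seen;

    std::istringstream stm(str);
    std::string token;
    // Each successful getline overwrites 'token' with the text before the next 'sep'.
    while (std::getline(stm, token, sep))
        seen.push_back("[" + token + "]");

    return seen;
}
```

For the input "A B  C" (two spaces before the C) this gives [A], [B], [] and [C]: with std::getline, two separators in a row produce an empty token. If that is not wanted, skip tokens for which token.empty() is true.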