Separate column values from a string

Hi everyone.
I read lines from a text file with getline().
Each line is a string like: 12 345 543 65

Is there a way to extract a single value as a substring, for example "345" (the second column), skipping the blanks or tabs between values?
Just like
cat mySTring | awk '{print $2}'
in bash scripting?

Thanks in advance.
closed account (zb0S216C)
You can try splitting your string into tokens with strtok( )[1].

References:
[1]http://www.cplusplus.com/reference/clibrary/cstring/strtok/


Wazzak
strtok() is not properly C++ oriented...other solutions?
strtok() is not properly C++ oriented

That makes no sense. What is your issue with using strtok?
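
For anyone weighing the strtok() suggestion, here is a minimal sketch of how it could be used from C++, assuming the line is already in a std::string. The helper name split_with_strtok is made up for illustration. One practical caveat: strtok() writes into the buffer it scans and keeps internal state between calls, so the sketch works on a private copy of the line.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits `line` on blanks and
// tabs using std::strtok and returns the tokens as strings.
std::vector<std::string> split_with_strtok(const std::string& line)
{
    // std::strtok overwrites delimiters with '\0', so operate on a
    // mutable, null-terminated copy rather than on the caller's string.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    // The first call receives the buffer; later calls pass nullptr to continue.
    for (char* token = std::strtok(buffer.data(), " \t");
         token != nullptr;
         token = std::strtok(nullptr, " \t")) {
        tokens.push_back(token);
    }
    return tokens;
}
```

With the line "12 345 543 65", the element at index 1 of the returned vector is "345". Runs of blanks or tabs do not produce empty tokens, because strtok() skips consecutive delimiters.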
#include <boost/algorithm/string.hpp> // boost::split, boost::is_space
//...
std::string str = "12 34 56 78";

std::vector<std::string> splitted;
boost::split(splitted, str, boost::is_space() );


http://www.boost.org/doc/libs/1_46_1/doc/html/string_algo/usage.html#id2728530
Thanks R0mai, that's nice!
#include <boost/regex.hpp>
#include <boost/algorithm/string_regex.hpp>
//...
vector<string> splitted;
boost::split(splitted, "  12   43 67 8 9876", is_space() );

cout << "Split: "<< splitted.at(0) << endl;
cout << "Split: "<< splitted.at(1) << endl;
cout << "Split: "<< splitted.at(2) << endl;
cout << "Split: "<< splitted.at(3) << endl;
cout << "Split: "<< splitted.at(4) << endl;
cout << "Split: "<< splitted.at(5) << endl;
cout << "Split: "<< splitted.at(6) << endl;
//... 


Split: 
Split: 
Split: 12
Split: 
Split: 
Split: 43
Split: 67


But how can I avoid all those empty elements (produced by the blanks) in the vector?

I would like to use the boost trim() and split() functions to emulate the awk syntax shown above, but I don't know how to use them, even after reading the documentation. Can anyone help me?
I didn't know about split, I use boost::tokenizer, but split looks simpler, so that's pretty neat.

In terms of getting rid of the extra spaces you have many options. Off the top of my head:

1) Use a regex to replace "  " (2 spaces) with " " (1 space) until there are no more matches
2) If it's all numbers and whitespace, use an istringstream instead of split
3) Remove all the empty elements of the vector

I would probably go with option 2, but it does depend on your input.

If you prefer 3, the simplest implementation I can think of would be:

std::string blank = "";
// std::remove (from <algorithm>) only shifts the kept elements forward; erase() drops the leftover tail
splitted.erase(std::remove(splitted.begin(), splitted.end(), blank), splitted.end());
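
Option 1 from the list above could look roughly like this. It is only a sketch under a few assumptions: it uses C++11 std::regex rather than a Boost regex, it matches any run of whitespace with \s+ so that one pass is enough (instead of repeatedly replacing two spaces), and the helper name collapse_whitespace is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): replaces every run of
// whitespace with a single space, then strips the one space that may
// remain at the beginning or at the end.
std::string collapse_whitespace(const std::string& input)
{
    static const std::regex runs("\\s+");   // one or more blanks, tabs, ...
    static const std::regex edges("^ | $"); // a single leading or trailing space

    const std::string single = std::regex_replace(input, runs, " ");
    return std::regex_replace(single, edges, "");
}
```

After this step the string "  12   43 67 8 9876" becomes "12 43 67 8 9876", so splitting it on whitespace no longer yields empty elements.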
Thanks kev82.

Could you please give me simple example implementations of options 1 and 2?

Thank you so much!
#include <iostream>
#include <sstream>
#include <string>
//...
std::string str = "12 34 56 78";
std::stringstream tokenizer( str );
int n;
while( tokenizer >> n ) std::cout << n << std::endl;
Thank you moorecm, this is exactly what I need!

It also works if there are random blanks and tabs between values:
#include <iostream>
#include <string>
#include <sstream>

using namespace std;

int main()
{
	string str = "    1     2    34		56	78    ";
	stringstream tokenizer( str );
	int n;
	while( tokenizer >> n )
		cout << "Token: " << n << endl;

	return 0;
}

Token: 1
Token: 2
Token: 34
Token: 56
Token: 78


This is really C++ oriented!
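
To get a single column back as a substring, as the awk one-liner in the first post does, the same stream extraction can read into a std::string instead of an int. A minimal sketch follows; the helper name column and its 1-based index are assumptions chosen to mirror awk's $2, not something defined earlier in the thread.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): returns the column at the
// 1-based position `index` of `line`, or an empty string if the line has
// fewer columns. Blanks and tabs between values are skipped by operator>>.
std::string column(const std::string& line, std::size_t index)
{
    std::istringstream tokenizer(line);
    std::string token;
    for (std::size_t i = 0; i < index; ++i) {
        if (!(tokenizer >> token)) {
            return std::string(); // ran out of columns
        }
    }
    return token;
}
```

For example, column("12 345 543 65", 2) returns "345", and asking for a column that does not exist returns an empty string.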