What is a good way to process this string?

Hi.

When you need to split the following string and store the pieces in a struct, how would you split it?

// string example
// This string is from a text file.
{Name:Jhon, Age:20, Sex:Male}, {Name:Kyle, Age:25, Sex:Male}

struct PersonInfo
{
  string Name;
  int Age;
  string Sex;
};


1. Get the string between '{' and '}' and store each one in an array such as a vector.
-> vec[0] == "Name:Jhon, Age:20, Sex:Male"

2. From here, there are two choices.
2-1. Get the string between the attribute name and the comma, e.g. between "Name:" and ',' or '\0' (for the last attribute).

2-2. Get the string between just ':' and ',' or '\0'.

If the text file always follows the same attribute order, 2-2 works well.

But if it doesn't, you can't expect a good result.

Should I take this problem into account and choose 2-1?

Or if there is a better way, please let me know!
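
For reference, steps 1 and 2-1 could be sketched roughly as below. This is only a minimal sketch under a few assumptions: the helper names split_records, field_value and to_person are made up for illustration, values never contain ',' or '}', and no attribute name ends with another attribute's name (find("Name:") would also match inside something like "Surname:").

```cpp
#include <string>
#include <vector>

struct PersonInfo
{
	std::string Name;
	int Age {};
	std::string Sex;
};

// Step 1 (hypothetical helper): collect the text between each '{' and the next '}'.
std::vector<std::string> split_records(const std::string& text)
{
	std::vector<std::string> records;
	std::size_t open {};

	while ((open = text.find('{', open)) != std::string::npos) {
		const auto close {text.find('}', open)};
		if (close == std::string::npos)
			break;	// unterminated record: stop here
		records.push_back(text.substr(open + 1, close - open - 1));
		open = close + 1;
	}
	return records;
}

// Step 2-1 (hypothetical helper): return the text between "key:" and the next ','
// or the end of the record. Returns an empty string if the key is absent.
std::string field_value(const std::string& record, const std::string& key)
{
	const std::string label {key + ':'};
	const auto start {record.find(label)};
	if (start == std::string::npos)
		return {};
	const auto from {start + label.size()};
	const auto comma {record.find(',', from)};
	return record.substr(from, comma == std::string::npos ? comma : comma - from);
}

// Fill the struct by attribute name, so the order inside the record does not matter.
PersonInfo to_person(const std::string& record)
{
	PersonInfo pi;
	pi.Name = field_value(record, "Name");
	pi.Sex = field_value(record, "Sex");
	const auto age {field_value(record, "Age")};
	if (!age.empty())
		pi.Age = std::stoi(age);	// throws if the text is not a number
	return pi;
}
```

Calling to_person on each element returned by split_records gives one PersonInfo per record. Because every lookup is by name, a record written as {Age:25, Name:Kyle, Sex:Male} is handled the same way as one in the original order.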
The data looks almost (but not quite) JSON.
Perhaps JSON-parsers could be used/learned from: https://linuxhint.com/parse-json-data-c/
@keskiverto

Thank you for your answer, but it's not the answer I expected. My purpose is just studying C++ and preparing for a test.
Perhaps, assuming the data is always present and in the same order:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <limits>

struct PersonInfo
{
	std::string Name;
	int Age {};
	std::string Sex;
};

int main()
{
	const std::string inp {"{Name:Jhon, Age:20, Sex:Male}, {Name:Kyle, Age:25, Sex:Male} , {Name:foo bar, Age:32, Sex:Female}"};

	std::istringstream iss(inp);
	std::vector<PersonInfo> vpi;

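	// Split the input on '}' so each chunk holds one record (plus any leading ", {").
	for (std::string elem; getline(iss, elem, '}'); ) {
		PersonInfo pi;
		// Keep only the text after the last '{' in this chunk.
		const std::string sv {elem.substr(elem.rfind('{') + 1)};
		std::istringstream iel(sv);
		std::string tmp;

		getline(iel, pi.Name, ':');	// skip the "Name" label
		getline(iel, pi.Name, ',');	// read the name value
		getline(iel, tmp, ':');		// skip the " Age" label
		iel >> pi.Age;
		iel.ignore(std::numeric_limits<std::streamsize>::max(), ',');	// skip past the comma after the age
		getline(iel, pi.Sex, ':');	// skip the " Sex" label
		getline(iel, pi.Sex, ',');	// read the value (runs to the end of the record)

		vpi.push_back(pi);
	}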

	for (const auto& [name, age, sex] : vpi)
		std::cout << name << ' ' << age << ' ' << sex << '\n';
}



Jhon 20 Male
Kyle 25 Male
foo bar 32 Female

@seeplus

Thank you! So maybe I need to assume that the data is ordered. By the way, why did you use rfind instead of just find?
You don't need to assume the data is always in the same order, it just makes parsing more complicated because you would need to split the string on ':' and see if the prefix is Name, Age, or Sex and handle each case accordingly.
why did you use rfind instead of just find?


Because getline() splits on '}', and I then want to find the corresponding '{', which means searching backwards (reverse find).
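
To make that concrete: with the sample input, the second chunk getline() produces is ", {Name:Kyle, Age:25, Sex:Male", so the '{' is not at index 0. A small sketch of that one step follows (record_body is a hypothetical name used only here). With exactly one '{' per chunk, find and rfind return the same index; they differ only when a chunk contains an earlier stray '{', in which case rfind picks the one nearest the closing '}'.

```cpp
#include <string>

// Hypothetical helper: given one chunk produced by getline(..., '}'),
// return the text after the last '{' in that chunk.
std::string record_body(const std::string& chunk)
{
	// rfind returns npos when there is no '{'. npos + 1 wraps around to 0,
	// so substr then returns the whole chunk unchanged.
	return chunk.substr(chunk.rfind('{') + 1);
}
```

For the chunk above this returns "Name:Kyle, Age:25, Sex:Male".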

so maybe I need to assume that data is ordered


The code above assumes that and also that all the data is always present. If this is not the case then as Ganado says above, parsing then becomes more complicated. What I posted above was my simple way of dealing with the data as shown.

If the data isn't in the same order, then something like:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <limits>

struct PersonInfo
{
	std::string Name;
	int Age {};
	std::string Sex;
};

int main()
{
	const std::string inp {"{Name:Jhon, Age:20, Sex:Male}, {Name:Kyle, Age:25, Sex:Male} , {Name:foo bar, Age:32, Sex:Female}"};

	std::istringstream iss(inp);
	std::vector<PersonInfo> vpi;

	for (std::string elem; getline(iss, elem, '}'); ) {
		PersonInfo pi;
		const std::string sv {elem.substr(elem.rfind('{') + 1)};
		std::istringstream iel(sv);
		std::string tmp;

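		while (getline(iel, tmp, ':')) {
			// Strip leading spaces from the key; an empty or all-space key becomes "".
			const auto first {tmp.find_first_not_of(' ')};
			const auto type {first == std::string::npos ? std::string {} : tmp.substr(first)};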

			if (type == "Name")
				getline(iel, pi.Name, ',');
			else if (type == "Age") {
				iel >> pi.Age;
				iel.ignore(std::numeric_limits<std::streamsize>::max(), ',');
			} else if (type == "Sex")
				getline(iel, pi.Sex, ',');
			else
				getline(iel, tmp, ',');
		}
		vpi.push_back(pi);
	}

	for (const auto& [name, age, sex] : vpi)
		std::cout << name << ' ' << age << ' ' << sex << '\n';
}
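
One edge case worth knowing in this kind of parsing: s.substr(s.find_first_not_of(' ')) throws std::out_of_range when s is empty or contains only spaces, because find_first_not_of then returns npos. That can happen with input such as a record ending in ", ". A small trimming helper avoids it and also removes trailing spaces. This is just a sketch, and trim is a hypothetical name, not something from the standard library.

```cpp
#include <string>

// Hypothetical helper: strip leading and trailing spaces from a key or value.
// Returns an empty string when the input is empty or contains only spaces.
std::string trim(const std::string& s)
{
	const auto first {s.find_first_not_of(' ')};
	if (first == std::string::npos)
		return {};
	const auto last {s.find_last_not_of(' ')};
	return s.substr(first, last - first + 1);
}
```

The key comparison could then be written as trim(tmp) == "Name", and an empty result can simply be skipped.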

Thank you guys! It helped me a lot!