Reading a string from text file

May 30, 2018 at 7:11pm
Hi everyone!
I have a problem working with strings in a text file.
For example, I have this text:
<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799590" cy="1799590"/><wp:extent cx="5486400" cy="3200400"/>

I'd like to read the text file from beginning to end and get the values of cx and cy.
How should I do that?
Thanks for your help!
Last edited on May 30, 2018 at 7:12pm
May 30, 2018 at 7:39pm
Tokenize on the angle brackets and then tokenize on the substring 'cx=' and 'cy='.
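A rough sketch of that two-pass idea, assuming each tag fits on one line and every attribute value is double-quoted. The names Extent, attributeValue and extractExtents are made up for illustration and are not from any library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one cx/cy pair (values kept as text).
struct Extent {
    std::string cx;
    std::string cy;
};

// Returns the quoted value that follows `key` (e.g. "cx=") inside one tag,
// or an empty string if the key or either quote is missing.
std::string attributeValue(const std::string& tag, const std::string& key)
{
    std::size_t pos = tag.find(key);
    if (pos == std::string::npos) return "";
    std::size_t open = tag.find('"', pos + key.size());
    if (open == std::string::npos) return "";
    std::size_t close = tag.find('"', open + 1);
    if (close == std::string::npos) return "";
    return tag.substr(open + 1, close - open - 1);
}

// First pass: split the line on '>' so each token holds one tag.
// Second pass: look for cx= and cy= inside each token.
std::vector<Extent> extractExtents(const std::string& line)
{
    std::vector<Extent> result;
    std::istringstream tags(line);
    std::string tag;
    while (std::getline(tags, tag, '>')) {
        std::string cx = attributeValue(tag, "cx=");
        std::string cy = attributeValue(tag, "cy=");
        if (!cx.empty() && !cy.empty())
            result.push_back({cx, cy});
    }
    return result;
}
```

Calling extractExtents on each line read with getline gives one Extent per wp:extent tag. A plain substring search like this would also match an attribute whose name merely ends in "cx", so it only suits input as regular as the sample.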
May 30, 2018 at 8:12pm
Ideally you'd prob want to use an XML parser library, then iterate over each node named "wp:extent" and extract the values of its "cx" and "cy" attributes.
May 30, 2018 at 8:45pm
I'm not very experienced with XML files so I copied this text to .txt file.
May 30, 2018 at 8:47pm
slepeckypes wrote:
I'm not very experienced with XML files so I copied this text to .txt file.
That's silly. Not only do lots of editors have XML syntax support, so that looking at it, the data seems organized and readable, but there are many perfectly good XML parsers out there, too, in almost any programming language. They'll extract any data you want in an instant.
May 30, 2018 at 11:10pm
Ok I'm a goof but I took a stab at it
This works if all the values are 7 digits.

#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main (int argc, char *argv[])
{
    if (argc != 2)
    {
        cout << "Incorrect arguments\n";
        return 1;
    }

    ifstream inputfile (argv[1]);
    if (!inputfile.is_open())
    {
        cout << "File not open" << endl;
        return 1;
    }

    cout << "CX" << "\tCY" << endl;

    string line;
    // Loop on getline itself so a failed read is never processed.
    while (getline (inputfile, line))
    {
        // Stop while i + 10 is still inside the line: 'c', 'x'/'y', '=', the
        // opening quote and 7 digits take 11 characters starting at i.
        for (size_t i = 0; i + 11 <= line.size(); i++)
        {
            // cx="1234567": the 7 digits start 4 characters after the 'c'.
            if (line.compare (i, 3, "cx=") == 0)
            {
                cout << line.substr (i + 4, 7);
            }

            if (line.compare (i, 3, "cy=") == 0)
            {
                cout << "\t" << line.substr (i + 4, 7) << endl;
            }
        }
    }

    return 0;
}


The output should look like this

C:\Temp>test123 cxy.txt
CX CY
1799590 1799590
5486400 3200400
1799591 1799591
5486401 3200401
1799592 1799592
5486402 3200402


Oh yea, my test file

<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799590" cy="1799590"/><wp:extent cx="5486400" cy="3200400"/>
<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799591" cy="1799591"/><wp:extent cx="5486401" cy="3200401"/>
<w:drawing><wp:inline distT="0" distB="0" distL="0" distR="0"><wp:extent cx="1799592" cy="1799592"/><wp:extent cx="5486402" cy="3200402"/>
Last edited on May 30, 2018 at 11:12pm
May 30, 2018 at 11:33pm
There are many powerful tools for searching for specific patterns with regular expressions. You can use 'awk', for example, which lets you do the task you are asking about and much more. You can learn more about it here: https://www.gnu.org/software/gawk/
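If you would rather stay in C++, the same pattern-matching idea can be sketched with std::regex (C++11). This is only an illustration, assuming cy always follows cx in the same tag, separated by whitespace, and that the values fit in a long. The function name findExtents is made up.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Returns every (cx, cy) pair found in `text`, in order of appearance.
// Assumes the attributes appear as cx="digits" cy="digits".
std::vector<std::pair<long, long>> findExtents(const std::string& text)
{
    // Group 1 captures the cx digits, group 2 the cy digits.
    static const std::regex pattern("cx=\"(\\d+)\"\\s+cy=\"(\\d+)\"");

    std::vector<std::pair<long, long>> result;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern); it != end; ++it) {
        // std::stol throws std::out_of_range if a value does not fit in a long.
        result.emplace_back(std::stol((*it)[1].str()), std::stol((*it)[2].str()));
    }
    return result;
}
```

Because the pattern matches one or more digits, it does not care whether a value has 6, 7 or 8 digits.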
Last edited on May 30, 2018 at 11:33pm
May 31, 2018 at 11:27am
This works if all the values are 7 digits.

Values are from 6 to 8 digits
May 31, 2018 at 4:47pm
Then just modify the code to get all digits between the quotes.

If the numbers you want always sit in the same columns of each line, another way would be to just grab that range of characters. Since I don't have your data file I can't say.
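Something like this would do for the first suggestion. It is only a sketch: valuesAfter is a made-up helper, and it assumes the key is passed together with its opening quote, for example cx=" for the cx values.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Collects the run of digits that follows every occurrence of `key`
// in `line`, however many digits each run has.
std::vector<std::string> valuesAfter(const std::string& line, const std::string& key)
{
    std::vector<std::string> values;
    std::size_t pos = 0;
    while ((pos = line.find(key, pos)) != std::string::npos) {
        pos += key.size(); // step over the key and its opening quote
        std::string digits;
        // Stop at the first non-digit, which is normally the closing quote.
        while (pos < line.size() &&
               std::isdigit(static_cast<unsigned char>(line[pos]))) {
            digits += line[pos];
            ++pos;
        }
        if (!digits.empty())
            values.push_back(digits);
    }
    return values;
}
```

Call it once with cx=" and once with cy=" for each line, then print the two lists side by side.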
Last edited on May 31, 2018 at 4:51pm
May 31, 2018 at 10:02pm
slepeckypes, don't be afraid of trying libraries. Trust me, step 1 is choosing an XML library. A lot of them are lightweight, too. For example, https://github.com/zeux/pugixml

You'll look back on this and be like, "oh, all i needed to do was include this one header file, compile my lib against that one, and write <10 lines ...?"
Jun 1, 2018 at 9:09am
I agree with icy1. An XML library is the easiest way to go. pugixml is easy to use and well documented.
However, it would require a complete, well-formed and valid XML document.
With the snippet you showed, every XML library will fail.
If you only have a snippet, it would be better to follow SamuelAdams's way.
Jun 1, 2018 at 4:52pm
I agree with everyone above. There are much better and faster ways, but given a one-line example, I hacked it. The first idea I came up with worked, so I left it there.
Last edited on Jun 1, 2018 at 4:54pm
Jun 1, 2018 at 10:24pm
This is what I would do with string operations: find cx=" and read every character that follows up to the closing quote, then do the same for cy.
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

void pause();

// Returns the text between the quotes of the next occurrence of key
// (for example "cx=\"") at or after position, and moves position to the
// closing " sign. Returns an empty string if the key or the closing " sign
// is not found.
string readValue(const string& text, const string& key, size_t& position)
{
	size_t start = text.find(key, position); //find next position of key in string
	if (start == string::npos)
		return "";

	start += key.size(); //skip the key, including the opening " sign

	size_t end = text.find('\"', start); //find the closing " sign
	if (end == string::npos)
		return "";

	position = end;
	return text.substr(start, end - start);
}


int main()
{
	ifstream file("readFromFile.txt");

	if (file.fail())
	{
		cout << "\nCould not open readFromFile.txt";
		return 1;
	}

	cout << "\nFile successfully opened!";

	string fileString;
	getline(file, fileString); //read first line from file

	size_t position = 0; //search for cx from the start of the string
	string cx1 = readValue(fileString, "cx=\"", position);
	string cx2 = readValue(fileString, "cx=\"", position);

	position = 0; //search for cy from the start of the string again
	string cy1 = readValue(fileString, "cy=\"", position);
	string cy2 = readValue(fileString, "cy=\"", position);

	cout << "\n\ncx1 = " << cx1
		 << "\ncy1 = " << cy1
		 << "\ncx2 = " << cx2
		 << "\ncy2 = " << cy2;

	pause();

	return 0;
}


void pause()
{
	std::cout << "\n\n\n Press enter to continue...";
	std::cin.sync();
	std::cin.get();
	std::cout << "\n\n";
}


Hope this helps,

Regards,

Hoogo;