Arrays

I have several text files containing columns of data. Some have three columns, some have four, and every column has a header. I need to break each text file into an array based on the columns, so that I can access the data (doubles) in each column. I will be using a regular expression to figure out which column has the data I need, based on its header.

Thanks in advance.
fstream file("...");

string temp;
getline(file, temp);//read the first line with headers

stringstream ss(headers);
int count = 0, index = -1;

for(; ss >> temp; count++)
  if(/*temp is the string you wanted*/) index = count;
//now count shows how many columns you have
//and index shows which one you want
//index = -1 indicates failure

double val;
for(int i = 0; i < index; i++) file >> val;//skip the first several columns

vector<double> values;//you can use an array if you know how long the lists are.
//You'll want to turn the following loop into a for loop in that case.

while(file >> val){//read the value you need; stop once input fails (end of file or a string instead of a number)
   values.push_back(val);//push the value into the vector only after a successful read.
   //if you want an array, the read and the push above would turn into
   //file >> values[counter]; where counter is the variable of the for loop I mentioned.

   for(int i = 0; i < count-1; i++) file >> val;//skip the remaining entries of this line and the leading entries of the next one.
}

Something like that. I'm assuming the file doesn't have empty entries. Those would make it more complicated.
hamsterman - Thank you. Here is what I have (based on your code above):

		while (!datFile.eof())
		{
			string temp;
			getline (datFile, temp);

			stringstream ss(headers);

			int count = 0, index = -1;

			for (; ss >> temp; count++)
			{
				if (temp...)
				{
					index = count;
				}
			}

			double val;
			for (int i = 0; i < index; i++)
			{
				datFile >> val;
			}

			vector<double> values;

			while (datFile)
			{
				datFile >> val;
				values.push_back(val);
			}
		}


stringstream ss(headers) doesn't work - what is "headers"?

I want both the first column and the column with the data I need, so how do I modify lines 19-22 (the loop that skips columns with datFile >> val) for this purpose?

I obviously need something more on line 12, where I have if (temp...). I need to do a search (using regular expressions?) to see whether that header contains "eu" or "EU". Do you know how to do this, or should I post again in the forum?

I must apologize for this post - I'm used to having the Qt infrastructure to help me out, but I don't have it on this computer.

Thanks, again!
Could you post a few lines of an example data file? That would make it easier to point to specific parts of it.
Sure!:

%b02_a08 b02_a08_counts b02_a08_eu
-0.998350 16 2.252804
-0.998292 16 2.252804
-0.998233 16 2.252804
-0.992525 16 2.252804
-0.992467 16 2.252804
-0.992408 16 2.252804
-0.986825 16 2.252804
-0.986767 16 2.252804

This is the one with three columns. I want the first column and the last one (the one that contains "eu").

%b10_a43 b10_a43_gps b10_a43_counts b10_a43_eu
-59.990000 14340000.000000 0 0.000000
-59.979700 14340100.000000 0 0.000000
-59.969500 14340100.000000 0 0.000000
-59.959300 14340100.000000 0 0.000000
-59.949000 14340100.000000 0 0.000000
-59.938800 14340100.000000 0 0.000000
-59.928500 14340100.000000 0 0.000000
-59.918300 14340100.000000 0 0.000000
-59.908100 14340100.000000 0 0.000000
-59.897800 14340100.000000 0 0.000000
-59.887600 14340100.000000 0 0.000000
-59.877300 14340200.000000 0 0.000000
-59.867100 14340200.000000 0 0.000000

This one has four columns. Again, I want the first column and the column with "eu" in its header.
Also, since I am loading multiple data files, I will need multiple vectors (not arrays, as this topic states...). Here is my code so far:

	string name;
	
	fstream inFile;
	fstream datFile;
	inFile.open ("input.txt");

	if (!inFile)
	{
		cerr << "File open failure\n";
		exit(EXIT_FAILURE);
	}

	while (!inFile.eof())
	{
		getline (inFile, name);

		datFile.open(name);

		if (!datFile)
		{
			cerr << "File open failure\n";
			exit(EXIT_FAILURE);
		}

		while (!datFile.eof())
		{
			string temp;
			getline (datFile, temp);

			stringstream ss(headers);

			int count = 0, index = -1;

			for (; ss >> temp; count++)
			{
				if (temp...)
				{
					index = count;
				}
			}

			double val;
			for (int i = 0; i < index; i++)
			{
				datFile >> val;
			}

			vector<double> values;

			while (datFile)
			{
				datFile >> val;
				values.push_back(val);
			}
		}
	}


See above notes for questions about "temp..." and "datFile >> val".

I want to have multiple vectors (one for each file). The above code is meant to loop over all the "datFile" names listed in "input.txt" and create a vector for each one.

Thanks!
what is "headers"?
Sorry. That was supposed to be temp, the string holding the header line that getline just read.

You have way too many if (!inFile) and similar checks in your code. It is a good idea to check once that the file was opened successfully, and you really do need the check I had in my code (the while(file) loop condition), but there is no point in having all of the rest.
Edit: I didn't notice you had two files. Line 25 (while (!datFile.eof())) is still not needed though.
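For what it's worth, here is a rough sketch of how the two-file loop could look with a single open check per data file and one vector per file, kept in a std::map keyed by file name. The names readEuColumn and loadAll are made up for illustration, and skipping a file that fails to open is an assumption about the policy you want (the posted code exits instead). It reads line by line with getline rather than looping on eof().

```cpp
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the values of the column whose header
// contains "eu" or "EU". Returns an empty vector if no header matches.
std::vector<double> readEuColumn(std::istream& in)
{
    std::vector<double> values;
    std::string line;
    if (!std::getline(in, line)) return values;   // no header line at all

    // Find the index of the wanted column in the header line.
    std::istringstream headerStream(line);
    std::string name;
    int count = 0, index = -1;
    for (; headerStream >> name; ++count)
        if (name.find("eu") != std::string::npos ||
            name.find("EU") != std::string::npos)
            index = count;
    if (index < 0) return values;

    // Walk each data line and keep only the entry at that index.
    while (std::getline(in, line))
    {
        std::istringstream row(line);
        double v;
        for (int col = 0; row >> v; ++col)
            if (col == index) { values.push_back(v); break; }
    }
    return values;
}

// Hypothetical helper: one vector per data file, keyed by file name.
// fileList plays the role of inFile (one file name per line).
std::map<std::string, std::vector<double>> loadAll(std::istream& fileList)
{
    std::map<std::string, std::vector<double>> result;
    std::string name;
    while (std::getline(fileList, name))
    {
        if (name.empty()) continue;
        std::ifstream datFile(name);   // fresh stream per file, so no stale fail state
        if (!datFile) continue;        // assumption: skip files that cannot be opened
        result[name] = readEuColumn(datFile);
    }
    return result;
}
```

Creating a new std::ifstream inside the loop also sidesteps the problem of reusing one fstream object that is still open or in a failed state from the previous file.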

About if(/*temp is the string you wanted*/), I assumed you had the regex part figured out. Well, you could do as you said and get yourself a regex library like Boost.Regex. C++11 also has a standard one (std::regex in the <regex> header), if your compiler supports it.
Or you could simply use http://cplusplus.com/reference/string/string/find/ which would be sufficient for your current simple needs.
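A minimal sketch of both options for the "eu"/"EU" test, assuming the header token (for example b02_a08_eu) has already been read into a std::string. The function names containsEu and containsEuRegex are invented for this example, and the regex version needs a C++11 compiler.

```cpp
#include <regex>
#include <string>

// Plain string::find: true if the header contains "eu" or "EU".
bool containsEu(const std::string& header)
{
    return header.find("eu") != std::string::npos ||
           header.find("EU") != std::string::npos;
}

// std::regex alternative: case-insensitive search for "eu",
// so it also accepts mixed case such as "Eu".
bool containsEuRegex(const std::string& header)
{
    static const std::regex pattern("eu", std::regex_constants::icase);
    return std::regex_search(header, pattern);
}
```

Either function could stand in for the if (temp...) condition. The only difference in behavior is mixed case: the find version rejects "Eu", the regex version accepts it.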

About the whole thing, if you don't need just one line, I suggest you read all of the data and then pick out what you want. That way it will be less of a mess. You'll need a vector of strings for headers and a vector of vectors of doubles for the columns. Ask if you can't figure out how that would work.
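In case it helps, the "read everything, then pick" idea could be sketched roughly like this. Table and readTable are names made up for the example, and dropping any line whose number of values does not match the number of headers is an assumption (it covers blank lines, but you may prefer to report an error instead).

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container: one header per column, and columns[c][r]
// holds the value in column c, row r.
struct Table
{
    std::vector<std::string> headers;
    std::vector<std::vector<double>> columns;
};

Table readTable(std::istream& in)
{
    Table table;
    std::string line;
    if (!std::getline(in, line)) return table;   // empty input

    // The first line holds whitespace-separated headers.
    std::istringstream headerStream(line);
    std::string name;
    while (headerStream >> name) table.headers.push_back(name);
    table.columns.resize(table.headers.size());

    // Each following line holds one double per column.
    while (std::getline(in, line))
    {
        std::istringstream row(line);
        std::vector<double> values;
        double v;
        while (row >> v) values.push_back(v);

        // Assumption: silently skip blank or malformed lines.
        if (values.size() != table.headers.size()) continue;

        for (std::size_t c = 0; c < values.size(); ++c)
            table.columns[c].push_back(values[c]);
    }
    return table;
}
```

With the data in this shape, the first column is table.columns[0], and the "eu" column is whichever columns[c] has a matching headers[c], so the same code handles both the three-column and the four-column files.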
hamsterman:

I'm going to be going through the data, doing some interpolation, and eventually generating +/- 3-sigmas for all the data points. I'll play with C++ vectors and see what I come up with. If I can't solve it, I'll post back.

Thanks.