min/max, mean, median.

I tried this code. I am able to read the data line by line and push it into the vector as well, but the output is repeated and I don't know why that is happening.
Secondly, I am not able to figure out how to iterate over the data to do the calculations.

while (getline(infile, line))
{
	std::stringstream linestream(line);
	std::string       value;

	while (getline(linestream, value, ','))
	{
		V1max.push_back(value);
		std::cout << "Value(" << value << ")" << " ";  // just to check the values
	}
	std::cout << "Line Finished\n" << std::endl;
}
What do you mean by repeated? Please show the output of your program, and a complete program.

How are you saving the V1max vector for use in your calculations? As shown, V1max will hold only the values for the last line after the loop.

Qt's QString has a split method: http://doc.qt.io/qt-5/qstring.html#split
It is conceptually close to targaryen's while-loop.
Put the other way: the current program should imitate QString::split even more closely than it does now.

Note: std::string has methods find() and substr(). Are they not at least as good as stringstream and getline?
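To make the find()/substr() idea concrete, here is a minimal sketch of a splitter built only on those two members. The name splitLine is hypothetical (it is not a standard or Qt function), and unlike the getline loop above it keeps empty fields, including a trailing one.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `line` at every `delim`, keeping empty fields.
std::vector<std::string> splitLine(const std::string& line, char delim = ',')
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type pos = line.find(delim, start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start));          // last (or only) field
            break;
        }
        fields.push_back(line.substr(start, pos - start)); // text between delimiters
        start = pos + 1;                                    // continue after the delimiter
    }
    return fields;
}
```

Calling splitLine("5.1,?,0.2") gives the three fields "5.1", "?" and "0.2". Note that an empty input line gives one empty field rather than none, so the caller still has to decide what to do with blank lines.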


Let's assume that you have a std::vector<std::string> that contains the values of one column. Each element looks like an integer or "?".
std::vector<long> toLong( const std::vector<std::string> & array ) {
  std::vector<long> result;
  result.reserve( array.size() );
  for ( const auto & word : array ) {
    if ( word != "?" ) {
      result.push_back( std::stol( word ) );
    }
  }
  return result;
}

A very similar toDouble function would return vector<double>.
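For completeness, a sketch of what that toDouble could look like, assuming the same "?" marker for missing values. It mirrors toLong line by line and only swaps std::stol for std::stod.

```cpp
#include <string>
#include <vector>

// Convert one column of text to doubles, dropping "?" (missing) entries.
// std::stod throws std::invalid_argument if a word is not numeric at all.
std::vector<double> toDouble(const std::vector<std::string>& array)
{
    std::vector<double> result;
    result.reserve(array.size());
    for (const auto& word : array) {
        if (word != "?") {
            result.push_back(std::stod(word));
        }
    }
    return result;
}
```

Because missing values are dropped, the result can be shorter than the input, and it is empty when every entry of the column is "?".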

Let's assume that you have the entire input data as a std::vector<std::vector<std::string>> (one inner vector per column) and all values are of floating-point type (or "?").
for ( const auto & column : table ) {
  auto dcol = toDouble( column );
  for ( auto value : dcol ) {
    std::cout << value << ' ';
  }
  std::cout << '\n';
}

That simply prints each column as a row (and omits missing values). You could calculate the min, max, mean and median for each dcol array within that loop.

(I have already shown how to calculate min, max and mean for an array, and you really should be able to find the math of the median on the web.)
Hey guys,

Here is my full code. I was able to calculate min/max for each row, but I want to do it for each column.
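Since the median is the one piece not shown in this thread, here is a minimal sketch using the usual definition: sort the values, take the middle one, or the average of the two middle ones when the count is even. The name medianOf is hypothetical, and the function assumes the vector is not empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of a non-empty vector. The argument is taken by value on purpose,
// so that sorting does not reorder the caller's data.
double medianOf(std::vector<double> values)
{
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n % 2 == 1) {
        return values[n / 2];                           // odd count: middle element
    }
    return (values[n / 2 - 1] + values[n / 2]) / 2.0;   // even count: mean of the two middle elements
}
```

A column that contained only "?" ends up empty after conversion, so check dcol.empty() before calling this.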

#include <iostream>
#include <string>
#include <cstring>
#include <sstream>
#include <iomanip>
#include <fstream>
#include <vector>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <algorithm>

using namespace std;

int main(int argc, char **argv)
{
	string finput, foutput = "output.txt";

	if (argc == 2)
	{
		finput = argv[1];
	}
	else if (argc == 3)
	{
		finput = argv[1];
		foutput = argv[2];
	}
	else
	{
		cerr << "No input file specified!" << endl;
		return 0;
	}


	int response = 0;
	string line;
	vector<string> V1;

	cout << "----------------------------------------------------------------" << endl;
	cout << "1. Find Min/Max of every Column " << endl;
	cout << "2. Find Mean of every Column " << endl;
	cout << "3. Find Medien of every Column " << endl;
	cout << "----------------------------------------------------------------" << endl;
	cin >> response;

	cout << "reading " << finput << "..." << endl;
	ifstream infile_txt;
	infile_txt.open(finput.c_str(), ios::in);
	ofstream outfile;
	outfile.open(foutput.c_str(), ios::out);


		if (response == 1)			// for min/max
		{

			while(getline(infile_txt, line))
			{

				std::stringstream   linestream(line);
				std::string         value;

				while (getline(linestream, value, ','))
				{
					std::cout << "Value(" << value << ")" << " ";

					if (value != "?")
					{
						std::cout << "Inserted(" << value << ")" << " ";
						V1.push_back(value);
					}

				}
				std::cout << "Line Finished\n" << std::endl;
				auto biggest = std::max_element(begin(V1), end(V1));
				auto smallest = std::min_element(begin(V1), end(V1));
				std::cout << "Max element is " << *biggest << std::endl;
				std::cout << "Min element is " << *smallest << std::endl;
				outfile << "Max element is: " << *biggest << std::endl;
				outfile << "Min element is: " << *smallest << std::endl;
				V1.clear();
			}

		}

		if (response == 2)			//for Mean calculation
		{

		}

		if (response == 3)			//for Median Calculation
		{

		}
		system("pause");
		cin.get();			// For Keeping the output window open.
		return 0;
}


Input file is:

5.1,3.5,1.4,0.2,5.1,3.5,1.4,0.2,?,?,?
4.9,3.0,1.4,0.2,4.9,3.0,1.4,0.2,?,?,?
4.7,3.2,1.3,0.2,4.7,3.2,1.3,0.2,?,?,?
4.6,3.1,1.5,0.2,4.6,3.1,1.5,0.2,?,?,?
?,?,?,4.6,3.1,1.5,0.2,4.6,3.1,1.5,0.2
?,?,?,4.7,3.2,1.3,0.2,4.7,3.2,1.3,0.2
?,?,?,4.9,3.0,1.4,0.2,4.9,3.0,1.4,0.2
?,?,?,5.1,3.5,1.4,0.2,5.1,3.5,1.4,0.2


Your program calculates for each row, not for each column.

You calculate min and max for a set of words, not for a set of numbers.
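A small sketch of why that matters: std::string compares character by character, so the "largest" word is not necessarily the largest number. Both function names here are hypothetical and both assume a non-empty vector of numeric text.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Largest element under std::string's lexicographic ordering.
std::string maxAsText(const std::vector<std::string>& words)
{
    return *std::max_element(words.begin(), words.end());
}

// Largest element after converting every word with std::stod.
double maxAsNumber(const std::vector<std::string>& words)
{
    std::vector<double> numbers;
    numbers.reserve(words.size());
    for (const auto& w : words) {
        numbers.push_back(std::stod(w));
    }
    return *std::max_element(numbers.begin(), numbers.end());
}
```

For the words "9", "10" and "130", maxAsText returns "9" because '9' sorts after '1', while maxAsNumber returns 130.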

You mix IO with calculations. Thus, you have to repeat the code that reads the data in all three cases that the user can choose. Why?
Hi Keskiverto,

The original data set is:

7.84626,0.00121498,?,?,?,?,0.595974,1722821,40877036,73,601,130,45,73,1481,1479,2153,4922.35,116792,6.13849,7.86395,0.00847591,0.00737798
6.18782,0.000136137,?,?,?,?,0.595974,1722821,40877036,73,601,130,45,73,1481,1479,2153,4922.35,116792,6.13849,7.86395,0.00847591
7.86844,0.00187588,?,?,?,?,0.595974,1722821,40877036,73,601,130,45,73,1481,1479,2153,4922.35,116792,6.13849,7.86395,0.00847591,0.00737798
6.12701,0.0010252,?,?,?,?,0.595974,1722821,40877036,73,601,130,45,73,1481,1479,2153,4922.35,116792,6.13849,7.86395,0.00847591,0.00737798


I used a simpler dataset in my earlier post to explain the problem better, and I made this program following that post. I am a beginner at C++ programming, so I don't know how to do the task.
I am trying to explain the problem by cutting it into shorter and simpler cases, which can later be merged to get the final solution.
I know that a single scan of the data can do the calculations for all three (min/max, mean, median), but I am going slowly, one by one.
IMO, since you have invalid entries, this assignment will be easier if you first read your file into a vector<vector<string>>, keeping the question marks in the vector. This would make your file reading more like:

    std::vector<std::vector<std::string>> data;
    std::string line;
    while(getline(infile_txt, line))
    {
        std::vector<std::string> lines;
        std::stringstream   linestream(line);
        std::string         value;

        while (getline(linestream, value, ','))
        { // Parse the line.
            lines.push_back(value);
        }
        data.push_back(lines);
    }


    std::cout << "Print the data as strings" << std::endl;
    for(auto itr : data)
    {
        for(auto i : itr)
            std::cout << i << " ";
        std::cout << std::endl;
    }



Then the next step will be to swap the rows and columns. This is a little more complicated but something like this should work:

    std::vector<std::vector<double>> dataAsNumbers;

    for(size_t i = 0; i < data[0].size(); ++i)
    {
        std::vector<double> vdouble;
        for(size_t j = 0; j < data.size(); ++j)
        {
            if(data[j][i] != "?") 
                vdouble.push_back(stod(data[j][i]));
        }

        dataAsNumbers.push_back(vdouble);
    }


You should now have a vector<vector<double>> that contains the column information in row order, looking something like:

5.10 4.90 4.70 4.60 
3.50 3.00 3.20 3.10 
1.40 1.40 1.30 1.50 
0.20 0.20 0.20 0.20 4.60 4.70 4.90 5.10 
5.10 4.90 4.70 4.60 3.10 3.20 3.00 3.50 
3.50 3.00 3.20 3.10 1.50 1.30 1.40 1.40 
1.40 1.40 1.30 1.50 0.20 0.20 0.20 0.20 
0.20 0.20 0.20 0.20 4.60 4.70 4.90 5.10 
3.10 3.20 3.00 3.50 
1.50 1.30 1.40 1.40 
0.20 0.20 0.20 0.20 


Now you can use this vector to easily compute the desired values.
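As one example of such a computation, here is a minimal sketch of min/max for a single column using std::minmax_element. The name minMax is hypothetical, and the function assumes the column is not empty (a column that held only "?" would be).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns {smallest, largest} of a non-empty column.
std::pair<double, double> minMax(const std::vector<double>& column)
{
    const auto range = std::minmax_element(column.begin(), column.end());
    return { *range.first, *range.second };
}
```

Looping over dataAsNumbers and calling minMax on every non-empty inner vector then gives one min/max pair per column of the original file.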



Thanks Jlb for your sincere effort!
How do I check this code?
I tried pasting it right after the code you sent first, but it gives an error: "Debug assertion failed!"

std::vector<std::vector<double>> dataAsNumbers;

    for(size_t i = 0; i < data[0].size(); ++i)
    {
        std::vector<double> vdouble;
        for(size_t j = 0; j < data.size(); ++j)
        {
            if(data[j][i] != "?") 
                vdouble.push_back(stod(data[j][i]));
        }

        dataAsNumbers.push_back(vdouble);
    }
Post your code! Without your code how can I see why your program is failing?

I have inserted your code, something like this.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
int main(int argc, char **argv)
{
	string finput, foutput = "output.txt";

	if (argc == 2)
	{
		finput = argv[1];
	}
	else if (argc == 3)
	{
		finput = argv[1];
		foutput = argv[2];
	}
	else
	{
		cerr << "No input file specified!" << endl;
		return 0;
	}


	int response = 0;
	string line;
	vector<string> V1, V2;

	cout << "----------------------------------------------------------------" << endl;
	cout << "1. Find Min/Max of every Column " << endl;
	cout << "2. Find Mean of every Column " << endl;
	cout << "3. Find Medien of every Column " << endl;
	cout << "----------------------------------------------------------------" << endl;
	cin >> response;

	cout << "reading " << finput << "..." << endl;
	ifstream infile_txt;
	infile_txt.open(finput.c_str(), ios::in);
	ofstream outfile;
	outfile.open(foutput.c_str(), ios::out);


		if (response == 1)			// for min/max
		{//-----------------------------------------------------------------------------------
			std::vector<std::vector<std::string>> data;
			std::string line;
			while (getline(infile_txt, line))
			{
				std::vector<std::string> lines;
				std::stringstream   linestream(line);
				std::string         value;

				while (getline(linestream, value, ','))
				{ // Parse the line.
					lines.push_back(value);
				}
				data.push_back(lines);
			}


			std::cout << "Print the data as strings" << std::endl;
			for (auto itr : data)
			{
				for (auto i : itr)
					std::cout << i << " ";
				std::cout << std::endl;
			}
			
			std::vector<std::vector<double>> dataAsNumbers;

			for (size_t i = 0; i < data[0].size(); ++i)
			{
				std::vector<double> vdouble;
				for (size_t j = 0; j < data.size(); ++j)
				{
					if (data[j][i] != "?")
						vdouble.push_back(stod(data[j][i]));
				}

				dataAsNumbers.push_back(vdouble);
			}
			
              //----------------------------------------------------------------------------------

		}

		if (response == 2)			//for Mean calculation
		{

		

		}

		if (response == 3)			//for Median Calculation
		{

		}
		system("pause");
		cin.get();			// For Keeping the output window open.
		return 0;
}
The debug assertion is a runtime error, isn't it? What is the last printout before the crash?
The program displays the whole input data (as shown below) without commas, so I guess it prints it as strings, and then gives the error message.


5.1 3.5 1.4 0.2 5.1 3.5 1.4 0.2 ? ? ?
4.9 3.0 1.4 0.2 4.9 3.0 1.4 0.2 ? ? ?
4.7 3.2 1.3 0.2 4.7 3.2 1.3 0.2 ? ? ?
4.6 3.1 1.5 0.2 4.6 3.1 1.5 0.2 ? ? ?
? ? ? 4.6 3.1 1.5 0.2 4.6 3.1 1.5 0.2
? ? ? 4.7 3.2 1.3 0.2 4.7 3.2 1.3 0.2
? ? ? 4.9 3.0 1.4 0.2 4.9 3.0 1.4 0.2
? ? ? 5.1 3.5 1.4 0.2 5.1 3.5 1.4 0.2

I recommend running it with a debugger, so that you can see exactly which line the assertion fails on, and what the state of the memory is at that point.
The code gives an error on the line below, when the value of j = 8.

 
 if(data[j][i] != "?")
Your printed data shows only 8 rows of values, but apparently the vector 'data' has at least 9 elements. That 9th element, data[8], must thus be an empty vector, and an attempt to access non-existent elements does crash.

Perhaps an empty line at the end of the input file?

Add a condition to line 53 (data.push_back(lines);). Do not append empty 'lines' to the 'data'.

Actually, it would be wise to check on line 55, right after the reading loop, that all elements of 'data' have the same size.

Similarly, you don't want to enter the loop on line 67 (for (size_t i = 0; i < data[0].size(); ++i)) if the 'data' is empty.
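Those three checks could be folded into the reading step roughly as follows. This is only a sketch: readTable is a hypothetical name, stripping a trailing '\r' is an assumption about Windows-style files, and throwing on a ragged row is one possible policy (skipping or padding the row would be others).

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads comma-separated rows, skips blank lines and
// rejects rows whose field count differs from the first row.
std::vector<std::vector<std::string>> readTable(std::istream& in)
{
    std::vector<std::vector<std::string>> data;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();                         // tolerate Windows line endings
        }
        if (line.find_first_not_of(" \t") == std::string::npos) {
            continue;                                // empty or whitespace-only line
        }

        std::vector<std::string> fields;
        std::stringstream linestream(line);
        std::string value;
        while (std::getline(linestream, value, ',')) {
            fields.push_back(value);
        }

        if (!data.empty() && fields.size() != data.front().size()) {
            throw std::runtime_error("row has a different number of fields");
        }
        data.push_back(fields);
    }
    return data;
}
```

The caller still has to test data.empty() before touching data[0]. With every row guaranteed to have the same width, the data[j][i] access in the transposing loop can no longer run past the end of a short row.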
Yes, it was a space problem.

I checked the input file and removed the spaces, and it worked!

I thank you all for your sincere efforts and kind assistance!!
"i checked in the input file and removed the spaces, it worked!"

This time. The next input file could have the same issue. You should improve your program so that it does not crash due to such a simple input problem.
I added a small piece of code to remove spaces:

void RemoveSpaces(string &input)
{
	delChar(input, '\n');
	delChar(input, '\r');
}
Guys, a further question: now I have a vector<vector<double>>. How do you suggest I calculate min/max, mean and median?

I am not able to figure out how I should proceed now. Should I use this 2D vector and go through it row by row with another vector?
Or
should I store the 2D vector in another file and then access it from code to do the calculations?
Why would you need another vector?

Why do you think you need another file?

The vector you now have has been converted from row-major order to column-major order. So now each "inner" vector contains all the information for one column. So for the mean, just total that vector and divide by the size of the vector.
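"Total and divide" can be sketched with std::accumulate. The name meanOf is hypothetical, and the function assumes the column is not empty, since dividing by a size of zero would not give a meaningful mean.

```cpp
#include <numeric>
#include <vector>

// Mean of a non-empty column: sum of the values divided by their count.
// The 0.0 start value keeps the summation in double rather than int.
double meanOf(const std::vector<double>& column)
{
    const double total = std::accumulate(column.begin(), column.end(), 0.0);
    return total / static_cast<double>(column.size());
}
```

No second vector or extra file is needed: loop over the outer vector and call meanOf on each non-empty inner vector.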

