Hi everyone, I ran into a bug while reading a file.
I want to read a file that has more than (1 << 17) lines.
The file looks something like this:
1|2|3|4|....
1|2|3|4|....
1|2|3|4|....
1|2|3|4|....
From that I only want the first 4 columns, so I read SIZE lines and save the first 4 columns into 4 vectors, which are contained in a vector of vectors.
The problem is that when I use SIZE = 1 << 16, it works.
But when SIZE becomes 1 << 17 or greater, I get an address boundary error, which I'm 100% sure is caused by a vector: Segmentation fault (core dumped)
Can you find the error?
Do you have enough free, contiguous RAM to store that many push_backs?
I suspect it's not the FILE, but that your vector has run out of room.
To test this, eliminate the file entirely and push back that many records (just push back the same hard-coded test record that many times) to see if the vector is actually the issue.
Remember that vectors require their memory to be one solid block... like arrays...
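A rough sketch of that test, with no file involved. The function name fill_without_file and the hard-coded record are made up for illustration; it simply pushes the same four values into four column vectors the requested number of times.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical test helper: push the same hard-coded record `rows` times
// into four column vectors, with no file I/O involved.
std::vector<std::vector<std::uint64_t>> fill_without_file(std::size_t rows)
{
    const std::uint64_t record[4] = {1, 2, 3, 4}; // same test data every time
    std::vector<std::vector<std::uint64_t>> columns(4);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            columns[c].push_back(record[c]);
    assert(columns[3].size() == rows); // sanity check: last column got every row
    return columns;
}
```

If fill_without_file(1 << 17) runs cleanly, the vectors alone are probably not the issue, and the reading or indexing code deserves a closer look.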
This file has more than 1 << 17 lines, even more than 1 << 20. So the SIZE I chose is within the range of the file.
That statement makes no sense. If the file has more than 1<<20 (1,048,576) lines, that's not within the range of SIZE. That doesn't really make any difference, though, since std::vector will keep resizing each vector as you push more than 1<<17 (131,072) rows.
I hope you realize you're trying to reserve more than 4MB of memory (131072 * 4 * 8 bytes).
And if you're trying to read 1,000,000 rows, you're going to need around 32MB of memory.
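EDIT: Keep in mind that every nested vector carries its own overhead. This is typically 16 to 24 bytes per vector, depending on the platform, plus whatever the allocator needs for each separate allocation. I didn't include this in my calculations above. By my estimate, for 1,000,000 rows this comes out to more than 100MB.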
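To put numbers on these estimates, here is a small sketch of the arithmetic. The helper names payload_bytes and row_vectors_bytes are made up. The second one assumes each row is stored in its own std::vector and counts only the vector objects themselves, not allocator bookkeeping, so treat it as a lower bound.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Payload only: rows * columns * sizeof(uint64_t), ignoring all overhead.
constexpr std::size_t payload_bytes(std::size_t rows, std::size_t cols)
{
    return rows * cols * sizeof(std::uint64_t);
}

// Lower bound if every row is its own std::vector: payload plus one
// vector object per row. Allocator bookkeeping is NOT counted.
constexpr std::size_t row_vectors_bytes(std::size_t rows, std::size_t cols)
{
    return payload_bytes(rows, cols) + rows * sizeof(std::vector<std::uint64_t>);
}

static_assert(payload_bytes(65536, 4) == 2097152, "65536 * 4 * 8 bytes = 2 MiB");
```

For example, payload_bytes(1 << 17, 4) is 4,194,304 bytes and payload_bytes(1000000, 4) is 32,000,000 bytes.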
Here's a cleaned up version of your program which requires significantly less memory by using a struct instead of a nested vector. It also avoids reading more columns than necessary when reading each row.
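The program from this reply is not reproduced here, so what follows is only a sketch of the idea and not the poster's actual code: a hypothetical Row struct holding the four values, one flat std::vector<Row>, and a reader that stops parsing each line after the fourth column. The names Row and read_rows, and the choice to skip malformed lines, are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical row type: four fixed columns, no per-row heap allocation.
struct Row
{
    std::array<std::uint64_t, 4> col;
};

// Read at most max_rows lines of the form "1|2|3|4|..." from `in`,
// keeping only the first four columns. Malformed lines are skipped.
std::vector<Row> read_rows(std::istream& in, std::size_t max_rows)
{
    std::vector<Row> rows;
    assert(max_rows <= rows.max_size()); // sanity check on the caller's limit
    rows.reserve(max_rows);              // one block of max_rows * sizeof(Row) bytes
    std::string line;
    while (rows.size() < max_rows && std::getline(in, line))
    {
        std::istringstream ss(line);
        Row row{};
        bool ok = true;
        for (std::size_t c = 0; c < row.col.size() && ok; ++c)
        {
            ok = static_cast<bool>(ss >> row.col[c]); // parse one number
            ss.ignore(1, '|');                        // skip one delimiter character
        }
        if (ok)
            rows.push_back(row);
    }
    return rows;
}
```

With this layout the only per-container overhead is that of the single outer vector, rather than one vector per row or per column.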
Then why bother creating strings in your subprogram? Just read the entire line with getline(), then use a stringstream to parse the line into the proper type of variable.
Something like the following (not tested):
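    #include <cstdint>
    #include <sstream>
    #include <string>
    #include <vector>

    // Parse up to num_items '|'-separated unsigned integers from one line.
    std::vector<uint64_t> get_values(const std::string& strToSplit, int num_items)
    {
        std::stringstream ss(strToSplit);
        std::vector<uint64_t> items;
        uint64_t value;
        char delimiter;
        for (int counter = 0; counter < num_items; ++counter)
        {
            if (!(ss >> value)) // Stop at the first failed conversion.
                break;
            items.push_back(value); // Using push_back() so it is easy to tell if all values were successfully read.
            ss >> delimiter; // Skip the '|'; harmless if the line ends here.
        }
        return items;
    }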
Note that the calling function should check whether the returned vector has the proper number of values; if not, the line had some bad data.
Also, unless you're happy with the program "crashing" if the call to stoX() (in your original code) fails, you should consider using a try/catch block to handle the failure.
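As a sketch of the try/catch approach, assuming the original code split each line on '|' and converted the fields with std::stoull (the helper name parse_fields and its bool return convention are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parse the first `wanted` '|'-separated fields of
// `line` with std::stoull. Returns false instead of letting an exception
// escape when a field is not a number, is out of range, or is missing.
bool parse_fields(const std::string& line, std::size_t wanted,
                  std::vector<std::uint64_t>& out)
{
    assert(wanted > 0); // asking for zero fields is treated as a caller bug
    out.clear();
    std::istringstream ss(line);
    std::string field;
    try
    {
        while (out.size() < wanted && std::getline(ss, field, '|'))
            out.push_back(std::stoull(field)); // may throw
    }
    catch (const std::invalid_argument&) // e.g. "abc" or an empty field
    {
        return false;
    }
    catch (const std::out_of_range&) // value does not fit in unsigned long long
    {
        return false;
    }
    return out.size() == wanted; // too few fields also counts as bad data
}
```

One caveat: std::stoull accepts leading digits followed by junk ("12abc" parses as 12), so pass its optional pos argument and compare it with the field length if stricter checking is needed.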