I have a 243 gigabyte text file that I'm trying to break up into smaller pieces. Specifically, it's a text file with a lot of numbers in it, and I want to split it so that each smaller piece contains 1328098 numbers. I tried this code, which ran at a decent speed:
The issue was that the code would copy the first 1328098 numbers to a text file called row_0.txt, then keep counting how many numbers had passed through and create the subsequent row files (i.e. row_1.txt, row_2.txt, and so on), but it WOULD NOT output the numbers from the input file to the output files. So I changed the code to look like this:
When I tested this, the thing that slowed it down was this cout at line 29:
cout << numbers << '\n';
The problem is on line 40:
ofstream values2 (file.c_str());
Here values2 is a completely separate stream, unrelated to the stream of the same name declared on line 39; it merely shadows it. When the closing brace at line 43 is reached, this new stream goes out of scope and is destroyed. The outer values2 is never opened, so nothing can ever be written through it.
Instead of declaring a new object, just re-open the existing stream.
My version of more or less the same code looks like this:
When I've run into similar problems in the past, I've found that increasing the size of the write buffer has helped. To do this in C++, I think you use a std::filebuf instead of an ofstream and call pubsetbuf() to install a large buffer. As a starting point, try 1 MB.
Thank you both so much! That helped a ton, and now I understand input/output streams much better. I still don't completely understand, though, why the numbers were not being written to the file specified in the first code. Was it because I neglected to re-open values2?