Reading huge binary files

Aug 26, 2011 at 12:25pm
Hello guys :),

I have a binary file whose size is 80 GB. I need to read it in chunks, but as far as I know, the file pointer of fstream is of type long int, which is definitely not enough to address any position beyond 20 GB.

I used to use memory-mapped files, but I'm already extremely sick of them. The same code just poses different problems on every operating system:

Ubuntu: Unable to map; at some point the file simply can't be opened.
Mint: Unable to map, same problem as on Ubuntu.
Fedora: Segmentation fault that doesn't make sense at all; debugging led to a dead end.
CentOS: Works fine, but I can't install it on all the computers I have.

I posted the problem under this link:

http://www.cplusplus.com/forum/general/48971/

and I came to the conclusion that it's IMPOSSIBLE to have a stable program with memory-mapped files when dealing with huge files...

So now I'm looking for an alternative that supports 64-bit integers for positioning within a file.

Thank you for any efforts :)
Aug 26, 2011 at 1:01pm
So, wait. You're using 32-bit operating systems, but trying to map the whole 80 GB file into memory? That won't work, as 4 GB is the absolute maximum addressable with 32 bits.
streamoff is 64 bits, so you shouldn't have any trouble using fstream with files of that size.
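
As a rough illustration of reading one chunk at a large offset with fstream, here is a minimal sketch. The helper name read_chunk and its parameters are made up for this example, and it assumes a C++11 compiler on a platform where streamoff is 64 bits wide (which is worth checking with sizeof first):

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read up to `size` bytes starting at the absolute
// byte position `offset`. Returns fewer bytes (possibly none) if the file
// ends early or cannot be opened.
std::vector<char> read_chunk(const std::string& path,
                             std::streamoff offset,
                             std::size_t size)
{
    assert(offset >= 0);

    std::vector<char> buffer;
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return buffer;                      // could not open the file

    // seekg takes a std::streamoff, so the offset is not limited to 32 bits
    // on platforms where streamoff is a 64-bit type.
    in.seekg(offset, std::ios::beg);
    if (!in)
        return buffer;                      // seek failed

    buffer.resize(size);
    in.read(buffer.data(), static_cast<std::streamsize>(size));

    // gcount() reports how many bytes the last read actually delivered.
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

With that in place, something like read_chunk("data.bin", 30LL * 1024 * 1024 * 1024, 1 << 20) would ask for 1 MiB starting 30 GiB into the file ("data.bin" is just a placeholder name). For sequential processing it would probably be cheaper to keep one stream open and call read() in a loop instead of reopening the file for every chunk.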
Aug 26, 2011 at 1:06pm
Thank you for your answer!

All my operating systems are 64-bit.

So are you saying that fstream has 64-bit pointers for fstream::seekg()? I would be surprised, because the other day I had to use QFile to access a 30 GB file on a 64-bit system, since the fstream pointers were too small for that file size. But now I can't use Qt in my program.
Aug 26, 2011 at 1:14pm
So are you saying that fstream has 64-bit pointers for fstream::seekg()?

Yes. The type used is streamoff, which is generally a typedef for a 64-bit integral type (usually long or long long).
Print the value of sizeof(streamoff) and see what you get.
Aug 26, 2011 at 1:25pm
Surprise! Look at what I got for this program:

#include <iostream>
#include <fstream>

int main()
{
    // Print the size in bytes of each stream position/offset type.
    std::cout << sizeof(std::streampos) << std::endl;
    std::cout << sizeof(std::streamsize) << std::endl;
    std::cout << sizeof(std::streamoff) << std::endl;
    return 0;
}


The output is:

16
8
8

on a 64-bit CentOS machine :S
Aug 26, 2011 at 1:27pm
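streampos is actually a class (a specialization of std::fpos) that stores a conversion state in addition to the offset, which is why it's 16 bytes (128 bits) here. The offset itself is a streamoff, i.e. 8 bytes.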
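
To show how the two types relate in practice, here is a small sketch that gets a file's size as a plain integer. The function name file_size is invented for this example; the only library behaviour it relies on is that tellg() returns a streampos and that a streampos converts to a streamoff:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: return the size of the file in bytes,
// or -1 if it cannot be opened or its position cannot be queried.
std::streamoff file_size(const std::string& path)
{
    // std::ios::ate positions the stream at the end right after opening.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return -1;

    // tellg() returns a position object (offset plus conversion state)...
    std::streampos end = in.tellg();
    if (end == std::streampos(-1))
        return -1;

    // ...which converts to the plain integer offset type.
    std::streamoff bytes = end;
    assert(bytes >= 0);
    return bytes;
}
```

The returned streamoff can be used in ordinary arithmetic, for example to work out how many fixed-size chunks a file contains.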
Aug 26, 2011 at 1:32pm
OK Thank you. I'll try using fstreams and get back to you :)