Array of ofstream objects

May 14, 2009 at 5:48pm
Hello,

I am having trouble opening a series of ofstream objects. I have created an array:

 
  ofstream *outFile = new ofstream [bin_list.size()+1];


where bin_list is a vector of doubles that contains the bins for which I want to create files. I then open each file in a loop. My problem is that, when I reach a certain number of files, it stops working. My first question is this: is there a limit to the number of streams that can be open at one time? I run into the problem at about 1000.

Thanks for any help you can offer.

Jeremiah
May 14, 2009 at 6:11pm
There is a finite number of file handles/descriptors. You've probably run out.
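
One way to confirm that this is what is happening is to check every open and stop at the first failure. Below is a minimal sketch, not the original poster's code: the function name open_many and the file-name pattern are made up for illustration, and it uses a std::vector of streams (C++11 or later) instead of new[].

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Tries to open `count` output files named <prefix>N.dat (hypothetical naming).
// Stops at the first failure and returns how many files were opened.
std::size_t open_many(std::vector<std::ofstream>& files,
                      const std::string& prefix, std::size_t count)
{
    files.clear();
    for (std::size_t i = 0; i < count; ++i) {
        std::ofstream f(prefix + std::to_string(i) + ".dat");
        if (!f.is_open()) {
            // errno is usually set by the underlying OS call, but the C++
            // standard does not guarantee it, so treat the message as a hint.
            std::cerr << "open failed at file " << i << ": "
                      << std::strerror(errno) << '\n';
            break;
        }
        files.push_back(std::move(f));  // streams are movable since C++11
    }
    return files.size();
}
```

If the returned count is lower than the number requested and sits a little below the per-process descriptor limit, running out of descriptors is the likely cause (stdin, stdout and stderr already use three).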
May 14, 2009 at 6:44pm
Is there any way to find out what that number is, or modify it?

Thanks.
May 14, 2009 at 7:15pm
No. (Well, technically, yes, but...)

You shouldn't ever need to open so many files at once -- keeping an array/vector/whatever of ofstream objects is a red flag that something is wrong with your design.

What exactly are you trying to do with this? Perhaps we can suggest something better suited to your goal.

[edit] Oh, yeah, I forgot to mention that programs that do things like open a zillion files or copy themselves repeatedly are often flagged as evil by the OS and automatically terminated.
Last edited on May 14, 2009 at 7:16pm
May 14, 2009 at 7:50pm
I am trying to process some data. Basically, I have a file with the values of pressure and other variables defined for a set of surfaces, and I am breaking the surfaces into sub-zones and calculating the total force. For each sub-zone, I have created a separate file. This is done repeatedly for many time steps.

I have considered writing one file with many columns (corresponding to the different zones), but had avoided this because the force has multiple components, and so each zone would have to have multiple columns, and the file would get confusing. Essentially, I am writing a three-dimensional array, and the way that I solved this was to create a set of files that each contain a two-dimensional array.

I have temporarily avoided this problem by reducing the number of zones. This error was surprising to me because I had previously used this code on a different data set with success.
May 14, 2009 at 8:46pm
Why must the files be opened at the same time? Couldn't you open, read, and close exactly one file per loop?
May 18, 2009 at 3:13pm
I could do this, and maybe that is the solution. The only problem is that I would then have to open and close each file at each time step. I am processing 1000 time steps, and between 200 and 1000 files. Would that many open and close statements take a significant amount of time?
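
For reference, the open/write/close approach could look roughly like the sketch below. The function name, the "zone_<n>.dat" naming and the three force components are assumptions for illustration only. Each call opens the file in append mode, writes one line, and lets the stream close when it goes out of scope, so only one descriptor is in use at a time.

```cpp
#include <fstream>
#include <string>

// Appends one record (time plus three force components) to the file for a
// given zone. The stream closes automatically when `out` is destroyed.
// File naming and record layout are hypothetical.
bool append_zone_record(const std::string& dir, int zone, double time,
                        double fx, double fy, double fz)
{
    std::ofstream out(dir + "zone_" + std::to_string(zone) + ".dat",
                      std::ios::app);
    if (!out) {
        return false;  // could not open the file
    }
    out << time << ' ' << fx << ' ' << fy << ' ' << fz << '\n';
    return static_cast<bool>(out);  // false if the write failed
}
```

Whether the extra open and close calls matter is best measured on the real data set, since the cost depends on the operating system and file system.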

Thanks for the input.
May 18, 2009 at 3:57pm
Hmm, I see...

It would probably cost you less to simply flat-map a single file.

For example, a 2D array is written by simply writing a sequence of 1D arrays:

a[0][0] a[0][1] a[0][2] a[0][3]
a[1][0] a[1][1] a[1][2] a[1][3]
a[2][0] a[2][1] a[2][2] a[2][3]
...

To write a 3D array, simply write a sequence of 2D arrays:

a[0][0][0] a[0][0][1] a[0][0][2] a[0][0][3]
a[0][1][0] a[0][1][1] a[0][1][2] a[0][1][3]
a[0][2][0] a[0][2][1] a[0][2][2] a[0][2][3]

a[1][0][0] a[1][0][1] a[1][0][2] a[1][0][3]
a[1][1][0] a[1][1][1] a[1][1][2] a[1][1][3]
a[1][2][0] a[1][2][1] a[1][2][2] a[1][2][3]

a[2][0][0] a[2][0][1] a[2][0][2] a[2][0][3]
a[2][1][0] a[2][1][1] a[2][1][2] a[2][1][3]
a[2][2][0] a[2][2][1] a[2][2][2] a[2][2][3]

...

(spaces added for clarity)

For a binary file with fixed-size elements, each 2D block starts at a constant offset from the previous one, so you can compute the position of any field directly instead of searching for it. You also wouldn't have to open/close multiple files or maintain a huge list of open files, so you should see some good speed improvements.

This method also has the advantage that you can simply concatenate all your existing data files to produce the new data file.

The disadvantage is that it is more costly to resize your data.
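
The flat layout described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are made up, the data is stored as double in native byte order, and the array is row-major with dimensions [n_i][n_j][n_k] as in the a[i][j][k] listing.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Row-major position of element (i, j, k) in an array with dimensions
// [n_i][n_j][n_k]. The first dimension is not needed to compute the offset.
std::size_t flat_index(std::size_t i, std::size_t j, std::size_t k,
                       std::size_t n_j, std::size_t n_k)
{
    return (i * n_j + j) * n_k + k;
}

// Writes the whole flattened array to a single binary file.
bool write_flat(const char* path, const std::vector<double>& data)
{
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(double)));
    return static_cast<bool>(out);  // false if open or write failed
}

// Reads one element back by seeking straight to its byte offset.
bool read_element(const char* path, std::size_t index, double& value)
{
    std::ifstream in(path, std::ios::binary);
    in.seekg(static_cast<std::streamoff>(index * sizeof(double)));
    in.read(reinterpret_cast<char*>(&value), sizeof(double));
    return static_cast<bool>(in);  // false if open, seek or read failed
}
```

With the 3 x 3 x 4 example shown earlier, a[1][0][0] sits at position 12 and a[2][1][3] at position 31, so each 2D block is exactly n_j * n_k elements after the previous one.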

I hope this helps...
Last edited on May 18, 2009 at 3:58pm
May 18, 2009 at 4:23pm
I will play around with opening and closing each file as I write data, and also writing to a single file. Thanks for the help.
May 18, 2009 at 4:54pm
There is a system limit on the number of file descriptors per process. The default soft limit is commonly 1024 on Linux.

You can check your value with ulimit -n.
ulimit -n -H gives the hard limit, i.e. the maximum you can raise it to without root privileges.
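
The same values can be read from inside a program on POSIX systems with getrlimit and RLIMIT_NOFILE. The sketch below is POSIX-only (it will not build on Windows), and the two function names are made up for illustration. Raising the soft limit only works around the symptom; it does not address the design concern raised earlier in the thread.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_NOFILE (POSIX)

// Reads the soft limit (what `ulimit -n` reports) and the hard limit
// (what `ulimit -n -H` reports) for open file descriptors.
bool query_fd_limits(rlim_t& soft, rlim_t& hard)
{
    struct rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return false;
    }
    soft = lim.rlim_cur;
    hard = lim.rlim_max;
    return true;
}

// Tries to raise the soft limit to the hard limit for this process.
// No root privileges are needed, but some systems may still refuse.
bool raise_fd_limit_to_hard()
{
    struct rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}
```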