File Read/Write Operations

I have no idea what good practice is for performing read/write operations on large data files (nothing larger than, say, 100 MB).

I'm looking more for what a good strategy would be to read and write info on this scale.

Two strategies come to mind, but I'm sure there are more ways to handle this situation.

Should I:

step 1: Open two files simultaneously using ifstream and ofstream
step 2: read data from input file with getline()
step 3: parse data
step 4: write data to output file
step 5: loop back to step 2 until end of file (.eof())
step 6: close both files when completed and end program

Or Should I:

step 1: Open only the input file with ifstream
step 2: read data from input file with getline()
step 3: parse the data then write to an array with delimiter of some sort
step 4: close the ifstream and open the output file with ofstream
step 5: write the array with the parsed data to the output file.
step 6: close the output file and end program.



I'm not sure whether one strategy is easier or works better. I attempted the first strategy a little while ago and it was working fine until I got to step 4.

I am new at this, though, so maybe a fresh mind will help me finish it tomorrow. I'm just not sure I'm approaching this problem in the best/easiest way, and I'd be grateful if anybody could offer some advice.

Good night!
I suppose a good strategy for large files is to load a fairly "large" chunk into memory, operate on it, and then write that chunk out. This can be good for large files because the HDD is more efficient when used in this "burst" mode than when servicing a near-constant stream of small requests. Such a stream will probably make the processor wait on the HDD, and the intervals between requests may also make the drive slow for other tasks.

Consider your environment: if you're working on a modern PC, the large-chunk approach may be fine, but if you're on an embedded board with limited memory, consider using a smaller buffer to avoid swapping (ironically).

What went wrong at step 4?

The Linux cat command sort of works like your first scenario; perhaps looking at the source code would help:

http://www.scs.stanford.edu/histar/src/pkg/cat/cat.c
Thanks for the replies.

The answer would seem to be: "It depends on what equipment you are using and how much memory you have."

I'm not sure why the first scenario wasn't working for me. I still have more reading to do on this subject.

I would like to learn how to do it both ways just for my own knowledge.

I got the second method to work, so it must be something to do with the way I'm passing file streams around, or something else I haven't considered. (I really only started reading about passing file streams yesterday, too.)

When Bourgond brought up the point that reading from and writing to the hard drive takes a long time, it made sense to read as large a chunk of the file into memory as possible and then work on that. I can understand that "burst" read/write concept.

There might be a way to split the file up even if it were huge and burst-read/write a few sections at a time, but I think I'll try to get it working fully both of the ways listed above before I modify it for segmented bursts. I think I read about some way to keep track of where you last read from a file by counting bytes, but that is a problem for later.

Thanks for the link tipaye, I'll take a look at that and try to understand how it works.