WriteFile vs fwrite in synchronous I/O

closed account (o23q5Di1)
Hello everybody, I'm writing a program to merge big files previously created by different threads. For example, if I have 4 files, they look like this:

File 1: 1, 5, 9, 13...

File 2: 2, 6, 10, 14...

File 3: 3, 7, 11, 15...

File 4: 4, 8, 12, 16...

And of course the output file is expected to be: 1, 2, 3, 4, etc.

To do this, I have 4 threads, each reading from one of the 4 files and writing to a buffer. When the buffer is full, the main thread writes the buffer to the output file.

I've noticed that the WriteFile() function really takes a lot of time. For example, if I skip the write operation entirely, the execution takes 39 seconds. With the same parameters, when I use fwrite() to write to the output file, the whole operation takes 44 seconds, and if I use WriteFile(), as much as 132 seconds! The problem is WriteFile(), but I think that's because I'm not making the most of it. Any suggestions?

Also, should I try overlapped reading from the input files even if each time I read only 3 bytes?

Thank you so much
WriteFile is a raw, unbuffered write to the OS.
fwrite is the C runtime's buffered write.

Overlapped I/O doesn't make the write go any faster; it just lets WriteFile return once the operation has been started. You still have to check at some later time whether the operation completed successfully.
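To make the difference concrete, here's a rough side-by-side sketch of the two paths (file names and buffer size are placeholders, not your actual code):

#include <windows.h>
#include <cstdio>
#include <vector>

int main()
{
    std::vector<char> buffer(1 << 20, 'x');   // 1 MiB of data to write

    // CRT path: fwrite() copies into the CRT's internal buffer, and the CRT
    // issues large writes to the OS on its own schedule.
    FILE* f = std::fopen("out_crt.bin", "wb");
    if (f) {
        std::fwrite(buffer.data(), 1, buffer.size(), f);
        std::fclose(f);                       // flushes the CRT buffer
    }

    // Win32 path: every WriteFile() call goes straight to the OS, so many
    // small calls mean many kernel transitions and many small disk requests.
    HANDLE h = CreateFileA("out_api.bin", GENERIC_WRITE, 0, nullptr,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h != INVALID_HANDLE_VALUE) {
        DWORD written = 0;
        WriteFile(h, buffer.data(), static_cast<DWORD>(buffer.size()),
                  &written, nullptr);
        CloseHandle(h);
    }
    return 0;
}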
closed account (o23q5Di1)
So I should use buffered I/O in this case, if I want I/O operations to be faster. What disadvantages does buffered I/O have?
If all you're doing is dealing with large amounts of data to read/write, then buffered I/O is what you want. It makes the best use of the I/O system.

You'd use asynchronous I/O if the timing of your piece was more important than processing large volumes elsewhere on the system.
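If you stay with the CRT, you can also hand it a bigger buffer with setvbuf(), so your small writes get coalesced into large ones before they reach the OS. A minimal sketch, with the file name, buffer size, and record contents made up:

#include <cstdio>
#include <vector>

int main()
{
    std::vector<char> crtBuffer(1 << 20);          // 1 MiB CRT buffer (size is a guess)
    FILE* out = std::fopen("merged.bin", "wb");
    if (!out) return 1;

    // _IOFBF = fully buffered; must be called before the first read/write.
    std::setvbuf(out, crtBuffer.data(), _IOFBF, crtBuffer.size());

    char record[3] = { '1', ',', ' ' };            // stand-in for a 3-byte record
    for (int i = 0; i < 1000000; ++i)
        std::fwrite(record, 1, sizeof record, out); // tiny writes, coalesced by the CRT

    std::fclose(out);                               // flushes whatever is left
    return 0;
}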
closed account (o23q5Di1)
Thank you so much, that's what I'm going to do then.

By the way, someone wrote "You want unbuffered output whenever you want to ensure that the output has been written before continuing." [http://stackoverflow.com/questions/1450551/buffered-i-o-vs-unbuffered-io]
Well, I can't really see the point. Isn't that asynchronous output?
Or maybe it means that, since the buffer has to be completely filled first, there is a little delay in buffered I/O, but we don't care as long as the data is sooner or later written (that is to say, as long as the program doesn't crash). In my case, if the program crashes, I don't care how much data has been written; I only care about knowing whether ALL the data has been written or not.

Is that wrong? I'm really interested in this buffered I/O thing, you know. Thanks again
The comment is correct. The two views are not in disagreement. You force your I/O through at the expense of other operations.

The idea is that a device has a maximum I/O rate. In the case of a disk, that's a straight sequential read or write without seeks. Additionally, DMA allows the disk to perform this read/write from a contiguous region of memory without main CPU interference. Anything that deviates from this pattern of I/O introduces delay.

Buffered I/O collects the data into larger contiguous regions to be read/written. An unbuffered write forces a write of a relatively tiny amount of data, which gets the data onto the device but interferes with the buffered big-block reads/writes.
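A common compromise (just a sketch; the helper name and file name are invented here) is to write through the buffered CRT and only force the data to the device at the points where you really need the guarantee:

#include <windows.h>
#include <cstdio>
#include <io.h>

// Hypothetical helper: empty the CRT buffer into the OS, then ask the OS to
// commit its cache to the device, so everything written so far is on disk.
void commit_to_disk(FILE* f)
{
    std::fflush(f);                                          // CRT buffer -> OS
    HANDLE h = reinterpret_cast<HANDLE>(_get_osfhandle(_fileno(f)));
    FlushFileBuffers(h);                                     // OS cache -> device
}

int main()
{
    FILE* out = std::fopen("merged.bin", "wb");
    if (!out) return 1;
    const char data[] = "1, 2, 3, 4";
    std::fwrite(data, 1, sizeof data - 1, out);
    commit_to_disk(out);   // only here do we pay the cost of forcing the write
    std::fclose(out);
    return 0;
}

Note that fflush() alone only pushes the CRT buffer into the OS cache; it's FlushFileBuffers() that makes the OS commit its cache to the device.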
closed account (o23q5Di1)
I've just read about DMA, and it seems to me that the delay occurs every time the CPU initiates a transfer. So the more I/O operations the CPU has to initiate, the more delay there will be.
Is that wrong? Are there other reasons for the slowness of unbuffered I/O?

So when is it recommended to use unbuffered I/O functions such as WriteFile()? I understand they are slower than buffered ones, but what about the advantages? Are they more immediate, not having to wait for the buffer to be filled and then written/read?
The machine speed doesn't matter. It's the disk speed that matters, as the disk is slower by huge amounts.
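In practice that means keeping the number of disk operations small. For the 3-byte records mentioned earlier, the same idea applies to reading: pull in a big block at a time and walk it in memory. A sketch, with the file name, block size, and record layout all made up:

#include <cstdio>
#include <vector>

int main()
{
    FILE* in = std::fopen("file1.bin", "rb");
    if (!in) return 1;

    std::vector<char> block(1 << 20);   // 1 MiB per read
    size_t got;
    while ((got = std::fread(block.data(), 1, block.size(), in)) > 0) {
        for (size_t i = 0; i + 3 <= got; i += 3) {
            // process the 3-byte record at block[i..i+2] here
        }
    }
    std::fclose(in);
    return 0;
}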
Topic archived. No new replies allowed.