Reading xander333's XOR Encryption code, I noticed I/O is performed byte by byte.
So I wonder: isn't there a performance penalty for doing it this way, as opposed to using buffers?
I can imagine that as SSD technology gains ground, the simple way might become the better way, since seek times would be much lower... perhaps to the point where working at the byte level no longer wastes time?
Or am I naive, thinking that xander333's code actually does as executable what it reads as source?
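For reference, the byte-by-byte pattern I'm talking about looks roughly like this (a minimal sketch, not xander333's actual source; the file names and key are made up):

#include <cstddef>
#include <fstream>
#include <string>

int main()
{
    const std::string key = "secret";        // placeholder key
    std::ifstream in("input.bin", std::ios::binary);
    std::ofstream out("output.bin", std::ios::binary);

    char c;
    std::size_t i = 0;
    while (in.get(c))                                            // one library call per byte read...
        out.put(static_cast<char>(c ^ key[i++ % key.size()]));   // ...and one per byte written
}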
"Or am I naive, thinking that xander333's code actually does as executable what it reads as source?"
I don't understand that line.
Also, I have another version which reads blocks of data. I haven't tested the speed differences yet, but I could do that tomorrow if you'd like. I decided to keep it as simple as possible for the source code section, hence the byte-by-byte approach. By "buffer", do you mean reading in blocks, or extracting with the >> operator?
What Disch said... assuming the I/O isn't already buffered anyway by the OS or after optimization.
Please do test for speed, then post the block-reading code... and the test code.
Besides OS buffering, the fstream object maintains a buffer too (technically, it holds a pointer to a filebuf, which maintains the buffer).
Even though the program inputs individual bytes with ifstream::get() and outputs individual bytes with ofstream::operator<<, the data is passed between the program and the OS in chunks of 8 KB, 4 KB, or whatever your C++ library authors decided is best. You can observe this on Linux with the strace utility.
Compared to disk I/O, function call overhead is usually negligible, but if profiling shows that it is a problem, I would at least use streambuf iterators to avoid the complexity of formatted stream I/O, although block I/O would certainly get rid of even more function calls. Beyond that, one could step outside standard C++ and use memory-mapped file I/O (available in Boost).
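For instance, a rough sketch of the streambuf-iterator approach could look like the following (file names and the key are placeholders, error handling omitted):

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

int main()
{
    const std::string key = "secret";        // placeholder key
    std::ifstream in("input.bin", std::ios::binary);
    std::ofstream out("output.bin", std::ios::binary);

    std::size_t i = 0;
    // Copy straight from the input streambuf to the output streambuf,
    // XORing each byte and bypassing the formatted operator>>/operator<< layer.
    std::transform(std::istreambuf_iterator<char>(in),
                   std::istreambuf_iterator<char>(),
                   std::ostreambuf_iterator<char>(out),
                   [&i, &key](char c) { return static_cast<char>(c ^ key[i++ % key.size()]); });
}

Each byte still costs an iterator operation, but on typical implementations that is mostly an inlined access to the filebuf's buffer rather than a full get() call with sentry construction.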
PS: @xander333, std::ifstream out? Shouldn't that be std::ofstream?
The overhead introduced by reading and writing each character separately is fairly high.
With that approach, I get a speed of about 19 MB/s. With the previous block-based approach (buffer size reduced to 1 MB), I get ~262 MB/s. And with the following modification, which allows the compiler to XOR 16 bytes at a time using SSE2 instead of operating on single bytes, I get 658 MB/s:
// Expand the key into a fixed-size 64-byte buffer (the key repeated) so the
// inner loop has a compile-time trip count the compiler can vectorize.
const int keyBufSize = 64;
char keyBuf[keyBufSize];
for (int i = 0; i < keyBufSize; i++)
    keyBuf[i] = key[i % key.size()];
[...]
// XOR the block in chunks of keyBufSize bytes...
int j = 0;
for (; j < blocksize - blocksize % keyBufSize; j += keyBufSize)
{
    for (int k = 0; k < keyBufSize; k++)
        buffer[j + k] ^= keyBuf[k];
}
// ...then handle any remaining tail bytes one at a time.
for (; j < blocksize; j++)
    buffer[j] ^= key[j % key.size()];
Edit: it should be mentioned that the buffer-based approach requires choosing a block size that is a multiple of the key length. Either that, or avoid resetting the key index to 0 inside the loop.
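To illustrate the second option, the read/XOR/write loop can simply carry a running key offset across blocks instead of restarting it at 0 for each block. A minimal sketch with placeholder names (this is not the code that was actually benchmarked):

#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

int main()
{
    const std::string key = "secret";        // placeholder key
    std::vector<char> buffer(1 << 20);       // 1 MB block; any size works now

    std::ifstream in("input.bin", std::ios::binary);
    std::ofstream out("output.bin", std::ios::binary);

    std::size_t keyPos = 0;                  // carried from one block to the next
    while (in)
    {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize n = in.gcount();   // may be a partial block at EOF
        if (n <= 0)
            break;
        for (std::streamsize j = 0; j < n; ++j)
        {
            buffer[j] ^= key[keyPos];
            keyPos = (keyPos + 1) % key.size();  // never reset inside the loop
        }
        out.write(buffer.data(), n);
    }
}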
The output files were verified in each case to match the output of the original version exactly. Program execution was measured with the bash builtin "time", averaged over three runs.
(So I was wrong: in this case, file I/O does not actually dwarf the extra function call overhead.)