Reading and Writing at the Same Time

I am trying to encrypt a big file (>3 GB).
Because of its size, I want to read and write it byte by byte instead of first reading the whole file and then writing it back out. (Is there a better alternative?)
So I tried to read a byte, encrypt it, and write it back in place. Something like:
extern unsigned char encrypt(int);

int main(int argc, char *argv[]){
 int c;
 FILE *f = fopen("myfile", "r+");
 while((c = getc(f)) != EOF){
  *(f->_ptr - 1) = encrypt(c);
 }
 fclose(f);
 return 0;
}


Unfortunately this does not work. How can I make it work properly with such big files?
Thank you very much for your answers.
It doesn't work because you are trying to play with the FILE*'s internals, which you should not do.

Use fseek() to adjust the file position.
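#include <stdio.h>

extern unsigned char encrypt( int );  // defined elsewhere, as in the original post

int main()
  {
  int c;
  FILE* f = fopen( "myfile", "r+b" );  // don't forget to open in BINARY mode 
  while ((c = getc( f )) != EOF)
    {
    fseek( f, -1, SEEK_CUR );  // step back onto the byte just read
    c = encrypt( c );
    fputc( c, f );
    }
  fclose( f );
  return 0;
  }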


Hope this helps.
Yeah, that's what I usually would do (use fseek), but fseek doesn't work for such big files.
Hmm, I didn't realize it would do that. Sorry.

Use fgetpos() and fsetpos(), which are designed for use with huge files.
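#include <stdio.h>

extern unsigned char encrypt( int );  // defined elsewhere, as in the original post

int main()
  {
  int    c;
  fpos_t pos;
  FILE*  f = fopen( "myfile", "r+b" );
  while (1)
    {
    fgetpos( f, &pos );  // remember where this byte starts
    c = fgetc( f );
    if (c == EOF) break;
    c = encrypt( c );
    fsetpos( f, &pos );  // go back to it
    fputc( c, f );
    }
  fclose( f );
  return 0;
  }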

That should definitely work...
Unfortunately that does not work either.
fputc and fgetc each set and require their own flag, and not just the flag differs: these functions also change something else (including _cnt and _ptr), but I don't know what. If I knew, I could manipulate FILE directly.
Here is an example that works for small files, because fseek seems to reset whatever state getc/putc need:
#include <stdio.h>

int main(){
  int c;
  FILE* f = fopen("myfile", "r+b");
  if(!f){
    return 1;
  }
  while((c = fgetc(f)) != EOF){
    fseek(f, -1, SEEK_CUR);  // step back onto the byte just read
    fputc(c + 1, f);         // overwrite it
    fseek(f, 0, SEEK_CUR);   // no-op seek so that the next fgetc works
  }
  fclose(f);
  return 0;
}
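
For reference, the C standard says that on a stream opened for update, output must not be directly followed by input without an intervening fflush() or file positioning call (fseek, fsetpos or rewind), and input must not be directly followed by output without a positioning call unless the read hit end-of-file. That would explain why the extra fseek(f, 0, SEEK_CUR) helps, and why the fgetpos()/fsetpos() loop misbehaves: fgetpos() is not a positioning call, so fgetc() ends up directly after fputc(). Below is a minimal sketch under that assumption, working block by block and using only fgetpos(), fsetpos() and fflush(). The names transform_in_place and toy_encrypt are hypothetical; toy_encrypt merely stands in for the real encrypt(). Whether fpos_t can represent positions beyond 2 GB still depends on the library.

```c
#include <stdio.h>

/* Hypothetical stand-in for the real encrypt(): XOR each byte with 0x5A. */
static unsigned char toy_encrypt(int c)
{
    return (unsigned char)(c ^ 0x5A);
}

/* Rewrites the file at `path` in place, one block at a time.
   Returns 0 on success, -1 on any I/O error.
   Not reentrant because of the static buffer. */
int transform_in_place(const char *path)
{
    static unsigned char buf[64 * 1024];
    int ok = 1;
    FILE *f = fopen(path, "r+b");
    if (!f) return -1;

    while (ok) {
        fpos_t pos;
        size_t n, i;

        if (fgetpos(f, &pos) != 0) { ok = 0; break; }  /* remember block start */
        n = fread(buf, 1, sizeof buf, f);
        if (n == 0) { ok = !ferror(f); break; }        /* end of file or error */

        for (i = 0; i < n; i++)
            buf[i] = toy_encrypt(buf[i]);

        ok = fsetpos(f, &pos) == 0          /* read -> write: reposition first */
          && fwrite(buf, 1, n, f) == n
          && fflush(f) == 0;                /* write -> read: flush first */
    }

    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

Calling transform_in_place("myfile") a second time restores the original contents, but only because the XOR stand-in is its own inverse.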
I don't know what else to say. fgetpos() and fsetpos() are specifically designed for this. If you can't get it to work then there is something wrong with your standard library.

What version of the standard library are you using (or what brand and version of compiler are you using)?

Don't dink with the internals of a FILE structure. Playing with it doesn't modify your file in any way, but the next time a standard file function uses it, it could damage your file.

If you are on *nix you can check whether you have fseeko() (or fseeko64())... but those are non-standard.
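
On POSIX systems the large-file variants are usually spelled fseeko() and ftello(), which take an off_t instead of a long. The following is a sketch only, assuming a POSIX C library; the _FILE_OFFSET_BITS define is the glibc convention for getting a 64-bit off_t on 32-bit targets, and overwrite_byte_at is a hypothetical helper name.

```c
#define _FILE_OFFSET_BITS 64   /* glibc: request a 64-bit off_t on 32-bit targets */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: overwrites the single byte at `offset` in `path`.
   Returns 0 on success, -1 on failure. */
int overwrite_byte_at(const char *path, off_t offset, unsigned char value)
{
    int ok;
    FILE *f = fopen(path, "r+b");
    if (!f) return -1;

    ok = fseeko(f, offset, SEEK_SET) == 0   /* offset is an off_t, not a long */
      && fputc(value, f) != EOF;

    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}
```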
He could do fgetc() until a buffer is filled, process that buffer, write the results, then repeat until the entire file has been processed. Of course, this method won't work on binary files (with the EOF character and all that).
On Windows, you have SetFilePointerEx() (http://msdn.microsoft.com/en-us/library/aa365542(VS.85).aspx). Of course, it is also non-standard. And you have to deal with WINAPI's clumsy structures.
Wouldn't it be possible to in-line an ASM statement that returns one bit after the other, encrypt it, and pass it back using another ASM statement?
I really would like to use something standard, not just something for Windows, but thanks.

@Duoas I use Dev C++ Version 4.9.9.2, but in Microsoft Visual C++ 2008 Express Edition it also doesn't work.

@helios how exactly do you mean that?

@toshiro Maybe you know an example or something like that?
al has to be 02h to read and write in a file and ah has to be 3dh to open and 3eh to close, right? How can I write afterwards? I only know how to read a file with assembly more or less.
Well, I'm sorry. The standard fseek() and ftell() pass the offset as a long, which is 32 bits with both of those compilers, so you'll never be able to use an offset larger than +2^31-1 with them. A coworker once told me about Boost while we were talking precisely about this, but so far I haven't found the way to do it.
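
To spell out the arithmetic: fseek() takes its offset as a long. Where long is 32 bits, the largest offset is 2^31-1 = 2,147,483,647 bytes, just under 2 GiB, while a 3 GB file needs offsets up to about 3,000,000,000. A small sketch of that check follows; fits_in_fseek_offset is a hypothetical helper name, and the result for large values depends on the platform's long.

```c
#include <limits.h>

/* Returns 1 if `offset` can be passed to fseek() on this platform, else 0.
   With a 32-bit long, anything above 2147483647 is rejected. */
int fits_in_fseek_offset(long long offset)
{
    return offset >= LONG_MIN && offset <= LONG_MAX;
}
```

With a 32-bit long, fits_in_fseek_offset(3000000000LL) returns 0; with a 64-bit long it returns 1.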

How do I mean what?
He could do fgetc() until a buffer is filled, process that buffer, write the results, then repeat until the entire file has been processed.
This is what I mean:
#define BUFFER_SIZE (4*1024*1024)
static unsigned char buffer[BUFFER_SIZE]; //static: 4 MiB is too much for the stack.
int c=0; //fgetc() returns an int.
while (1){
    long a;
    for (a=0;a<BUFFER_SIZE && (c=fgetc(f))!=EOF;a++){
        buffer[a]=(unsigned char)c;
    }
    //'a' now contains how many bytes were read.
    processBuffer(buffer,a); //I suppose the function would need the output file or something.
    if (c==EOF)
        break;
}

Note that this approach, like I said before, doesn't work for binary files, as they can contain data well after a byte with 0xFF. So you could only do this for the encryption routine. Another solution I guess would be to design the algorithm so that 0xFF is an impossible byte. This is, of course, a rather huge workaround to avoid using OS calls.
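
One caveat on the 0xFF concern: fgetc() returns an int, so as long as its result is stored in an int, a 0xFF byte arrives as 255 and stays distinct from EOF. The two only collide when the result is stored in a char first. Reading with fread() on a stream opened in binary mode sidesteps the question, because it reports a byte count instead of a sentinel value. A minimal sketch, in which counting bytes stands in for the real processing and count_byte is a hypothetical name:

```c
#include <stdio.h>

/* Reads `path` block by block in binary mode and counts how often the
   byte `wanted` occurs. Stores the file length in *total.
   Returns the count, or -1 on error. */
long long count_byte(const char *path, unsigned char wanted, long long *total)
{
    static unsigned char buffer[64 * 1024];
    long long count = 0;
    size_t n, i;
    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    *total = 0;
    /* fread() returns how many bytes it delivered; 0 means EOF or error. */
    while ((n = fread(buffer, 1, sizeof buffer, f)) > 0) {
        for (i = 0; i < n; i++)
            if (buffer[i] == wanted)
                count++;
        *total += (long long)n;
    }

    if (ferror(f)) count = -1;
    fclose(f);
    return count;
}
```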
Thanks a lot for your help!

I am now using something similar to your code, helios. I create a new file with a unique name, read the old file and write to the new one, then delete the old file and rename the new one.
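
A rough sketch of that copy-then-rename approach, using only standard C calls. The names encrypt_via_temp and toy_encrypt are hypothetical, toy_encrypt stands in for the real encrypt(), and the caller is assumed to supply a temporary path that is unused and on the same volume as the original.

```c
#include <stdio.h>

/* Hypothetical stand-in for the real encrypt(): XOR each byte with 0x5A. */
static unsigned char toy_encrypt(int c)
{
    return (unsigned char)(c ^ 0x5A);
}

/* Writes an encrypted copy of `path` to `tmp_path`, then replaces the
   original with it. Returns 0 on success, -1 on failure. */
int encrypt_via_temp(const char *path, const char *tmp_path)
{
    static unsigned char buf[64 * 1024];
    size_t n, i;
    int ok = 1;
    FILE *out;
    FILE *in = fopen(path, "rb");
    if (!in) return -1;
    out = fopen(tmp_path, "wb");
    if (!out) { fclose(in); return -1; }

    while (ok && (n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < n; i++)
            buf[i] = toy_encrypt(buf[i]);
        if (fwrite(buf, 1, n, out) != n) ok = 0;
    }
    if (ferror(in)) ok = 0;
    fclose(in);
    if (fclose(out) != 0) ok = 0;

    if (!ok) {                    /* leave the original untouched on failure */
        remove(tmp_path);
        return -1;
    }
    /* rename() onto an existing name is implementation-defined in ISO C,
       so the original is removed first. */
    if (remove(path) != 0 || rename(tmp_path, path) != 0) return -1;
    return 0;
}
```

The original is only deleted after the copy has been written and closed successfully, so a failed run leaves it intact, at the cost of temporarily needing disk space for both files.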

But I would be happy if anyone knew a better way that doesn't double the space used on the hard disk. In the meantime I will try to find something in assembly.
Wait, what?
You're using a file as a buffer? The buffer is supposed to be in memory. That thing must be slow as hell!

Oh, I see...
By "unique name" I assume you mean you name it something like "aoigybailuygvboasiegviowuyvbgpiuwbsg.qhahf". It would make more sense to ask the user for a name and keep the old file.
If you're going to use Assembly, you might as well use system calls. That thing is not going to be portable.
If you could explain to me how I can use processBuffer without closing and reopening the file, that wouldn't be necessary.
The program is meant to encrypt files and folders automatically, and the user shouldn't be asked to rename his files every time.
Oh, I thought you wanted to encrypt just one file per execution. If that's not the case, your approach makes sense.
Note: In my previous post, I wrote 'bit', but of course I meant 'byte'.

jmc: I had better not post any ASM code, because my experience thus far is limited to 80186 processors, and no file interaction has taken place. I would try to implement it, but I don't have a testbed system where I am now, and I don't know how portable code tested on the 8086 emulator would be.