opening text files as binary files

Jan 20, 2010 at 8:29am
Say I have a text file (assuming it's stored in ASCII encoding, which is 1 byte per character).
Can I open it as a binary file and cast each byte to a char as I read it? Or read a chunk of bytes and cast it to a C-string or string object?
The reason I want to do this is to take advantage of the random-access property of binary files.
Jan 20, 2010 at 12:05pm
What prevents you from accessing a text file randomly?
Last edited on Jan 20, 2010 at 1:29pm
Jan 20, 2010 at 3:08pm
C++ doesn't play fair with random access on "text" files because of newline translations.

If you must randomly access your text files, use binary mode.
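
As a rough illustration of the binary-mode approach, here is a minimal sketch that seeks to a byte offset and reads a chunk straight into a std::string, so no per-byte cast is needed. The helper name read_chunk and its parameters are made up for this example; it assumes a single-byte encoding such as ASCII and a C++11 compiler.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not from the thread): read up to `count` bytes
// starting at byte `offset`. Returns fewer bytes (possibly none) if the
// file is shorter than requested or cannot be opened.
std::string read_chunk(const std::string& path, std::streamoff offset, std::size_t count)
{
    std::ifstream in(path, std::ios::binary);   // binary mode: no newline translation
    if (!in) return std::string();

    in.seekg(offset, std::ios::beg);            // offset is a plain byte position
    if (!in) return std::string();

    std::string buffer(count, '\0');
    in.read(&buffer[0], static_cast<std::streamsize>(count));
    buffer.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was actually read
    return buffer;
}
```

Any CR or LF bytes in the requested range come back untouched, so the caller decides what to do with them.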

Of course, now you must be careful to handle the newlines yourself. The common newline sequences are

  CR LF (Windows)
  LF (Unix, Linux, Mac OS X)
  CR (classic Mac OS)
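
One possible way to handle these sequences yourself is to convert everything to LF after reading. The function below is only a sketch under that assumption; normalize_newlines is a hypothetical name, not a standard library function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): convert CR LF pairs and
// lone CR bytes to a single LF. Existing LF bytes are copied unchanged.
std::string normalize_newlines(const std::string& raw)
{
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r') {
            out += '\n';
            if (i + 1 < raw.size() && raw[i + 1] == '\n')
                ++i;   // CR LF counts as one newline: skip the LF
        } else {
            out += raw[i];
        }
    }
    return out;
}
```

Keep in mind that positions in the normalized string no longer match byte offsets in the file whenever a CR LF pair was collapsed, so seek using the original offsets.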

Hope this helps.
Jan 21, 2010 at 1:39am
so it is possible then to do like I said in my first post?
Jan 21, 2010 at 1:48am
@Duoas,
Are CR and LF Carriage Return and LineFeed (which would be '\r' and '\n' respectively)?

I never got why windows and DOS used both...

@unregistered,
I think it's safe to assume that Duoas was saying you can use random access on text files if you open them in binary mode; but you'll have to handle different forms of newline yourself, because different operating systems format text files in different ways, like Duoas said.

You could probably stand to do that with the preprocessor:
#if defined _WIN32
// use CR LF
#elif defined(__unix__) || defined(__APPLE__)
// use LF
#else
// use CR
#endif 
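
A compile-time switch only tells you which platform the program was built for, not where the file came from. As an alternative sketch, the convention can be guessed at run time from the bytes themselves. The enum NewlineStyle and the function detect_newline are hypothetical names for this example, and the code assumes the first newline found is representative of the whole file.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration only.
enum class NewlineStyle { None, LF, CRLF, CR };

// Report the convention used by the first newline sequence found in `data`.
// Assumption: the file uses one convention consistently.
NewlineStyle detect_newline(const std::string& data)
{
    const std::size_t pos = data.find_first_of("\r\n");
    if (pos == std::string::npos) return NewlineStyle::None;
    if (data[pos] == '\n') return NewlineStyle::LF;
    // data[pos] is CR here: check whether an LF follows it.
    if (pos + 1 < data.size() && data[pos + 1] == '\n') return NewlineStyle::CRLF;
    return NewlineStyle::CR;
}
```

Running this on the first chunk read in binary mode is usually enough, as long as that chunk contains at least one complete newline sequence.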
Jan 21, 2010 at 1:48am
Yes, it is possible. Note what Duoas posted about newlines: different OSes use different end-of-line patterns, or whatever that means.
Jan 21, 2010 at 1:52am
What it means is that a text file saved on Windows might look like this (in a text editor that has visible EOLs enabled):
This is some text, look, newline: CR LF
This is the second line. CR LF
This is the third. EOF


In POSIX format, the same file would be saved as:
This is some text, look, newline: LF
This is the second line. LF
This is the third. EOF


So if you opened the first in a UNIX text editor that didn't know to open it as a Windows-formatted file, you might get something like this:
This is some text, look, newline: CR
This is the second line. CR
This is the third.

(note, the CR is visible even though visible EOLs are disabled).

At least, that's what I infer from Duoas' post and from what I knew beforehand...
Jan 21, 2010 at 1:58am
@chrisname
LOL, if your post is for me: I'm not asking what it means, I'm just expressing myself, like "whatever that means"...

Anyway, you have a good post for me.
Last edited on Jan 21, 2010 at 1:59am
Jan 21, 2010 at 2:18am
Oh, ok, sorry :)