What happens when EOF occurs but there is more data?

I am using Windows, so EOF would be <ctrl-d>, right?
What if those ASCII characters occurred halfway through a file, and how do I avoid that if it truly ends the file?
And how do you code EOF for this and other OSes so that the files produced are exactly the same when written?
On Windows it is ^Z
(^D is *nix)
By definition, EOF = no more data
There are functions for checking whether you've reached EOF on a file.

Using streams:

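ifstream fstr;
...
char ch;
while (fstr.get(ch))  // test the read itself; eof() only turns true after a read has failed
{
  // use ch
}
if (fstr.eof())
{
  // the loop stopped because the end of the file was reached
}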


Using c-style FILE*:

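FILE* fptr;
...
int ch;  // int, not char, so the EOF return value stays distinct from every byte value
while ((ch = fgetc(fptr)) != EOF)
{
  // use ch
}
if (feof(fptr))
{
  // the loop stopped because the end of the file was reached
}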


Cheers,
Jim
Thanks for the replies.
So how do I avoid EOF while working but keep all the other characters?
Also, I assume that a .mkv file will work on Linux and Mac OS X; I know it does on Windows. But how do these systems know when EOF is reached if it is different on every OS?
If you use the functions to tell you when EOF is reached, rather than checking the bytes for a value of EOF yourself, you'll be right.

The functions know the difference between EOF on different operating systems.
I guess the real questions I need to ask are:
If I write it on a Windows PC with Code::Blocks and then recompile it for Linux and Mac, will the same EOF be used as on the Windows PC?
If not, how would I specify in my source code that the Windows EOF should be used when compiling for a different OS, so that all the files generated would be exactly the same? There must be some way, because some files I download are actually for Linux but Windows knows where they end.
At an OS level, if you're creating the data files you don't need to worry about EOF - just closing the output stream will do the job for you.

If you're reading the files, you don't have to worry about the EOF character - just use the stream::eof() or feof() functions.

If it's a specific protocol/data format within the files (say an mkv chapter marker), that won't change depending on which operating system the file's sitting on.
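A minimal sketch of the point about closing the stream, assuming a binary-mode write: the file ends up holding exactly the bytes that were written, and nothing is appended to mark the end. The function names and the file name in the usage note are made up for illustration.

```cpp
#include <fstream>
#include <string>

// Writes `data` to `path` in binary mode and closes the stream.
bool write_file(const std::string& path, const std::string& data)
{
    std::ofstream out(path, std::ios::binary);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.close();   // closing is all that is needed; no end-of-file byte is written
    return !out.fail();
}

// Returns the size of the file in bytes, or -1 if it cannot be opened.
long long file_size(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // open positioned at the end
    if (!in) return -1;
    return static_cast<long long>(in.tellg());
}
```

Calling write_file("eof_demo.bin", "hello") and then file_size("eof_demo.bin") should report 5, the same number of bytes that went in.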

Cheers,
Jim
What if those ASCII characters occurred in a file

Nothing special happens in this case.


There are three similarly named concepts here:

1. EOF the condition of the input stream

2. EOF the C macro

3. EOF the ASCII control character

The input stream condition "EOF" is triggered by reaching the end of the input file on an input file stream, by reaching the end of the underlying string on an input string stream, or by receiving an unescaped Ctrl-D (Unix) / Ctrl-Z (Windows) from the user on the standard input stream.

EOF the C macro is the special non-character integer value that is returned by C library input functions when the underlying stream has encountered the EOF condition.

EOF the ASCII control character (Ctrl-Z, ASCII 26), if found in a file, terminates text mode file input in CP/M, MS-DOS, and Windows, but has no special meaning in Linux, Solaris, and other UNIX-like systems.
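To make the difference between the stream condition and the control character concrete, here is a small sketch using an input string stream. The sample text and function names are invented for the example. The Ctrl-Z byte inside the string is read like any other character, and the EOF condition only shows up once the underlying string is exhausted.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Counts the characters an input stream delivers before the EOF condition.
std::size_t count_until_eof(std::istream& in)
{
    std::size_t count = 0;
    char ch;
    while (in.get(ch))   // the read itself fails once the input is exhausted
    {
        ++count;
    }
    return count;
}

// Hypothetical sample data: Ctrl-Z (ASCII 26) sits in the middle of the text.
std::size_t count_sample_with_ctrl_z()
{
    // The literal is split so that "\x1A" is not merged with the 'a' that follows.
    std::istringstream in(std::string("before\x1A" "after"));
    return count_until_eof(in);
}
```

count_sample_with_ctrl_z() counts all 12 characters (6 + 1 + 5), and eof() on the stream is true only after the loop has ended.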
So if I use the ASCII symbol EOF in my files, it will not end them until there is no more data in the file and the real EOF is reached?
Also, how would I be able to compile the same EOF regardless of OS, so that the Linux version has the exact same EOF in my files as the Windows version?
For our level of interaction, the EOF that signals the condition that the end of the file has been reached is NOT stored in the file - it's a condition that's signalled by the operating system, that the C-library functions will see, and then tell you via the eof()/feof() functions, or by returning -1 (EOF macro) from getc(), for example.

The library functions for Windows know how to interpret a Windows end-of-file signal. The library functions for Linux know how to interpret a Linux end-of-file signal. When you compile, you use the library for the operating system you're using, so you don't have to worry about it.

Jim
What I am asking is what it writes as EOF. I want a file that is generated on a Windows PC to be interchangeable with a file on Linux or OS X, so that any system can read it, like an AVI or MKV or MPG.
Thanks
What I am asking is what it writes as EOF.
It depends on how the file is opened when reading. If opened as text, everything after the EOF character will be unreadable. If opened as binary, nothing special will happen.
If opened as text, everything after the EOF character will be unreadable.


Tested.. I stand corrected, Windows does in fact terminate text mode file input if Ctrl-Z (ASCII 26) is encountered. I had no idea backwards compatibility with CP/M is in effect even in 2011.

Linux, Solaris, and LynxOS (other systems I have at hand) read every character in text mode.
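A sketch for repeating that test on your own system, assuming nothing beyond the standard library. It writes a 7-byte file with Ctrl-Z in the middle and counts what comes back in text mode and in binary mode. The helper names and the file name are made up.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Writes "abc", Ctrl-Z (ASCII 26), "def" (7 bytes) to a test file.
void write_ctrl_z_sample(const std::string& path)
{
    std::ofstream out(path, std::ios::binary);
    out << "abc" << '\x1A' << "def";
}

// Counts the characters read from `path` using the given open mode.
std::size_t count_chars(const std::string& path, std::ios::openmode mode)
{
    std::ifstream in(path, mode);
    std::size_t count = 0;
    char ch;
    while (in.get(ch))
    {
        ++count;
    }
    return count;
}
```

After write_ctrl_z_sample("ctrlz_demo.txt"), count_chars("ctrlz_demo.txt", std::ios::in | std::ios::binary) should give 7 everywhere. With plain std::ios::in (text mode), the result reported above suggests 3 on Windows and 7 on the Unix-like systems.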
It depends on how the file is opened when reading. If opened as text, everything after the EOF character will be unreadable. If opened as binary, nothing special will happen.


That's misleading - it totally depends on the software reading the written file, and if the software's written correctly, using the eof()/feof() functions, it will read straight past the EOF character, stopping at the REAL end-of-file.

For example, use the following code to generate a text file with an EOF character in the middle.

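#include <stdio.h>

int main(void)
{
	FILE* fptr;

	// fopen_s is the Microsoft (C11 Annex K) variant of fopen
	if (fopen_s(&fptr, "c:\\EOFs.txt", "w") == 0)
	{
		fputs("This is a file", fptr);
		// fputc converts its argument to unsigned char, so EOF (typically -1) is written as byte 0xFF
		fputc(EOF, fptr);
		fputs("This is after the EOF", fptr);
		fclose(fptr);
	}
	return 0;
}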

Open the text file (I used notepad++) and you'll see the following content.

This is a fileÿThis is after the EOF


Depending on the O/S, the physical DISK may have an end-of-file marker byte, but it's NOT part of the actual file content, and WILL NOT SHOW UP as a read character by software reading the file content.

For binary files (or the example file created by the code above), your fgetc() call may return -1/EOF. If it does, you should then check feof() to see whether it's a byte value read from the file, or whether you really have reached the end of data.

For the file above, fgetc() returns 255 for the 0xFF byte as long as you keep the result in an int, so only the actual EOF condition returns -1, at which point feof() will return TRUE. If you store the result in a (signed) char instead, you'll see -1 twice: the first for the byte that's actually in the file, for which feof() returns FALSE, and the second for the real end of the file.
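A quick way to check what fgetc() hands back for a 0xFF byte is sketched below. It relies on the C library rule that fgetc() returns the byte as an unsigned char converted to int, and it assumes the usual case where EOF is -1. The function and file names are invented for the example.

```cpp
#include <cstdio>

// Writes 'A', byte 0xFF, 'B' to `path` in binary mode.
bool write_ff_sample(const char* path)
{
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    std::fputc('A', f);
    std::fputc(0xFF, f);
    std::fputc('B', f);
    return std::fclose(f) == 0;
}

// Counts bytes with the result kept in an int, as fgetc() returns it.
int count_with_int(const char* path)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return -1;
    int count = 0;
    int ch;   // 0xFF arrives as 255, which is distinct from EOF
    while ((ch = std::fgetc(f)) != EOF) ++count;
    std::fclose(f);
    return count;
}

// Counts bytes until a value squeezed into a signed char compares equal to EOF.
int count_with_signed_char(const char* path)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return -1;
    int count = 0;
    signed char ch;   // 0xFF becomes -1 here and looks like EOF when EOF is -1
    while ((ch = static_cast<signed char>(std::fgetc(f))) != EOF) ++count;
    std::fclose(f);
    return count;
}
```

With the three-byte sample file, count_with_int() sees all 3 bytes, while count_with_signed_char() stops after the first one on platforms where EOF is -1. Calling feof() at that point would return false, which is how the false alarm can be detected.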

Cheers,
Jim
Linux, Solaris, and LynxOS (other systems I have at hand) read every character in text mode.
Oh? I wasn't aware of this. This is why text mode sucks. What were K&R thinking when they made it the default?
To elaborate on helios's response:

All those systems are *nix-based/-derived systems, so there is no difference between text and binary mode when reading files; a file is just a sequence of bytes where convention holds that '\n' is considered a line separator for textual information.

Windows, in contrast to Unix, tended to use both old printer codes: '\r' (carriage return) then '\n' (line feed) together to represent a line separator for textual information.

How does one handle this in a cross-platform manner? Simple. Open a file in "binary" mode to read every character as it comes. Open a file in "text" mode to treat a "\r\n" sequence as just a '\n' on Windows systems, or read every character as it comes on *nix systems.

This magic newline transformation is called "text mode", and it exists only to make handling textual information easier. That's all.

(This is the same concept, but different consequence, than file transfer protocols, which must convert the source system's newline sequence into the target system's newline sequence when transferring "text" data, but leave the data unmodified when transferring "binary" data.)
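For files that must be byte-for-byte identical across platforms, the binary-mode advice can be sketched like this. The helper writes some text with a chosen open mode and then reports the raw bytes that ended up on disk. The function name and file name are hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Writes `text` to `path` with the given mode and returns the bytes on disk.
std::string bytes_on_disk(const std::string& path, const std::string& text,
                          std::ios::openmode mode)
{
    {
        std::ofstream out(path, mode);
        out << text;
    }   // the stream is closed here, before the file is read back

    // Read back in binary mode so that no translation hides what was stored.
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Writing "a\nb\n" with std::ios::out | std::ios::binary should store exactly those 4 bytes on any system. With std::ios::out alone (text mode), the stored bytes are expected to be "a\r\nb\r\n" on Windows and unchanged on *nix, following the description above.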
it totally depends on the software reading the written file
Well, yeah, that's exactly what I said. Did I at some point imply that it depended on the hardware or something?

if the software's written correctly, using the eof()/feof() functions, it will read straight past the EOF character, stopping at the REAL end-of-file.
For example, if the software opens the file in binary mode.

Open the text file (I used notepad++)
Notepad++ opens files as binary. Otherwise, it would not be able to handle such things as files with inconsistent newlines.

For binary files (or the example file created by the code above), your fgetc() call may return -1/EOF. If it does, you should then check feof() to see whether it's a byte value read from the file, or whether you really have reached the end of data.
Uh-huh. Since we're talking about properly-written software, nobody who knows what they're doing does it like that. Reading from a binary file is done with fread() or std::ifstream::read(), and to know whether it's the EOF, either the value of the get pointer is compared with the file size, or the EOF flag is checked.
Using fgetc() to read from a file is, IMO, moronic.
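A rough sketch of the block-reading approach mentioned here, using std::ifstream::read() together with gcount() and the EOF flag. The buffer size and function name are arbitrary choices for the example, not anything prescribed in the thread.

```cpp
#include <fstream>
#include <string>

// Reads the whole file in fixed-size blocks and returns the byte count.
// Returns -1 if the file cannot be opened or reading stops before the end.
long long count_bytes_blockwise(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;

    char buffer[4096];
    long long total = 0;
    // read() fails on the last, partial block; gcount() says how much of it arrived.
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0)
    {
        total += in.gcount();
    }
    return in.eof() ? total : -1;   // the EOF flag confirms why reading stopped
}
```

Because the data is never inspected byte by byte, values such as 0xFF or Ctrl-Z inside the file play no part in deciding where the file ends.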


Duoas: What about the differing behavior when encountering the EOF character?
Now I am more confused.
Can you make the EOF the same across different platforms? In other words, can you force it?
I want my Windows EOF to be the same as an OS X EOF for the files that I intend to make.
@helios I think we all know that I was demonstrating the theory of how to read a file and tell whether you've reached the end; the finer points of the demonstration (whether notepad++ opens in binary mode or not), and the implementation don't matter.

This whole topic seems to have gone off-track; the OP's question was whether he had to write a different EOF to the end of files for different operating systems. All this fluff about text mode, binary mode and CR/LF sequences is only going to confuse the matter.

[EDIT]

Now I am more confused.


You see?
@Onceler

Is it your intention to place an actual EOF byte (-1) in the data to act as a delimiter, for example, of your own protocol or data format?

For example, using 0x01 as a start marker, and EOF as an end marker

<0x01>Data goes here<EOF>

and then write that to a file?

Jim