unexpected EOF on reading a file

Dec 11, 2011 at 2:56am
Some of you will recognize this routine from some help you've given me in the past. It's a general-purpose routine to perform a formatted read of a number of elements from a text file.

template <class T> int getVector(ifstream & s,
								 const int32_t	& nbrCells,
								 typename vector<T>::iterator iter)
{
	int		rc = 0;
	union
	{		int32_t		temp;
			uint32_t	utemp;
	} intBuffer;

	for (int i = 0; i < nbrCells; i++)
	{
		if (s.flags() & ios::hex)
			s >> intBuffer.utemp;
		else
			s >> intBuffer.temp;

		if (s.good())
		{
			*iter++ = intBuffer.temp;
		}
		else if (s.eof())
		{
			rc = 1;
			cout << "getVector: encountered end of file." << endl;
		}
		else if (s.fail())
		{
			rc = 1;
			cout << "getVector: read fail." << endl;
		}
	}
	return rc;
}


I've encountered a mildly unexpected problem with it. I have a data file that contains exactly 256 elements (the actual number isn't important). All of the elements are separated by line breaks, BUT the final element doesn't have a line break after it.

When I read all 256 in, s.eof() returns true. All I want to use the eof() function for is to ensure that I've gotten all the data out of the file; that is, that I've fully populated my vector. Am I misusing eof() here, and/or is there a better way to perform that test?

There is a workaround for this problem: if I edit my input files and put an end of line after the last element, it seems to work OK. I'd prefer not to have to do this, though.

Thanks for any ideas.
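
The behaviour described above can be reproduced without a file. The following is a minimal sketch, assuming a std::istringstream stands in for the ifstream; the names StateAfterLastRead and readAll are hypothetical and are not part of the original routine. It records the stream state right after the last successful extraction, which shows that a missing final line break sets eofbit without setting failbit.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper type: stream state captured immediately after the
// last extraction that succeeded.
struct StateAfterLastRead {
    int  count;  // number of values extracted successfully
    bool eof;    // eof() right after the last successful extraction
    bool fail;   // fail() right after the last successful extraction
};

// Reads every integer in `text` and reports what the flags looked like
// after the final good read.
StateAfterLastRead readAll(const std::string& text)
{
    std::istringstream s(text);
    StateAfterLastRead r{0, false, false};
    std::int32_t value;
    while (s >> value) {
        ++r.count;
        r.eof  = s.eof();
        r.fail = s.fail();
    }
    assert(s.fail());  // the loop only ends once an extraction has failed
    return r;
}
```

With the input "1\n2\n3" the last read hits the end of the data while scanning the digits, so eof is true even though the value was read correctly; with "1\n2\n3\n" the scan stops at the line break and eof stays false. In both cases fail is false, which suggests testing fail() rather than eof() or good() to decide whether a value was obtained.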
Dec 11, 2011 at 3:29am
It is much simpler to do something like this:

typename vector<T>::iterator save = iter;
if (s.flags() & ios::hex)
{
    uint32_t tmp;
    while (s >> tmp) *iter++ = tmp;
}
else
{
    int32_t tmp;
    while (s >> tmp) *iter++ = tmp;
}

if (s.bad()) { /* handle error */ }
if (distance(save, iter) != nbrCells) { /* handle missing items */ }


Dec 11, 2011 at 3:36am
Thanks, PG. I can't quite simplify it that much, since not every function is going to read the entire file. But, I think I can still use your technique with the distance() function, can't I?

Also: is distance() a part of the standard STL? The only online reference to it I can find is for MS variants of C++. This code has to be portable.
Last edited on Dec 11, 2011 at 3:38am
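
On the portability question: std::distance is part of the C++ standard library and is declared in the header <iterator>, so it is not specific to any one compiler. A minimal sketch of how it can count the elements written through an iterator follows; the function name fillAndCount and the values it stores are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: writes `limit` values starting at `first` and uses
// std::distance to report how many slots were filled. The caller must make
// sure the vector has at least `limit` elements from `first` onward.
std::ptrdiff_t fillAndCount(std::vector<int>::iterator first, int limit)
{
    assert(limit >= 0);
    std::vector<int>::iterator iter = first;
    for (int i = 0; i < limit; ++i)
        *iter++ = i * 10;
    // For random-access iterators such as vector's, this is a subtraction.
    return std::distance(first, iter);
}
```

Saving the starting iterator and calling std::distance afterwards gives the count without a separate counter variable.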
Dec 11, 2011 at 3:44am
Dec 11, 2011 at 4:06am
OK, that's good news. But, I still need the check for good(), so I can update my vector. If I use bad(), it will return true on an EOF, right? Or, are you suggesting I just go ahead with the read, and check that I "got them all" at the end?
Dec 11, 2011 at 5:16am
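There is an implicit check in the while (s >> tmp), so you don't need a separate call to good().

The expression s >> tmp yields s. Evaluating s in a boolean context is the same as calling !s.fail(). That is, if (s) and if (!s.fail()) mean the same thing. Unlike good(), this test ignores eofbit, so successfully reading a final value that has no line break after it still counts as a success. And no, bad() does not return true on an EOF; it reports only an unrecoverable stream error.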
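
Once such a loop has stopped, the state flags can tell you roughly why. The sketch below is an illustration only; StopReason and drain are hypothetical names. It assumes that bad() means an I/O error, that eof() without bad() means the data ran out, and that anything else is malformed text. A truncated final token can set both eofbit and failbit, so treat the classification as approximate.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical classification of why `while (s >> tmp)` stopped.
enum class StopReason { EndOfData, BadFormat, IoError };

// Extracts integers until an extraction fails, stores how many succeeded
// in `count`, and inspects the flags to say why the loop ended.
StopReason drain(std::istream& s, int& count)
{
    int tmp;
    count = 0;
    while (s >> tmp)
        ++count;
    assert(s.fail());  // the loop exits only after a failed extraction
    if (s.bad()) return StopReason::IoError;    // badbit: stream is unusable
    if (s.eof()) return StopReason::EndOfData;  // ran out of input
    return StopReason::BadFormat;               // non-numeric text in the way
}
```

For example, "1 2 3" drains to EndOfData with a count of 3, while "1 2 x 4" stops at the x with BadFormat and a count of 2.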
Dec 12, 2011 at 11:07pm
I got pulled off on another task for a couple of days. Here's what I've done with your suggestions; please tell me what you think:

	int		rc = 0;
	typename vector<T>::iterator save = iter;
	union
	{		int32_t		temp;
			uint32_t	utemp;
	} intBuffer;

	while (s.good() && (distance(save, iter) != nbrCells))
	{
		if (s.flags() & ios::hex)
			s >> intBuffer.utemp;
		else
			s >> intBuffer.temp;
		*iter++ = intBuffer.temp;
	}


If I've done this right, this will read until 1) there's a problem with the input stream, or 2) I've gotten the number of items from the read the calling function requested.
Dec 13, 2011 at 1:31am
It's broken. You call s >> intBuffer.temp and assign that data to the vector before checking if the stream is still good. The extraction may have failed, leaving garbage data in the vector.
Dec 13, 2011 at 3:00am
True enough. OK, how about this?

	typename vector<T>::iterator save = iter;

	if (s.flags() & ios::hex)
	{
		uint32_t	utemp;
		while ((s >> utemp) && (distance(save, iter) != nbrCells))
			*iter++ = utemp;
	}
	else
	{
		int32_t		temp;
		while ((s >> temp) && (distance(save, iter) != nbrCells))
			*iter++ = temp;
	}
Dec 13, 2011 at 8:18pm
OK, I discovered a problem with this: the extraction will execute once more than necessary in some cases. I fixed it like this:

while ((distance(save, iter) != nbrCells) && (s >> utemp))

The question is, can I rely on this? Specifically, can I rely on the compiler (indeed, all compilers) to evaluate the expression on the left *before* evaluating/extracting as per the right?

Thanks...
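
On the evaluation-order question: for the built-in && operator, the C++ language guarantees that the left operand is evaluated first and that the right operand is not evaluated at all when the left one is false, so the count test can safely guard the extraction. Below is a minimal sketch of the whole routine built on that guarantee. The name readCells, the std::istream parameter, and the choice to return the number of values stored are assumptions made for illustration, not the original getVector interface.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical variant of getVector: reads at most nbrCells values and
// returns how many were actually stored. The destination must have room
// for nbrCells elements starting at iter.
template <class T>
int readCells(std::istream& s, int nbrCells,
              typename std::vector<T>::iterator iter)
{
    assert(nbrCells >= 0);
    typename std::vector<T>::iterator save = iter;
    if (s.flags() & std::ios::hex) {
        std::uint32_t utemp;
        // Count test first: once it is false, no extraction is attempted.
        while (std::distance(save, iter) != nbrCells && (s >> utemp))
            *iter++ = static_cast<T>(utemp);
    } else {
        std::int32_t temp;
        while (std::distance(save, iter) != nbrCells && (s >> temp))
            *iter++ = static_cast<T>(temp);
    }
    return static_cast<int>(std::distance(save, iter));
}
```

A caller would compare the return value with nbrCells to detect a short read, for example readCells<int>(file, 256, v.begin()) != 256. Because the extraction is skipped once the count is reached, the next value stays in the stream for the following call.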