fstream::peek() sets badbit

I have a program intended to parse a downloaded HTML document into a CSV, but the HTML has incomplete line endings: instead of a CR/LF pair (0Dh 0Ah, or 0x0D,0x0A) it has one or the other. This breaks the fstream functions I use, so I'm attempting to replace every instance of 0x0A and 0x0D with 'L'.

However, about a quarter of the way through the file, the badbit gets set (the file is just under 40,000 characters long).

Here is the function that is misbehaving.

/* getFileLength is a custom function that does just that */

int prepBadFile(fstream* file){
    // remember get pointer position ...
    int memoryPos = file->tellg();
    //  ... and seek to beginning
    file->seekg(0, fstream::beg);
    
    // initialize function variables
    fstream::pos_type pos = 0;
    int maxPos = getFileLength(file);
    int iter = 0;
    char c;
    
    cout << "File length: " << maxPos << "\n";
    cin.get();
    
    while(iter != maxPos){
       c = file->peek();
       if (!file->good()) cout << "BadBitBadBitBadBitBadBit!\n";
//       cout << c;
       if(c==0x0A | c==0x0D) {
          file->seekp(pos);
          *file << 'L' << flush;
       }
//       cout << "\n";
       pos = pos + (fstream::pos_type) 1;
       file->seekg(pos);
       iter++;
    }
    return 0;
}


Please help: my last post got no replies within a week, and this has a time constraint!
If it's just a carriage return you want, why not just replace each 0x0A and 0x0D with "\r"?

I think there's a problem with flush, since the badbit gets set when flush fails. It may be possible to call flush too often, but I don't know enough about it...
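One thing that might be worth checking first: good() returns false when any of eofbit, failbit, or badbit is set, so the "BadBit" message in the loop does not by itself prove that badbit is the flag being raised. A minimal sketch for telling them apart (describeState is a made-up helper name, not something from the standard library):

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: names the state flags currently set on a stream.
std::string describeState(const std::ios& s) {
    if (s.good()) return "good";
    std::string out;
    auto add = [&out](const char* name) {
        if (!out.empty()) out += '|';
        out += name;
    };
    // Test each flag directly; fail() alone would also be true for badbit.
    if (s.rdstate() & std::ios::eofbit)  add("eofbit");
    if (s.rdstate() & std::ios::failbit) add("failbit");
    if (s.rdstate() & std::ios::badbit)  add("badbit");
    return out;
}
```

Printing describeState(*file) instead of the fixed message inside the loop should show whether the stream hit end-of-file (for example when peek() runs past the last character) or whether a write or flush really failed. Either way, file->clear() is needed before the stream can be used again.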
That was why the file went bad, thanks. I have no fix though, so I am just going to try to deal with the 0x0Ds interfering with fstream...
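If the file ends up staying as it is, one option might be to read it in binary mode and split on either byte yourself instead of relying on getline. A rough sketch, where splitLines is a made-up name and the file contents are assumed to already be in a std::string:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits text on CR LF, a lone CR, or a lone LF.
std::vector<std::string> splitLines(const std::string& text) {
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char c = text[i];
        if (c == '\r' || c == '\n') {
            // Treat CR immediately followed by LF as a single line break.
            if (c == '\r' && i + 1 < text.size() && text[i + 1] == '\n') ++i;
            lines.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    // Keep a final line that has no terminator.
    if (!current.empty()) lines.push_back(current);
    return lines;
}
```

With this, the mixed line endings never reach the stream's own line handling, so the file does not need to be modified at all.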
Could you try something like this, without the need to call flush (pseudocode):
string::size_type start;
while(file)
{
   getline(file, strdata, delimiter);
   start = strdata.find_first_of("\r\n");   // first 0x0D or 0x0A, if any
   if(start != string::npos)
   {
      replace(0x0A or 0x0D with 'L' in file by using 'start')
   }
}

Just an idea...
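For what it's worth, here is one way the idea above might look as compilable code. It is only a sketch and it swaps the getline loop for reading the whole file into memory, which seems reasonable for a file of under 40,000 characters. replaceLineBreaks is a made-up name, and taking a path instead of an open fstream is an assumption on my part:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for prepBadFile that takes a path, not an fstream.
// Returns the number of bytes replaced, or -1 on an I/O failure.
long replaceLineBreaks(const std::string& path, char replacement = 'L') {
    // Binary mode: no newline translation, so every byte is seen as-is.
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    in.close();

    long replaced = 0;
    for (char& c : data) {
        if (c == 0x0A || c == 0x0D) {   // logical ||, not bitwise |
            c = replacement;
            ++replaced;
        }
    }

    // Rewrite the file in one go; closing the stream flushes it once.
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return -1;
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.close();
    return out ? replaced : -1;
}
```

Because reading and writing use two separate streams, there is no seeking back and forth between the get and put positions and no per-character flush.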