determining if plain text files were created in linux

i'm a linux noob and i'm writing a program that needs to determine if a plain text file was created in linux or windows. i'm inclined to do something like:

inFile.ignore(numeric_limits<streamsize>::max(), '\n')

then put the last character back into the stream and test it since the newline character is different between linux/windows, but i'm not really sure what i should be telling the program to test for (obviously not '\n') or if this is a bad way of going about it

sorry if this is an overly elementary question, but i didn't get an answer in beginners

hanst99 (2869)

There isn't really a way to do that. Sure, if you open the file in binary mode and check whether each '\n' is preceded by a '\r', you could tell what kind of newline characters it used (this only works in binary mode, because in text mode the system's default line break sequence is automatically translated to '\n'). But there is technically no rule that a Windows text program has to use \r\n as newline characters, though of course most will do that.

prophetjohn (70)

thanks. i'm probably going beyond what is required for the assignment, but i hate turning in code that could be broken even in obscure circumstances. maybe i'll rethink my algorithm and come up with one that won't have issues with windows v. linux files. if nothing else, i'll just see if he'll tell us which format the input file will be in

firedraco (6243)

If you are using standard fstreams in text mode, they automatically convert the platform's native newline sequence to a single '\n' for you, so you normally don't have to worry.

prophetjohn (70)

well, i'm kind of guessing that that's the problem. i can't find the thread i made in beginners with all the details and i'm about to board a flight, so here's a quick rundown

basically, i wrote all my code in windows, created a test file in windows. the very start of the program reads the input file into an array and prints it back out. it works fine when i compile the code in windows and use a .txt file created in windows. when i compile the code in ubuntu and use the same windows .txt file, i get (seemingly) random line breaks in the middle of the lines. if i recreate the .txt file in gedit and run the code with that, it runs as expected

hanst99 (2869)

What is your code? Would help if we could see that.

prophetjohn (70)

okay, here we go

at the very beginning of my main() i have these two function calls:

    populateWorld(FILE_NAME);
    showWorld();

here's the code for the two functions:

void populateWorld(string file) {
    ifstream fin(file.c_str());    //.c_str() because constructor needs c-string

    //make sure the file opened
    if (fin.fail()) {
        cout << "The file was not found.\nPress Enter to quit.";
        getchar();
        exit(0);
    }

    //determine the dimensions of the world and create new two-dimensional array
    getRow(fin);
    getColumn(fin);
    allocateMemory();

    //fill the world with the contents of the file
    for(int i = 0; i < numRows; i++) {
        for(int j = 0; j < numColumns; j++) {
            world[i][j].status = fin.get();
        }
        fin.ignore();   //ignore the newline character
    }
    fin.close();
}

void showWorld() {
    for(int i = 0; i < numRows; i++) {
        for(int j = 0; j < numColumns; j++) {
            cout << world[i][j].status;
        }
        cout << endl;
    }
    getchar();
}

here's the two functions that determine the dimensions of the input file:

void getRow (ifstream &fin) {
    while (!fin.eof()) {
        fin.ignore(numeric_limits<streamsize>::max(), '\n'); //#include <limits>
        numRows++;
    }

    fin.clear();                //clear the 'eof' bit first so the seek cannot fail
    fin.seekg(0, ios::beg);     //move file pointer back to the beginning of file
}


void getColumn (ifstream &fin) {
    char temp = '\0';
    while (temp != '\n' && !fin.eof()) {
        temp = fin.get();
        if (temp == '0' || temp == '1') {
            numColumns++;
        }
    }
    fin.clear();                //clear the 'eof' bit first so the seek cannot fail
    fin.seekg(0, ios::beg);     //move file pointer back to the beginning of file
}

here's the includes that i have:

#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include <limits>

here's what the very first output looks like in windows 7 (the way it should look):

00000000100000001010
00000000010000001001
11100000010100000010
10100100101010101010
00101010010010101000

and here's what it looks like in ubuntu 10.10:

00000000100000001010

00000000010000001001

1110000001010000001
0
101001001010101010
10
00101010010010101000

the weird thing is that right after these two function calls, main enters a loop that calls showWorld() at the end of the loop and the output is what's expected every time after the first. any ideas?

Duthomhas (13200)

The problem is with your input file: it still has Windows line endings.
Search for dos2unix.

You can have your code manually avoid the problem by using a custom getline() function, such as:

inline
std::istream& mygetline( std::istream& ins, std::string& s )
  {
  std::getline( ins, s );
  #ifndef _WIN32
    s.erase( s.find_last_not_of( '\r' ) + 1 );
  #endif
  return ins;
  }

Hope this helps.
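For reference, the kind of conversion a tool like dos2unix performs can be approximated in a few lines. This is a rough sketch and not the tool's actual implementation; `crlfToLf` is a hypothetical helper name, and it works on text already loaded into a string (for example from a file opened in binary mode).

```cpp
#include <string>

// Returns a copy of `text` with every "\r\n" pair replaced by "\n".
// A lone '\r' that is not followed by '\n' is left untouched.
std::string crlfToLf(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
            continue;  // skip the CR; the LF is copied on the next iteration
        }
        out += text[i];
    }
    return out;
}
```

Text that already uses '\n' alone passes through unchanged, so running it on a Linux-created file is harmless.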

prophetjohn (70)

thanks. i think that code is a little beyond what i've learned thus far, but i'll check out that google search. i'm probably overthinking this, so i'll just check with my professor when i get back to school and see what he says about it.

Duthomhas (13200)

The code just adds an extra condition if it is compiled on non-Windows machines, to strip the CR code from the end of the line of text.

On Windows, a line of text ends with '\r\n', and is read and translated to just '\n'.
On *nix, a line of text ends with '\n', so no translation occurs. When it reads a line of text from your Windows text file, it has an extra '\r' at the end of line.

I'm glad you are going to your professor for help. Good job!

prophetjohn (70)

haha, i don't know if you are making fun of me or not, but thanks

Zhuge (4664)

It's not that at all. Too many people come here expecting us to do their work for them instead of putting out effort and using resources available to them.

Duthomhas (13200)

In your case, however, you are just inexperienced and you are misunderstanding something simple. It otherwise looks like you are doing a good job!

Remember, when transferring text files between systems, use the TEXT mode. For everything else, use the BINARY mode. Your code shouldn't really have to do special stuff to handle improperly-formed text files.
