Identifying files if they are Windows Mac or Linux

Feb 24, 2011 at 6:33pm

Hello.

How can I identify a file if it is a Windows, Mac or Linux/Unix file? If possible, I would like to use normal ANSI C.

Thanks for some comments.

Last edited on Feb 24, 2011 at 7:01pm

Feb 24, 2011 at 7:17pm

Moschops (7244)

In terms of files, there is no such thing as a Windows, Mac or *nix file. What effect do you wish to achieve?

Last edited on Feb 24, 2011 at 7:17pm

Feb 24, 2011 at 7:34pm

Bazzy (6281)

For text files you can check the newlines.
Windows uses \r\n
POSIX/UNIX/Unix-like use \n
Mac (pre OS X) used \r

Feb 24, 2011 at 7:35pm

baluba (17)

What I mean is that Mac, Linux and Windows use different end-of-line sequences: \n (UNIX/Linux), \r\n (Windows), \r (Mac). I need some code that identifies these 'types of files'.

Feb 24, 2011 at 7:36pm

baluba (17)

We published at the same time, Bazzy. :-)

So I have to analyse the file based on its EOL sequences. Does the EOF carry no signature?

Feb 24, 2011 at 8:00pm

Bazzy (6281)

A simple way would be to read the file until you get a \r or a \n or a \r\n sequence.
Why do you need to know which end of line style the file has?
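The "read until the first \r, \n or \r\n" idea can be sketched roughly as follows. This is only an illustration: the names `eol_style` and `detect_eol` are made up for this example and do not come from the thread, and it assumes the stream was opened in binary mode and that the first line ending is representative of the whole file.

```c
#include <stdio.h>

/* Hypothetical names, invented for this sketch. */
enum eol_style { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR };

/* Return the style of the first line ending found in the stream.
   Assumes fp was opened in binary mode ("rb"). */
enum eol_style detect_eol(FILE *fp)
{
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            return EOL_LF;              /* Unix/Linux */
        if (c == '\r') {
            /* '\r' followed by '\n' is Windows; a lone '\r' is old Mac. */
            c = fgetc(fp);
            return (c == '\n') ? EOL_CRLF : EOL_CR;
        }
    }
    return EOL_NONE;                    /* no line ending in the file */
}
```

A file with mixed line endings would need a full scan rather than stopping at the first match.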

Feb 25, 2011 at 6:57am

baluba (17)

> Why do you need to know which end of line style the file has?

What I want to do: if an ASCII file uses an EOL convention that is not compatible with the current OS (because the file was created on a different OS), I would like to write a new file that uses the EOL convention of the current OS.

BTW: Thanks for the help.

Last edited on Feb 25, 2011 at 6:57am

Feb 25, 2011 at 8:45am

kbw (9488)

Check out the source code for unix2dos and dos2unix.

Feb 25, 2011 at 1:38pm

baluba (17)

Under Linux everything is fine, I can 'identify' the file, if it has been created by a Windows, Mac or Linux machine. However, the same code under a Windows compiler shows that Windows does not see the '\r\n':

        if( *(buffer-2) == '\r' && *(buffer-1) == '\n')
            OS_win;
        else if( *(buffer-1) == '\r' )
            OS_mac;
        else if( *(buffer-1) == '\n' )
            OS_linux;

Linux recognizes 'OS_win' but under Windows, OS_win is never reached (result is OS_linux). Why is this?

Last edited on Feb 25, 2011 at 1:39pm

Feb 25, 2011 at 3:02pm

closed account (zwA4jE8b)

Try reading in and printing out a couple of lines from a text file char by char to see what is actually there. You know: create a text file on Linux and read all the characters, create a text file on Windows and read the characters, create one on Mac OS and read the characters. At least that way you will know exactly what the end-of-line characters are.

Feb 25, 2011 at 3:17pm

kbw (9488)

You need to open the file as binary.
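The reason is that on Windows the C runtime translates \r\n to \n when a file is opened in text mode ("r"), so the '\r' never reaches the program; "rb" turns that translation off. A small sketch for checking what is really in a file is shown here; `count_eol_bytes` is a made-up helper name, not something from the thread.

```c
#include <stdio.h>

/* Count the '\r' and '\n' bytes in a file, reading it in binary mode.
   Returns 0 on success, -1 if the file cannot be opened. */
int count_eol_bytes(const char *path, long *cr, long *lf)
{
    FILE *fp = fopen(path, "rb");   /* "rb": no newline translation */
    int c;

    if (fp == NULL)
        return -1;

    *cr = 0;
    *lf = 0;
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\r')
            ++*cr;
        else if (c == '\n')
            ++*lf;
    }
    fclose(fp);
    return 0;
}
```

For a pure Windows file the two counts should be equal; for a Linux file the '\r' count should be zero.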

Feb 25, 2011 at 4:32pm

baluba (17)

> You need to open the file as binary.
Right. Here is what works under Windows and Linux:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Static, so the 1 MB buffer does not live on the stack. */
static unsigned char buffer[1000000];

int main(void)
{
    FILE *fp1, *fp2;
    long i, j;
    char path1[256], path2[256];
    long fileLen;

    strcpy(path1,"Y:\\input.txt");
    strcpy(path2,"Y:\\result.txt");

    /* Binary mode, so that Windows does not translate \r\n. */
    if((fp1=fopen(path1, "rb")) == NULL)
    {
        printf("Could not open the file No. 1!\n");
        return EXIT_FAILURE;
    }
    if((fp2=fopen(path2, "wb")) == NULL)
    {
        printf("Could not open the file No. 2!\n");
        fclose(fp1);
        return EXIT_FAILURE;
    }

    /* Determine the file size, then read the whole file into the buffer. */
    fseek(fp1, 0, SEEK_END);
    fileLen=ftell(fp1);
    fseek(fp1, 0, SEEK_SET);
    if(fileLen < 0 || fileLen > (long) sizeof(buffer))
    {
        printf("The file does not fit into the buffer!\n");
        fclose(fp1);
        fclose(fp2);
        return EXIT_FAILURE;
    }
    fileLen = (long) fread(buffer, 1, (size_t) fileLen, fp1);

    /* Replace every \r (0x0D) with \n (0x0A); copy all other bytes unchanged. */
    j = 0;
    for(i = 0; i < fileLen; i++)
    {
        if(buffer[i] == 0x0D)
        {
            //buffer[i] = 0x0D;
            //fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            buffer[i] = 0x0A;
            fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            j++;
        }
        else
            fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
    }

    printf("Number of changes: %ld\n",j);

    fclose(fp1);
    fclose(fp2);

    return EXIT_SUCCESS;
}


This converts a Mac file to a Linux file.

What do you think?

Last edited on Feb 25, 2011 at 4:47pm

Feb 25, 2011 at 4:36pm

kbw (9488)

Why all the seeking? Shouldn't it just work on streams?

Why don't you make it portable so it takes the src/dst platform and encode the file as specified?

Feb 25, 2011 at 4:47pm

baluba (17)

> Why all the seeking? Shouldn't it just work on streams?
Well, I found this somewhere. I guess it is to determine the size fileLen.

> Why don't you make it portable so it takes the src/dst platform and encode the file as specified?
What do you mean? I'm a beginner, sorry. Modify the code.

Feb 25, 2011 at 5:15pm

kbw (9488)

> Well, I found this somewhere. I guess it is to determine the size fileLen.

Use stat() if you need to get file metadata. You don't need to know the file size to perform the conversion; you just need to read a buffer full at a time. The buffer should optimally be a multiple of the disk block size, but it doesn't really matter as there's loads of buffering going on.

> What do you mean? I'm a beginner, sorry. Modify the code.

You've written it to convert EOL sequences to Windows' default. But you could just as easily format it for other OS'. Why hard code it for Windows when you could make it more general? And if you use standard C/C++, it'll compile on other platforms.
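As a rough sketch of both suggestions (no seeking, and a target convention chosen by the caller), something like the following could work. The function name `convert_eol` and its parameters are invented for this illustration. It assumes both streams are opened in binary mode and relies on stdio's own buffering instead of an explicit block buffer.

```c
#include <stdio.h>

/* Copy `in` to `out`, replacing every \r\n, lone \r or lone \n with `eol`
   (for example "\n", "\r\n" or "\r"). Both streams should be binary.
   Returns the number of line endings written, or -1 on a write error. */
long convert_eol(FILE *in, FILE *out, const char *eol)
{
    long count = 0;
    int c;
    int prev_cr = 0;    /* was the previous byte a '\r'? */

    while ((c = getc(in)) != EOF) {
        if (c == '\n' && prev_cr) {
            prev_cr = 0;    /* second half of \r\n, already converted */
            continue;
        }
        prev_cr = (c == '\r');
        if (c == '\r' || c == '\n') {
            if (fputs(eol, out) == EOF)
                return -1;
            count++;
        } else if (putc(c, out) == EOF) {
            return -1;
        }
    }
    return count;
}
```

Because the source convention is not hard coded, the same call handles Windows, Linux and old Mac input, and only the `eol` argument changes with the destination platform.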

Feb 25, 2011 at 5:30pm

baluba (17)

> You've written it to convert EOL sequences to Windows' default. But you could just
> as easily format it for other OS'. Why hard code it for Windows when you could make
> it more general? And if you use standard C/C++, it'll compile on other platforms.

Ah, I see. Of course, one could make it more general; it is just an example. Only this part has to be changed in order to adapt the file to the OS:

...
if(buffer[i] == 0x0D)
        {
            //buffer[i] = 0x0D;
            //fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            buffer[i] = 0x0A;
            fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            j++;
        }
...

0x0D = \r
0x0A = \n

Windows = 0x0D + 0x0A
Linux = 0x0A
Mac = 0x0D
