Identifying whether a file is a Windows, Mac or Linux file

Hello.

How can I identify a file if it is a Windows, Mac or Linux/Unix file? If possible, I would like to use normal ANSI C.

Thanks for some comments.
In terms of files, there is no such thing as a Windows, Mac or *nix file. What effect do you wish to achieve?
For text files you can check the newlines.
Windows uses \r\n
POSIX/UNIX/Unix-like use \n
Mac (pre OS X) used \r
What I mean is that Mac, Linux and Windows have different end-of-line sequences: \n (UNIX/Linux), \r\n (Windows), \r (Mac). I need some code which identifies these 'types of files'.
We published at the same time, Bazzy. :-)

So I have to analyse the file based on its EOL. Does the EOF give no signature?
A simple way would be to read the file until you get a \r or a \n or a \r\n sequence.
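A minimal sketch of that idea in plain C might look like the following. The enum and the function name `detect_eol` are made up for this illustration, and the stream is assumed to have been opened in binary mode ("rb") so that no line-ending translation happens before the bytes are inspected. It only looks at the first line ending, so a file with mixed endings would be classified by whichever comes first.

```c
#include <stdio.h>

/* Hypothetical names for this sketch; they do not come from any library. */
typedef enum { EOL_NONE, EOL_LF, EOL_CRLF, EOL_CR } eol_style;

/* Reads from a stream opened in binary mode until the first line
   ending and reports which convention it uses. */
eol_style detect_eol(FILE *fp)
{
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            return EOL_LF;        /* bare LF: Unix/Linux */
        if (c == '\r') {
            c = fgetc(fp);        /* look at the byte after the CR */
            if (c == '\n')
                return EOL_CRLF;  /* CR followed by LF: Windows */
            return EOL_CR;        /* bare CR: Mac (pre OS X) */
        }
    }
    return EOL_NONE;              /* no line ending in the file */
}
```

A caller would open the file with fopen(path, "rb"), pass the FILE pointer to detect_eol, and compare the result with the convention of the current OS.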
Why do you need to know which end of line style the file has?
> Why do you need to know which end of line style the file has?

What I want to do: if there is an ASCII file with an EOL convention that is not compatible with the current OS (because the file was created on a different OS), I would like to write a new file with the EOL convention of the current OS.

BTW: Thanks for the help.
Check out the source code for unix2dos and dos2unix.
Under Linux everything is fine: I can 'identify' whether the file was created on a Windows, Mac or Linux machine. However, the same code built with a Windows compiler never sees the '\r\n':

        if( *(buffer-2) == '\r' && *(buffer-1) == '\n')
            OS_win;
        else if( *(buffer-1) == '\r' )
            OS_mac;
        else if( *(buffer-1) == '\n' )
            OS_linux;


Linux recognizes 'OS_win' but under Windows, OS_win is never reached (result is OS_linux). Why is this?
Try reading in a couple of lines from a text file and printing them out char by char to see what is actually there. Create a text file on Linux, one on Windows and one on Mac OS, and read the characters of each. That way you will know exactly what the end-of-line characters are.
You need to open the file as binary.
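To illustrate why the mode matters, here is a small sketch; the helper `count_cr` is a hypothetical name invented for this example. It counts the carriage returns a program actually sees when a file is opened with a given fopen mode.

```c
#include <stdio.h>

/* Hypothetical helper for illustration: opens `path` with the given
   fopen mode and counts the carriage returns (0x0D) the program sees.
   Returns -1 if the file cannot be opened. */
long count_cr(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    long count = 0;
    int c;

    if (fp == NULL)
        return -1;
    while ((c = fgetc(fp)) != EOF)
        if (c == '\r')
            count++;
    fclose(fp);
    return count;
}
```

On Windows, text mode ("r") typically translates each \r\n pair into a single \n before the program sees it, so count_cr(path, "r") would be expected to report 0 for a file with Windows line endings, while count_cr(path, "rb") reports one per line. On Linux both modes behave the same, which would explain why the check only fails under Windows.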
> You need to open the file as binary.
Right. Here is what works under Windows and Linux:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE *fp1, *fp2;
    long i, j, fileLen;
    char path1[256], path2[256];
    unsigned char *buffer;

    strcpy(path1, "Y:\\input.txt");
    strcpy(path2, "Y:\\result.txt");

    // Binary mode, so that the C library does not translate line endings.
    if ((fp1 = fopen(path1, "rb")) == NULL)
    {
        printf("Could not open the file No. 1!\n");
        return EXIT_FAILURE;
    }
    if ((fp2 = fopen(path2, "wb")) == NULL)
    {
        printf("Could not open the file No. 2!\n");
        fclose(fp1);
        return EXIT_FAILURE;
    }

    // Determine the file size, then read the whole file into a heap buffer
    // (a 1,000,000-byte array on the stack can overflow the default stack).
    fseek(fp1, 0, SEEK_END);
    fileLen = ftell(fp1);
    fseek(fp1, 0, SEEK_SET);
    buffer = (fileLen < 0) ? NULL : (unsigned char *) malloc((size_t) fileLen + 1);
    if (buffer == NULL || fread(buffer, 1, (size_t) fileLen, fp1) != (size_t) fileLen)
    {
        printf("Could not read the file No. 1!\n");
        free(buffer);
        fclose(fp1);
        fclose(fp2);
        return EXIT_FAILURE;
    }

    // Replace every CR (0x0D) with LF (0x0A) and copy everything else unchanged.
    j = 0;
    for (i = 0; i < fileLen; i++)
    {
        if (buffer[i] == 0x0D)
        {
            //buffer[i] = 0x0D;
            //fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            buffer[i] = 0x0A;
            fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            j++;
        }
        else
            fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
    }

    printf("Number of changes: %ld\n", j);

    free(buffer);
    fclose(fp1);
    fclose(fp2);

    return EXIT_SUCCESS;
}

This converts a Mac file (\r) to a Linux file (\n).

What do you think?
Why all the seeking? Shouldn't it just work on streams?

Why don't you make it portable so it takes the src/dst platform and encode the file as specified?
> Why all the seeking? Shouldn't it just work on streams?
Well, I found this somewhere. I guess it is to determine the size fileLen.

> Why don't you make it portable so it takes the src/dst platform and encode the file as specified?
What do you mean? I'm a beginner, sorry. Modify the code.
> Well, I found this somewhere. I guess it is to determine the size fileLen.
Use stat() (POSIX, not ANSI C) if you need to get file metadata such as the size. But you don't need to know the file size to perform the conversion; you just need to read a buffer full at a time until the end of the file. The buffer size should ideally be a multiple of the disk block size, but it doesn't really matter as there's loads of buffering going on already.

> What do you mean? I'm a beginner, sorry. Modify the code.
You've written it to convert EOL sequences to Windows' default. But you could just as easily format it for other OS'. Why hard code it for Windows when you could make it more general? And if you use standard C/C++, it'll compile on other platforms.
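The two suggestions above (read a buffer full at a time, and let the caller choose the target convention) could be combined roughly like this. It is only a sketch: the function name `convert_eol`, the 4096-byte buffer size and the return convention are assumptions made for this example, and both streams are assumed to be opened in binary mode.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, replacing every CR LF pair,
   bare CR, or bare LF with the string `eol` ("\n", "\r\n" or "\r").
   Returns the number of line endings written, or -1 on a write error. */
long convert_eol(FILE *in, FILE *out, const char *eol)
{
    unsigned char buf[4096];
    size_t n, i;
    size_t eol_len = strlen(eol);
    int pending_cr = 0;   /* previous byte was a CR (may span two buffers) */
    long count = 0;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (i = 0; i < n; i++) {
            unsigned char c = buf[i];

            if (pending_cr) {
                pending_cr = 0;
                if (c == '\n')
                    continue;  /* LF completing a CR LF that was already written */
            }
            if (c == '\r' || c == '\n') {
                if (fwrite(eol, 1, eol_len, out) != eol_len)
                    return -1;
                count++;
                pending_cr = (c == '\r');
            } else if (fputc(c, out) == EOF) {
                return -1;
            }
        }
    }
    return count;
}
```

Because the "previous byte was a CR" state is kept across fread calls, a \r\n pair that happens to straddle two buffers is still treated as one line ending. No seeking or file size is needed, so the same function would also work on pipes.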
> You've written it to convert EOL sequences to Windows' default. But you could just
> as easily format it for other OS'. Why hard code it for Windows when you could make
> it more general? And if you use standard C/C++, it'll compile on other platforms.

Ah, I see. Of course, one could make it more general. It is just an example. One only has to change this part in order to adapt the file to the OS:

...
        if(buffer[i] == 0x0D)
        {
            //buffer[i] = 0x0D;
            //fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            buffer[i] = 0x0A;
            fwrite(&buffer[i], sizeof(buffer[i]), 1, fp2);
            j++;
        }
...


0x0D = \r (carriage return)
0x0A = \n (line feed)

Windows = 0x0D + 0x0A
Linux = 0x0A
Mac (pre OS X) = 0x0D