New line in a text file

Forum

Forum
General C++ Programming
New line in a text file

New line in a text file

Jun 1, 2010 at 7:19pm

I need to write strings in a file by visual C++ under windows and then to read this file in the Linux or vice versa. But I have the problems: if I write file by C++ in Linux and then open in windows by notepad, new line character is lost and everything is on single line. then If I write a file in windows by visual C++ and then open it looks normal. but when I open this file in Linux by vi editor at the end of each lines there are additional character: ^M and this makes problems. even size of files written by same codes are different. The file written under windows is bigger by bites equal to number of lines then file written under Linux.

could you please advice me how to avoid this?
Any help would be appreciating!

Jun 1, 2010 at 7:36pm

Athar (4466)

You need to choose one style, preferably the Linux one. Pretty much every Windows editor other than Notepad will be able to display it correctly (including Wordpad, which also comes with Windows).
Make sure not to write files in text mode, or else your newlines might automatically be replaced with '\r\n' under Windows.

Jun 1, 2010 at 8:16pm

moorecm (1932)

Look into the fileformat option in VIM (:set fileformat=dos).

Also consider a global substitution to remove \r (:s/^M//g). Be sure to enter the ^M (Ctrl-M) sequence with Ctrl-V Ctrl-M.

Last edited on Jun 1, 2010 at 8:22pm

Jun 1, 2010 at 10:22pm

fotoni2 (28)

That's correct if I open this files with another Windows editor it display correctly. My actual problem is that due to this ^M extra characters size of file is different If I write a files in Windows or in Linux. I have remembered addresses of some part of date then I can find it using seekp() and extract but if I will remove those extra characters under Linux then those addresses are not valid any more .

So my question is how should I write new line character in Visual C++ so that it be free from ^M character when I open it in Linux?

thanks!

Last edited on Jun 1, 2010 at 10:31pm

Jun 1, 2010 at 10:32pm

Athar (4466)

So my question is how should I write new line character in Visual C++ so that it be free from ^M character when I open it in Linux?

Just make sure that you open the file in binary mode. Then even on Windows, no ^M should be written.

Jun 2, 2010 at 2:19am

Duthomhas (13276)

No.

The whole point of the textual file mode is to accomodate newline-sequence differences. Open your file normally, use "\n" or std::endl to write newlines to your file. This will do the correct thing on your current platform.

If you wish to transport the file to a different OS, make sure to transfer it using the appropriate text-mode protocol. Failing that, use the dos2unix and unix2dox utilities to convert your text files to use the native newline format.

Hope this helps.

Jun 2, 2010 at 9:56am

fotoni2 (28)

Since I am working mostly with text file and I need to transfer file between different OS use of binary mode should not be good idea. If I would use

dos2unix and unix2dox utilities to convert text files

this will change size of a text file otherwise it's ok. But I have built my program in such way that keeping the size of a file is important.

Is it possible somehow write newline character with dos compilers without this carriage return character?
Will binary mode provide that without another problems?

Last edited on Jun 2, 2010 at 9:58am

Jun 2, 2010 at 10:32am

Athar (4466)

Since I am working mostly with text file and I need to transfer file between different OS use of binary mode should not be good idea.

What makes you think that? Using binary actually is a very good idea. You already know what the problem with text mode is - it's the reason for this thread after all. If you use text mode, the same program will produce different output on different operating systems - usually not something you want. So just keep your newlines to a simple '\n'.

Jun 2, 2010 at 10:49am

Galik (2254)

Just out of curiosity here. I understand that std::endl on a Windows system may produce "\r\n". But is it also the case that sending the single character '\n' using the operator<<() function will do the same? Will the operator<<() function send "\r\n" when the programmer simply specified '\n'?

Last edited on Jun 2, 2010 at 10:49am

Jun 2, 2010 at 10:54am

Athar (4466)

Yes, it will do that even if you explicity write '\n'. That's what I consider evil about text mode - it tampers with the data you're trying to write. Of course, it wastes a lot of performance doing that too.
As long as the files your program generates are never meant to leave your own system, that's okay - in that case it's fine to let text mode choose whatever might work best with the native text editors. In most other cases, it's not okay.

Last edited on Jun 2, 2010 at 10:58am

Jun 2, 2010 at 2:18pm

Duthomhas (13276)

When writing text files, it is always OK -- as it is the correct behavior. [edit] It doesn't hurt performance either... [/edit]

If you write text files containing only the LF newline sequence, then external programs will be unable to properly understand your file on systems where it is not the native format.

Keeping track of the exact size of a text file across systems is the error here. The files will be different sizes, of course, as the newline sequence is a different size. strlen( "\n" ) == 1; strlen( "\r\n" ) == 2.

  win_text_size = linux_text_size + number_of_newline_chars_in_file

Unless you have a specific requirement otherwise, use the system-default newline sequence.

  ■ Windows = CR LF
  ■ Linux = LF
  ■ MAC < 0SX = CR

It is incumbent upon the user of said systems to properly maintain their text files when transferring them to other systems. Besides being the norm, a lot of transfer software is designed to automatically handle this properly for you, so it is not such a great burden on your users.

What is a burden, and what will be considered a defect in your software by users, is having a program that does not comply with the system-default newline sequence, making access to text files produced by your program on the same system obnoxious. Don't do that to your users.

Remember that text files are not designed to be a specific size -- they are specifically variable record length. If you must monitor the file size in some way, make sure to account for the equation above, so that you can match to what you have stored. (Remember that your users can still booger this by editing the file directly... something you cannot do with binary [non text!] files.)

If we may ask, what exactly are you trying to accomplish. Perhaps we can suggest a better way?

Last edited on Jun 2, 2010 at 2:19pm

Jun 2, 2010 at 3:46pm

fotoni2 (28)

thanks a lot all of you for this helpful discussion!

Of course my task could be done in much more intelligent way than I am doing it now. but I don't know..

my task is following:
I have blocks of data containing approximately (100-300) chars. I want to save and handle them easily. also add new blocks. Number of those blocks should be thousands and increasing arithmetically.

I am just writing those blocks in text file and saving tracks of them with theirs specifiers. But when I transfer this file from DOS to Unix and convert to remove \r chars the tracks of blocks are missed.

Any hint for even completely different better method would be appreciating.

Jun 2, 2010 at 4:32pm

Duthomhas (13276)

If I understand correctly, you want to be able to have an offset into the file to index a specific block instantly, something like:

1
2

  fin.seekg( block_index[ some_N ] );
  getline( fin, block );

First, how are you storing said indices? If you keep them separate from the file, then you are asking for trouble, because, again, a text file can be easily modified by anyone with any simple editor.

It would be simplest to simply re-create the index list, once, at the beginning of your program. (You would have to do this anyway, in addition to code to check to see if the file is still in sync with your list of indices, so you might as well just do it to begin with and avoid all the extra checks and extra/separate storage for the list.)

Hope this helps.

Jun 2, 2010 at 4:46pm

Galik (2254)

@fotoni2

I am not sure that I completely understand what you are doing but if you are treating your text files as a primitive database then I would have thought it best to treat everything as binary data. Even if its textual content.

Jun 2, 2010 at 7:37pm

fotoni2 (28)

how are you storing said indices?

Yes, I am storing indexes in the another text file.

@Galik
yes, it can be considered as a primitive database.

thanks!

Last edited on Jun 2, 2010 at 7:39pm

Topic archived. No new replies allowed.