Actually reading the newline

Jun 14, 2011 at 7:26am
Currently I'm having problems reading input files. The problem I'm encountering is that I actually want to read the newline character instead of using it as a delimiter.

After doing some testing, it looks like std::getline, which I'm using with an ifstream, reads the newline and puts it in the string as follows: "\\n ". Does anyone know why this is happening? And more importantly, how should I read the newline so that it still works as a newline in the resulting string?

I think the problem is pretty self-explanatory. If you'd like to see some code containing the problem, please say so and I'll post my code.
Jun 14, 2011 at 8:59am
Post the code you have so far.
Jun 14, 2011 at 9:39am
std::ifstream is(filename.c_str(), std::ios::in);
intermediate_mapper_format::source s;
int line_number = 1;
while(is) {
	//Get the tag name, which is the first part before the separator
	std::string name;
	//Stop at end of file or on a read error, so no empty record is added after the last line
	if (!std::getline(is, name, ',')) break;
	boost::algorithm::trim(name);

	//Get the value
	std::string value;
	char c = is.peek();
	//Skip possible white space
	while(c == ' ') {
		is.ignore(1);
		c = is.peek();
	}
	if (c == '\'' || c == '"') {
		//If it starts with ' or " then we should also copy \n characters that are within the quote pair
		std::string partial_value;
		is.ignore(1);
		bool done = false;
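		do {
			//Stop if nothing more can be read, otherwise this loop never ends
			if (!std::getline(is, partial_value, c)) break;
			value += partial_value;
			//We're done if the '/" is followed by a newline or the end of the file,
			//otherwise the '/" has another occurrence in the value
			int next = is.peek();
			if (next == '\n' || next == std::char_traits<char>::eof()) {
				is.ignore(1);
				done = true;
			} else {
				value += c;
			}
		} while (!done);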
	} else {
		std::getline(is, value);
	}

	//Add the name and value to the source format
	intermediate_mapper_format::source::iterator i = s.find(name);
	if (i != s.end()) {
		//Take a reference: pushing onto a copy would silently drop the new value
		intermediate_mapper_format::tag_values& tvs = i->second;
		tvs.push_back(std::make_pair(line_number, value));
	} else {
		intermediate_mapper_format::tag_values tvs;
		tvs.push_back(std::make_pair(line_number, value));
		s.insert(std::make_pair(name, tvs));
	}
	line_number++;
}
intermediate_mapper_format::name_description nd;
m_imf = intermediate_mapper_format_ptr(new intermediate_mapper_format(s, nd));
is.close();


Here's the code. To give a bit more background information, it is used to parse CSV files which contain per line a name and value. However if the value starts with ' or " it is possible to have large pieces of text containing newlines. I've tested this with a CSV with the following lines:
naam, "individual model 1"
omschrijving,'raar
vreemd's
lelijke omschrijving'
puntjes,101 

I've got a workaround where I replace "\\n " with "\n", which works but is of course an ugly solution.
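
A workaround of this kind could be sketched as below. This is only an illustration: `replace_all` is a made-up helper name, not the code actually used here, and it assumes the string really holds a backslash, an `n`, and a space wherever the file had a newline.

```cpp
#include <string>

// Hypothetical helper: return `text` with every occurrence of `from` replaced by `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) return text;  // nothing to search for
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return text;
}
```

Calling `replace_all(value, "\\n ", "\n")` would turn each backslash, `n`, space sequence into one real newline character.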
Last edited on Jun 14, 2011 at 9:41am
Jun 14, 2011 at 11:11am
A '\' followed by an 'n' in memory is written as '\\n' in C code. By default, when the compiler sees \ in a string, it doesn't handle it as a normal char; instead it looks at the next character to find a special character code (like \n, for example :-P ). That means the '\' char has to be written as '\\' in a C code string, giving '\\n' in C code for a '\' and an 'n' in memory.
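
To make the difference between the two literals concrete, here is a small sketch (the function names are made up for illustration). In source code, `"\n"` is stored as one character, the newline itself, while `"\\n"` is stored as two characters, a backslash followed by the letter n.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, only here to illustrate escape sequences in string literals.

// "\n" in source code is a single character in memory: the newline.
std::string real_newline() { return "\n"; }

// "\\n" in source code is two characters in memory: a backslash and the letter n.
std::string escaped_newline() { return "\\n"; }

// Number of characters actually stored for each literal.
std::size_t real_newline_length() { return real_newline().size(); }
std::size_t escaped_newline_length() { return escaped_newline().size(); }
```

Escape sequences are handled by the compiler when it reads string literals in source code. Characters read from a file at run time are stored as they are.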
Jun 14, 2011 at 12:11pm
A '\' followed by an 'n' in memory is written as '\\n' in C code. By default, when the compiler sees \ in a string, it doesn't handle it as a normal char; instead it looks at the next character to find a special character code (like \n, for example :-P ). That means the '\' char has to be written as '\\' in a C code string, giving '\\n' in C code for a '\' and an 'n' in memory.


That would be the case if I was reading my input from memory. I'm reading from a file, so this should not be the case.
Jun 14, 2011 at 2:27pm
closed account (zb0S216C)
luc atlis wrote:
That would be the case if I was reading my input from memory. I'm reading from a file, so this should not be the case.

When a file is opened, it's loaded into memory. In a sense, you're reading from memory.

wazzak
Jun 15, 2011 at 6:12am
OK, so if I understand this correctly: whenever you read from a file and you read the newline without discarding it, you never end up with an actual newline character, and what I want to do is impossible.

On a side note, I understand why this would give me a "\\n" where there should be a newline. But why does it also add another space? I'm ending up with "\\n " wherever a newline is encountered in the input file.
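
For what it's worth, std::getline only removes the delimiter it is given. With a quote character as the delimiter, a newline inside the text is copied into the string as a real '\n' character. A minimal sketch of that behaviour, with `read_until_quote` as a made-up helper name and any std::istream standing in for the file:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: read everything up to (not including) the next `quote` character.
// Newlines before the quote are kept in the result as real '\n' characters.
std::string read_until_quote(std::istream& in, char quote) {
    std::string text;
    std::getline(in, text, quote);
    return text;
}
```

Feeding it a std::istringstream holding two lines followed by a quote gives back both lines with the newline between them intact, which suggests the "\\n " seen in the result comes from somewhere other than std::getline itself.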
Jun 15, 2011 at 11:07am
You still have the option of reading your file in binary mode. You'll get a single string containing the whole file, including the newlines (encoded as LF or CR+LF depending on your system), and you can then look for the lines manually.
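
Reading a file in binary mode into one string could look roughly like this. It is a sketch only: `read_whole_file` is a made-up name, and returning an empty string when the file cannot be opened is an arbitrary choice for the example.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: load a whole file into one string without any newline translation.
// Returns an empty string if the file cannot be opened.
std::string read_whole_file(const std::string& filename) {
    std::ifstream in(filename.c_str(), std::ios::in | std::ios::binary);
    if (!in) return std::string();
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the raw bytes, newlines included
    return buffer.str();
}
```

Because no translation happens in binary mode, a file written on Windows will still contain "\r\n" pairs, so the code that looks for the lines has to allow for both forms.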
Jun 15, 2011 at 11:26am
Well, then I'm sticking with my ugly solution. Thanks everybody for the help. At least I learned some more about reading files.
Jun 15, 2011 at 11:50am
closed account (S6k9GNh0)
Most people don't even use delimiters. They'll read the file entirely into memory and parse it using some algorithm (which may use delimiters), turning it into something useful...
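
As a rough sketch of that approach for the name/value format in this thread: the parser below works on a string that already holds the whole file. The names `name_value` and `parse_records` are made up, names are not trimmed, and the rule that a quoted value ends where its quote is followed by a newline (or the end of the text) is taken from the code posted above.

```cpp
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> name_value;

// Hypothetical parser for "name, value" records held in one string.
// A value starting with ' or " runs until that quote is followed by a newline
// or the end of the text, so it may contain newlines and the quote itself.
std::vector<name_value> parse_records(const std::string& text) {
    std::vector<name_value> records;
    std::string::size_type pos = 0;
    while (pos < text.size()) {
        // The name is everything up to the next comma.
        std::string::size_type comma = text.find(',', pos);
        if (comma == std::string::npos) break;
        std::string name = text.substr(pos, comma - pos);
        pos = comma + 1;
        // Skip spaces before the value.
        while (pos < text.size() && text[pos] == ' ') ++pos;
        std::string value;
        if (pos < text.size() && (text[pos] == '\'' || text[pos] == '"')) {
            const char quote = text[pos++];
            // Find a closing quote that ends the line or the text.
            std::string::size_type end = text.find(quote, pos);
            while (end != std::string::npos && end + 1 < text.size() && text[end + 1] != '\n')
                end = text.find(quote, end + 1);
            if (end == std::string::npos) end = text.size();  // unterminated: take the rest
            value = text.substr(pos, end - pos);
            pos = end + 2;  // step over the quote and the newline after it
        } else {
            // Unquoted value: everything up to the end of the line.
            std::string::size_type eol = text.find('\n', pos);
            if (eol == std::string::npos) eol = text.size();
            value = text.substr(pos, eol - pos);
            pos = eol + 1;
        }
        records.push_back(name_value(name, value));
    }
    return records;
}
```

Run on the sample CSV from earlier in the thread, this yields three records, and the value of the second one keeps its two newlines as real '\n' characters.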
Last edited on Jun 15, 2011 at 12:03pm