How to read in a line that INCLUDES spaces

Apr 5, 2020 at 6:06pm
I am trying to read information from a file into string variables. Here is how the data appears in the file:

TFTFTFTFTFTFTFTFTFTF ABC12345 TTFFTTFF TTTTTTTTTTF

The first two spaces are delimiters between the strings I want. However, that third space must be INCLUDED in the third string. So by the end I'd like to have three strings:

str1 = "TFTFTFTFTFTFTFTFTFTF"
str2 = "ABC12345"
str3 = "TTFFTTFF TTTTTTTTTTF"

That space in the third string is giving me real problems. I just can't figure out a way to deal with it. Following is the code that has given me the most success:

int main()
{ 

	string answersString = "";
	string studentAnswersString = "";
	string idString = ""; 
	char answerChar;
	char idChar;
	char studentAnswerChar;   

	ifstream trialRunsFile;

	trialRunsFile.open("testScoresTrial.txt");

	while (!trialRunsFile.eof())
	{
		for (int i = 0; i < 20; i++)
		{
			trialRunsFile >> answerChar; 
			answersString = answersString + answerChar; 
		} 

		trialRunsFile.ignore(1);

		for (int j = 0; j < 8; j++)
		{
			trialRunsFile >> idChar;
			idString = idString + idChar; 
		}

		trialRunsFile.ignore(1);

		for (int i = 0; i < 20; i++)
		{
			trialRunsFile >> studentAnswerChar; 
			studentAnswersString = studentAnswersString + 
                        studentAnswerChar; 
		} 	
	}

	cout << "answersString is " << answersString << endl;
	cout << "idString is " << idString << endl;
	cout << "studentAnswersString is " << studentAnswersString << endl;   

	trialRunsFile.close();   	

}


Here is the output of this code:

answersString is TFTFTFTFTFTFTFTFTFTF
idString is ABC12345
studentAnswersString is TTFFTTFFTTTTTTTTTTFF

Now this is VERY CLOSE to what I want, but the third string is wrong in two ways: it seems to have ignored the space entirely, but instead of ignoring the space and only printing 19 of the 20 characters, it has gone ahead and appended an extra 'F' on the end for some reason! Obviously my code is telling the computer to do this, but I can't for the life of me figure out where that command happens.
Apr 5, 2020 at 7:31pm

while !eof:
  read input (which may have failed)
  do something with (possibly bad) input
  loop

The answer is to always check success or failure immediately after attempting to read input. C++ makes that easy:

while (file >> var)
{
  do something with known-to-be-valid input
}


The student answers appear to be positional: a blank means that the student did not answer that question. It is entirely possible that the student did not answer the first question, so using >> (which skips leading whitespace) is not a good idea. You could read each character individually with get(), but you can also just use getline():

trialRunsFile >> answerKey;  // get the full answer key (no spaces allowed)
trialRunsFile >> id; // get the ID
trialRunsFile.get(); // skip the single space following the ID
getline( trialRunsFile, studentAnswers ); // get all characters, including spaces 

You can easily put this in a loop:

while (true)
{
  // attempt to get input
  trialRunsFile >> answerKey;  // get the full answer key (no spaces allowed)
  trialRunsFile >> id; // get the ID
  trialRunsFile.get(); // skip the single space following the ID
  getline( trialRunsFile, studentAnswers ); // get all characters, including spaces

  // failure?
  if (!trialRunsFile) break;

  // do stuff here
}

Even better, you can write a function that gets the input and returns whether it succeeded:

bool getAnswersFromFile( istream& file, string& answerKey, string& id, string& studentAnswers )
{
  file >> answerKey >> id;
  file.get();
  getline( file, studentAnswers );
  return !file.fail();  // not good(): a final line with no trailing newline sets eofbit but is still valid input
}

while (getAnswersFromFile( trialRunsFile, answerKey, id, studentAnswers ))
{
  // do something with known-to-be-valid input
}

When you start using classes you will codify this into a C++-ism:

struct TFExam
{
  string key;
  string student_id;
  string student_answers;
};

std::istream& operator >> ( std::istream& ins, TFExam& exam )
{
  ins >> exam.key >> exam.student_id;
  ins.get();
  return getline( ins, exam.student_answers );
}

TFExam e;
while (trialRunsFile >> e)
{
  // do stuff with known-to-be-valid input
  if (e.key == e.student_answers)
  {
    std::cout << "Perfect score!\n";
  }
  else
  {
    ...
  }
}
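
To check the point about a blank first answer, here is a minimal sketch that reads the same record both ways. It assumes each record ends at a newline and that exactly one space follows the ID; the function names answersViaExtraction and answersViaGetline are made up for this illustration and are not part of any library.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads "<key> <id> <answers>" using >> for all three fields.
// The third >> skips leading blanks and stops at the next blank,
// so positional information in the answers is lost.
std::string answersViaExtraction(std::istream& in)
{
    std::string key, id, answers;
    in >> key >> id >> answers;
    return answers;
}

// Reads the same record with get() + getline().
// ASSUMES exactly one space follows the ID and the record ends at a newline.
std::string answersViaGetline(std::istream& in)
{
    std::string key, id, answers;
    in >> key >> id;            // these two fields never contain blanks
    in.get();                   // consume the single separating space
    std::getline(in, answers);  // keep everything else, blanks included
    return answers;
}
```

Given the line "TFTF ID1  FTF" (two spaces after the ID, meaning question 1 was left blank), the first function returns "FTF" and the second returns " FTF", which keeps the answers lined up with the key.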

Hope this helps.
Apr 5, 2020 at 7:44pm
Very helpful, thank you!

My trepidation with using getline() is that ultimately I want to repeat this process on data for many students, and there won't be line breaks in the data. If I were to use getline() on the first student's data, it seems it would read in the entire rest of the file. I can't specify ' ' as the delimiter either, because then it would stop early whenever a student's answers include a ' ' character.

So my problem here is that sometimes I need to use ' ' as a delimiter and sometimes I need to read it in to the string itself, and I don't see how to tell the computer when to do what.
Apr 5, 2020 at 7:47pm
There are never any line breaks? Show us an excerpt of the file that shows at least 3 students, starting from the beginning.
There must be some logic behind the way the file is structured, or your file might as well be random noise.

Is the format always:
{TF string without spaces} {student id without spaces} {TF answers that may contain useless spaces}
{TF string without spaces 2} {student id 2 without spaces} {TF 2 answers that may contain useless spaces}
...

?

or is it like:
{TF string without spaces} {student id without spaces} {TF answers that may contain useless spaces} {TF string without spaces 2} {student id 2 without spaces} {TF 2 answers that may contain useless spaces} ...


______________________________________

It may be necessary that you count the length of the first string, and then you only read that number of characters after the student id is over.

What happens when the first student answer is blank? Is it guaranteed that there will only be 1 space between student id and start of student answers?
Last edited on Apr 5, 2020 at 7:54pm
Apr 5, 2020 at 7:59pm
The problem statement does not indicate that there are line breaks, though I agree that would make more sense. It just states the format of each student's data and says there are more than 150 students. The data file is not provided, so I can't really see how the author intended for the input data to be formatted. It would certainly make life easier if it were

{TF string without spaces} {student id without spaces} {TF answers with useless spaces}
{TF string without spaces} {student id without spaces} {TF answers with useless spaces}
{TF string without spaces} {student id without spaces} {TF answers with useless spaces}
...

However, I didn't want to assume this without being explicitly told so in the problem statement.

As for what happens if the first student answer is blank, I hadn't considered that contingency because I was hoping to find a way to just read all the characters, including any blanks, into a string and then deal with the blanks once the strings were read in. I.e. I wanted to read the blanks as I would any other character.
Apr 5, 2020 at 9:03pm
Alas, with only that information you will need to do as Ganado says.

Is it safe to say that there is a blank between student ID and student answers?
Apr 5, 2020 at 9:26pm
Right, and just to illustrate this some more,
- Read in "real answers"
- Read in student id
- Read in space between student id and start of student answers
- Read in student's answers, where the length(student answers) == length(real answers)

I think it's unreasonable that you're not allowed to see what the files actually look like.

I would do something like this.
// Example program
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
using namespace std;

int main()
{    
    // === READ THIS ===
    
    // UNCOMMENT THIS, replacing filename with your own
    //ifstream file("scores.txt");
    
    // COMMENT OUT THIS, use real file instead
    istringstream file("TFTFTFTFTFTFTFTFTFTF ABC12345 TTFFTTFF FFFFTTTTTTF\n FFFF    student2 T  F   \nTFFTFT student333  FFTF ");
    
    // ASSUMES that length(actual answers) == length(student's responses, including spaces)
    // ASSUMES there is only 1 whitespace character between student id and start of student responses

    string real_answers;
    while (file >> real_answers)
    {
        string student_id;
        file >> student_id;
        
        char space; // space between student id and student responses
        file.get(space);
        
        string student_answers;
        
        int num_questions = (int)real_answers.length();
        for (int i = 0; i < num_questions; i++)
        {
            char student_answer;
            file.get(student_answer);     
            student_answers += student_answer;
        }
        
        cout << "Student: " << student_id << '\n' <<
                "Student's answers: " << student_answers << '\n' << 
                "Real answers:      " << real_answers << "\n\n";
    }
}

Student: ABC12345
Student's answers: TTFFTTFF FFFFTTTTTTF
Real answers:      TFTFTFTFTFTFTFTFTFTF

Student: student2
Student's answers: T  F
Real answers:      FFFF

Student: student333
Student's answers:  FFTF 
Real answers:      TFFTFT

(I put newlines in the file, but it works even without newlines in the file)
Last edited on Apr 5, 2020 at 9:31pm
Apr 5, 2020 at 9:29pm
#include <iostream>
#include <string>
#include <sstream>
#include <fstream>
#include <vector>
using namespace std;

struct Test
{
   string correct;
   string id;
   string response;
};

istream & operator >> ( istream &in, Test &test )
{
   string line;
   char space;
   
   getline( in, line );
   stringstream ss( line );
   ss >> test.correct >> test.id;
   ss.get( space );
   getline( ss, test.response );
   test.response.resize( test.correct.size() );
   return in;
}

int main()
{ 
   istringstream in( "TFTFTFTFTFTFTFTFTFTF ABC12345 TTFFTTFF TTTTTTTTTTF\n"
                     "TFTFTFTFTFTFTFTFTTTT DEF67890  TFFTTFF TTTTTTTTTTF\n"
                     "TFTFTFTFTFTFTFTFFFFF GHIabcde TTFFTTFF TTTTTTTTTT\n" );

   vector<Test> scripts;
   for ( Test t; in >> t; ) scripts.push_back( t );

   for ( Test t : scripts )
   {
       cout << "Point of truth: " << t.correct  << '\n';
       cout << "Student id:     " << t.id       << '\n';
       cout << "Student answer: " << t.response << "\n\n";
   } 
}


Point of truth: TFTFTFTFTFTFTFTFTFTF
Student id:     ABC12345
Student answer: TTFFTTFF TTTTTTTTTTF

Point of truth: TFTFTFTFTFTFTFTFTTTT
Student id:     DEF67890
Student answer:  TFFTTFF TTTTTTTTTTF

Point of truth: TFTFTFTFTFTFTFTFFFFF
Student id:     GHIabcde
Student answer: TTFFTTFF TTTTTTTTTT
Last edited on Apr 5, 2020 at 9:35pm
Apr 5, 2020 at 9:31pm
There is supposed to be a blank between student ID and student answers. I am dealing with those blanks between student ID and studentAnswers by using ignore(1).

I essentially did do what Ganado said in the code I posted, reading in the next 20 characters for studentAnswers via a for loop. (Since I know that each studentAnswer string will be 20 characters long.)

However, as you see in the output, this ends up not including the ' ' character in the string and mysteriously appends an additional 'F' at the end. So I get a 20-character string, but with no space and with a superfluous 'F'.

Is there a way to use getline() to only acquire the next 20 characters? It seems like it shouldn't be quite this tricky to just read the next however-many characters into a string, no matter what they are.

Apr 5, 2020 at 9:34pm
"Is there a way to use getline() to only acquire the next 20 characters?"
Yes, you can use: http://www.cplusplus.com/reference/istream/istream/getline/

Or you can just loop 20 times and use file.get(my_char);
The latter is a bit more flexible.
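
As a third option that was not mentioned above, the unformatted member istream::read() can pull a fixed number of characters in one call, and gcount() reports how many actually arrived. The sketch below is only an illustration under that assumption; the helper name readFixed is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read exactly `count` characters, whatever they are.
// Returns true on success; on a short read `out` is left empty.
bool readFixed(std::istream& in, std::size_t count, std::string& out)
{
    std::string buffer(count, '\0');
    // read() is unformatted input, so blanks are stored rather than skipped
    in.read(&buffer[0], static_cast<std::streamsize>(count));
    if (in.gcount() != static_cast<std::streamsize>(count))
    {
        out.clear();   // fewer characters than requested were available
        return false;
    }
    out = buffer;
    return true;
}
```

Calling readFixed(file, 20, studentAnswers) after skipping the separator would then stand in for the 20-iteration loop, with the return value telling you whether all 20 characters were really there.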

(Try running mine or lastchance's code.)
Last edited on Apr 5, 2020 at 9:39pm
Apr 5, 2020 at 11:45pm
I already answered you about the extraneous 'F'. It is not mysterious. It is left over from the previous loop because you didn't stop processing when your attempt to get input failed.
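
To make that concrete, here is a small sketch of the two behaviours side by side. It assumes the input runs out before the requested number of characters has been read; readUnchecked and readChecked are names invented for this example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Mimics the third for-loop of the original program: `count` extractions
// with >>, with no check on whether each one succeeded.
std::string readUnchecked(std::istream& in, int count)
{
    std::string result;
    char c = '?';
    for (int i = 0; i < count; i++)
    {
        in >> c;       // skips whitespace; on failure c keeps its old value
        result += c;   // appended even when the extraction failed
    }
    return result;
}

// Same idea, but get() does not skip blanks and the loop stops
// as soon as a read fails.
std::string readChecked(std::istream& in, int count)
{
    std::string result;
    char c;
    for (int i = 0; i < count && in.get(c); i++)
    {
        result += c;
    }
    return result;
}
```

On the five characters "TF TF", asking readUnchecked for 5 gives "TFTFF": the blank is skipped, only four real characters exist, and the fifth (failed) read re-appends the stale 'F'. readChecked gives "TF TF". The same mechanism would account for the 20-character string with no blank and a doubled 'F' in the first post.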
Apr 6, 2020 at 3:20am
Unfortunately lastchance's code is not immediately useful to me, since it uses a lot of stuff that I just haven't seen yet as a beginner. I will hang on to it though, to look at new things in a context I understand.

I ran ganado's code and it compiles and runs just fine. It doesn't do what I want it to do, however. When I run it on the following input

TFTFTFTFTFTFTFTFTFTF ABC12345 TTFFTTFF TTTTTTTTTTF ABC67890 TTFTTFTFTFTFTFTF TFT

I get the following output:


Student: ABC12345
Student's answers: TTFFTTFF TTTTTTTTTTF
Real answers:      TFTFTFTFTFTFTFTFTFTF

Student: TTFTTFTFTFTFTFTF
Student's answers: TFTTTTTT
Real answers:      ABC67890


One problem seems to be the while condition. After the student's answers are read in, the while condition reads the student ID into the real answers string. The real answers string is only to be read in the one time, though.

I will work with ganado's code. I think I can make it work. There are a few things in there that I'm not familiar with. E.g. I'm not familiar with opening a file by using a file function. I have only seen ifstreamObject.open("fileName.xxx"). I also haven't seen this

int num_questions = (int)real_answers.length();


Why is real_answers.length() prefixed with (int) there?
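
For what it's worth, a short sketch of both unfamiliar pieces. As far as I know, length() returns an unsigned type (std::string::size_type), so the (int) is a cast that converts it to a signed int, typically to avoid signed/unsigned comparison warnings in the loop; and passing the file name to the ifstream constructor should be equivalent to declaring the stream and calling open() afterwards. The function names questionCount and opensSameWay are hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Returns the number of questions as an int, the way the (int) cast does.
int questionCount(const std::string& realAnswers)
{
    // length() returns std::string::size_type, an unsigned integer type.
    // static_cast<int> is the C++ spelling of the C-style (int) cast.
    return static_cast<int>(realAnswers.length());
}

// Opens a file in the two ways discussed and reports whether both agree.
bool opensSameWay(const std::string& fileName)
{
    std::ifstream viaConstructor(fileName);   // open while constructing
    std::ifstream viaOpen;
    viaOpen.open(fileName);                   // declare first, open afterwards
    return viaConstructor.is_open() == viaOpen.is_open();
}
```

Either way of opening leaves you with the same kind of stream object, so the rest of the reading code does not change.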

Thanks everyone for your help and for bearing with me. Learning C++ is going well, but reading data in from files is definitely my biggest struggle so far.
Last edited on Apr 6, 2020 at 3:22am
Apr 6, 2020 at 3:34am
Ok, I have made just a couple amendments to ganado's code. I moved the read-in to real_answers outside the loop so that it would only execute once, and changed the while loop to a for loop iterating over the number of students. (Disadvantage to that, I suppose, is that it assumes foreknowledge of the number of students. Perhaps I could do a while loop conditioned on !infile.eof()?)

// Example program
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
using namespace std;

int main()
{    
    
    int student_count = 2;   
    ifstream file("testScoresTrial.txt");
    
    // ASSUMES that length(actual answers) == length(student's responses, including spaces)
    // ASSUMES there is only 1 whitespace character between student id and start of student responses

    string real_answers;
    file >> real_answers; 
    
    for (int i = 0; i < student_count; i++)
    {
        string student_id;
        file >> student_id;
        
        char space;
        file.get(space);
        
        string student_answers;
        
        int num_questions = (int)real_answers.length();
        
        for (int j = 0; j < num_questions; j++)
        {
            char student_answer;
            file.get(student_answer);     
            student_answers += student_answer;
        }
        
        cout << "Student: " << student_id << '\n' <<
                "Student's answers: " << student_answers << '\n' << 
                "Real answers:      " << real_answers << "\n\n";
    }
}


On the input

TFTFTFTFTFTFTFTFTFTF ABC12345 TTFFTTFF TTTTTTTTTTF ABC67890 TTFTTFTFTFTFTFTF TFT

the output is

Student: ABC12345
Student's answers: TTFFTTFF TTTTTTTTTTF
Real answers:      TFTFTFTFTFTFTFTFTFTF

Student: ABC67890
Student's answers: TTFTTFTFTFTFTFTF TFT
Real answers:      TFTFTFTFTFTFTFTFTFTF


which was my goal.
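
On the question of not knowing the number of students in advance: rather than testing !infile.eof(), one possibility is to let the read of the next student ID control the loop, so it ends when no ID is left. This is only a sketch under the same two assumptions as the code above (one space after the ID, answers exactly as long as the key); StudentRecord and readAll are invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for this sketch.
struct StudentRecord
{
    std::string id;
    std::string answers;
};

// Reads the answer key once, then student records until a read fails.
// ASSUMES one space between ID and answers, and answers as long as the key.
std::vector<StudentRecord> readAll(std::istream& in, std::string& realAnswers)
{
    std::vector<StudentRecord> students;
    if (!(in >> realAnswers)) return students;   // no key, nothing to do

    std::string id;
    while (in >> id)                             // stops when no ID is left
    {
        in.get();                                // the single separating space
        std::string answers;
        char c;
        for (std::size_t i = 0; i < realAnswers.length() && in.get(c); i++)
        {
            answers += c;                        // blanks are kept
        }
        if (answers.length() != realAnswers.length()) break;  // truncated record
        students.push_back({id, answers});
    }
    return students;
}
```

The size of the returned vector then tells you how many complete student records were found, so student_count no longer has to be hard-coded.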

Last edited on Apr 6, 2020 at 3:37am