Logic problem with parsing input file

I have to parse an input file. The file can contain comments which, much like in shell scripts, start with a '#'. Blank lines should also be ignored. Two functions are used, size=n,n and searchFor=word. At least size must be parsed before parsing the puzzle. All other lines are puzzle lines, for example:
# this is a comment
size=15,9   # size of a puzzle

searchFor=apple  # search for this word in the puzzle
# the following is puzzle

appappleasdasqw
fsdgeagapplevqw
sdgebepsdfgpltr
aeherhadefgqaio
aegfaegreargeio
onwpeoappledfui
afgergegregffvb   # this puzzle is pretty nonsensical
fsdgeagapplevqw
sdgebepsdfgpltr

# but I couldn't be bothered making a good one. 


The input file can contain errors and we are supposed to handle these appropriately. However, puzzles will contain printable chars only (no whitespace), with the exception of '.' and '#'. Obviously we need the size to create the array dynamically before processing the puzzle, but searchFor can come after the puzzle. We must construct, solve, and destroy one puzzle at a time; puzzles will not be stored in memory (I'm guessing this means something other than a dynamic char **), and we may not process the input file more than once.

My problem is the logic I use to tell the different kinds of line apart. I tried using find() as a conditional, which didn't seem to work, but I do need some kind of if/else if chain.
My current solution still picks up size= and searchfor= lines when it processes puzzle lines.
Does anyone have any ideas what I can use as a conditional?
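#include <cctype>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// helpers, defined further down
bool parseWord(string line, string& word);
bool parseSize(string line, int& row, int& col);
void convert2Upper(string& str);
bool gotInteger(const string& str, int& i);

void processFile(string infile)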
{
   string line="", tmp_line="", word="";
   bool gotSize=false, gotWord=false;
   size_t found;
   string size="SIZE=";
   string search="SEARCHFOR=";
   int row=0, col=0;

   ifstream ifs(infile.c_str());
   if (!ifs.is_open())
   {
      cout << "Error opening file" << endl;
      exit (1);
   }
   // else ->
   int line_count=0;
   while (getline(ifs,line))  // stops when a read fails; testing eof() first loops once too often
   {
      cout << ++line_count << "\n";
      if (line == "" || line[0] == '#')
      {
         ;; //== line is comment, will be ignored ==// 
      }
      else
      {
         //== function line or puzzle line ==//
         tmp_line = line;
         convert2Upper(tmp_line);
         found = tmp_line.find(size);  // find SIZE= in line
         if (found!=string::npos)
         {
            cout << "Found size= at: " << int(found);
            gotSize = parseSize(line,row,col);
            // gotSize is true if both row and col contain numbers greater than 1
         }

         found = tmp_line.find(search);  // find SEARCHFOR= in line
         if (found!=string::npos)
         {
            cout << "Found searchFor= at: " << int(found);
            gotWord = parseWord(line,word);
         }

         // else we got a puzzle line, check we have a size
         if (gotSize)
         {
            string tmp_puzLine;
            found = line.find_first_of(" .#\t\n");
            tmp_puzLine = line.substr(0,found);
            cout << "This is a puzzle line: " << tmp_puzLine << endl;
         }
         /*
         Unfortunately the above also finds size=n,n and searchfor=word
         These are things I want it to ignore and only grab puzzle lines.
         */
      }
      cout << endl;
   }
   ifs.close();
}

bool parseWord(string line, string& word)
{
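   size_t start = line.find('=');
   // find_first_of() stops at ANY of these chars; find() would look for the whole sequence
   size_t end = line.find_first_of(" #\t\n", start);

   // no terminator found: take everything up to the end of the line
   size_t len1 = (end == string::npos) ? string::npos : end - start - 1;
   word = line.substr(start+1,len1);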

   if (word.size()>1) return true;
   else
   {
      cout << "Word is too small to play with" << endl;
      return false;
   }
}

bool parseSize(string line, int& row, int& col)
{
   int size_pos[3]={ 0 };
   string strRow, strCol;

   size_pos[0]=line.find('='); // indexes
   size_pos[1]=line.find(',');
   size_pos[2]=line.find_first_of(" #\t\n"); // any one of these ends the number; find() would need the whole sequence

   int len1 = size_pos[1] - size_pos[0] - 1;
   strRow = line.substr(size_pos[0]+1,len1);

   int len2 = size_pos[2] - size_pos[1] - 1;
   strCol = line.substr(size_pos[1]+1,len2);

   if (gotInteger(strRow,row))  // convert string to int
   {
      if (gotInteger(strCol,col))  // convert string to int
      {
         if (row > 1 && col > 1)
            return true;
         else
         {
            cout << "Puzzle size too small to play with: "
                 << row << " * " << col << endl;
            return false;
         }
      }
      else
      {
         cout << "Cannot parse Column number." << endl;
         return false;
      }
   }
   else
   {
      cout << "Cannot parse row number." << endl;
      return false;
   }
}

void convert2Upper(string& str)
{
   for (unsigned int i=0; i<str.size(); i++)
      str[i]=toupper(str[i]);
}

bool gotInteger(const string& str, int& i)
{
   istringstream ss(str);
   return ss >> i ? true : false;
}


Sample output for the above would be:

1

2
   Found size= at: 0  : This is a puzzle line: size=15,9
3

4
   Found searchFor= at: 0 : This is a puzzle line: searchfor=apple
5

6

7
   This is a puzzle line: appappleasdasqw
8
   This is a puzzle line: fsdgeagapplevqw
9
   This is a puzzle line: sdgebepsdfgpltr
10
   This is a puzzle line: aegfaegreargeio
11
   This is a puzzle line: aegfaegreargeio
12
   This is a puzzle line: onwpeoappledfui
13
   This is a puzzle line: afgergegregffvb
14
   This is a puzzle line: fsdgeagapplevqw
15
   This is a puzzle line: sdgebepsdfgpltr
16

17

Are you making a script interpreter?

searchFor=apple # search for this word in the puzzle
I think you should check each character of each line. For example, the comment here comes after the function, so when you find the # character, that's the time to call getline() again and proceed to the next line of the file.

Uhm, that's my idea.

Edit:
My idea is to read the file one line at a time, then examine each character in that line:

1. Set a variable (let's call it strbuf) to hold the puzzle text for each line.

2. If the next character is a valid character, add it to strbuf; otherwise output an error message.

3. If the next character is a #, process what you have obtained in strbuf, then ignore the remaining characters and move on to the next line of the file.

4. If the next character is a =, then whatever is to the left of it should be a function; do something about it.

5. Repeat.

~I hope that helps.
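
A rough sketch of that character-by-character idea, in case it helps to see it written out. The names Scanned and scanLine are made up for illustration, and it assumes a line has no leading whitespace and that the first whitespace character ends the useful part of the line:

```cpp
#include <string>

// Hypothetical result of scanning one line character by character.
struct Scanned
{
   std::string name;   // text left of the first '=', empty if there is none
   std::string text;   // text right of '=', or the whole token if no '='
};

Scanned scanLine(const std::string& line)
{
   Scanned out;
   bool seenEquals = false;
   for (std::string::size_type i = 0; i < line.size(); ++i)
   {
      char ch = line[i];
      if (ch == '#') break;                              // rest of line is a comment
      if (ch == ' ' || ch == '\t' || ch == '\r') break;  // token ends at whitespace
      if (ch == '=' && !seenEquals)                      // left side becomes the name
      {
         seenEquals = true;
         out.name = out.text;
         out.text.clear();
         continue;
      }
      out.text += ch;                                    // collect the character (strbuf)
   }
   return out;
}
```

The caller would still have to check that name really is size or searchfor (ignoring case). If it is anything else, the line was a puzzle row that happened to contain '=', and name + "=" + text has to be put back together.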
Yeah, not really. As you can see, I can already find the size and searchfor functions; the problem is that when I read the puzzle lines I'm dragging those in as well.

This is driving me crazy...
To make this even more complicated, if the input file gives me a size, then a puzzle, then another size, puzzle and search word, the first puzzle needs to be discarded with an appropriate error message.

I could even add a check like: string[0]=='s', and string[1]=='i' || string[1]=='e', and string[2]=='z' || string[2]=='a', and string[3]=='e' || string[3]=='r', and string[4]=='=' || string[4]=='c', and string[5]=='h', and string[6]=='f', and string[7]=='o', and string[8]=='r', and string[9]=='='

Boolean variables can track whether I have gotSize, gotWord, and gotPuzzle. What I'm not entirely sure about is how to know when I have reached the end of the puzzle input. How and when do I set gotPuzzle?
This would be so much easier if I could put the find into a conditional:

if (line[0]=='#'||line=="")
   ...
else if (line.find("size=") == 0)        // find() returns a position, so compare it explicitly
   ...
else if (line.find("searchfor=") == 0)
   ...
else  // line is puzzle line... 
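
For what it's worth, std::string::find returns a position rather than a bool, so used bare in a condition a match at index 0 converts to false and "not found" (string::npos) converts to true. A minimal sketch of a prefix test that can go straight into an if/else if chain, assuming the functions must start in the first column and that case does not matter. LineKind, startsWithNoCase and classify are names invented for this example:

```cpp
#include <cctype>
#include <string>

enum LineKind { COMMENT, SIZE, SEARCHFOR, PUZZLE };  // hypothetical names

// True if 'line' begins with 'prefix', ignoring upper/lower case.
bool startsWithNoCase(const std::string& line, const std::string& prefix)
{
   if (line.size() < prefix.size()) return false;
   for (std::string::size_type i = 0; i < prefix.size(); ++i)
   {
      if (std::toupper(static_cast<unsigned char>(line[i])) !=
          std::toupper(static_cast<unsigned char>(prefix[i])))
         return false;
   }
   return true;
}

// Decide what kind of line this is; blank lines count as comments.
LineKind classify(const std::string& line)
{
   if (line.empty() || line[0] == '#')       return COMMENT;
   if (startsWithNoCase(line, "size="))      return SIZE;
   if (startsWithNoCase(line, "searchfor=")) return SEARCHFOR;
   return PUZZLE;
}
```

Because only the start of the line is tested, a puzzle row that merely contains "size" somewhere in the middle still comes out as PUZZLE.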


I have boundaries for the size, so maybe I need to add a counter for the puzzle lines and only process n of them. But that still brings me back to my previous problem: how to distinguish between size=, searchfor= and a puzzle line.

Edit: keep in mind that the word "size" or "search" may also be used as the search word, so if I just look for those I may end up disregarding huge chunks of the puzzle.
OK, at least that discussion prompted some thought... what if I do it like this:

tmp_line = line;
convert2Upper(tmp_line);

if (line=="" || line[0]=='#')
{
   // == Line is Comment == //
}
else if (tmp_line.compare(0, 5, "SIZE=") == 0)  // starts with size= (tmp_line is the upper-case copy)
{
   // == Line is size, parse as normal and set value == //
}
else if (tmp_line.compare(0, 10, "SEARCHFOR=") == 0)  // starts with searchfor=
{
   // == line is word, parse as normal and set value == //
}
else
{
   // == line is definitely puzzle line == //
}


That seems to me like the only way I will be able to do it. In the else branch I'll make sure I have a size and only work within those bounds: substr each line up to " .#\t\n", n times, and convert. Once that is done I can check whether I have gotSize and gotWord. For each of those I'll also have to check that it isn't already set to true; if it is, I can output an appropriate error message, e.g. ":Error: lineCount: puzzle has already been given a size, starting next puzzle."
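
The bookkeeping described here, including the "when do I set gotPuzzle" question, could look something like the sketch below. It is only one possible arrangement: PuzzleState and the function names are invented for illustration, and it assumes the number of puzzle rows is known from size=.

```cpp
#include <string>

// Hypothetical bookkeeping for one puzzle at a time.
struct PuzzleState
{
   bool gotSize;
   bool gotWord;
   int  rows;      // number of puzzle lines expected, taken from size=
   int  rowsRead;  // puzzle lines accepted so far
   PuzzleState() : gotSize(false), gotWord(false), rows(0), rowsRead(0) {}
};

// Call for each valid size= line. Returns false if an unfinished puzzle
// was pending, so the caller can print the "already been given a size" error.
bool beginPuzzle(PuzzleState& st, int rows)
{
   bool clean = !st.gotSize;
   st = PuzzleState();      // discard whatever was collected before
   st.gotSize = true;
   st.rows = rows;
   return clean;
}

// Call for each puzzle line. Returns false if the line must be rejected
// (no size yet, or more lines than size= allows).
bool acceptPuzzleLine(PuzzleState& st)
{
   if (!st.gotSize || st.rowsRead >= st.rows) return false;
   ++st.rowsRead;
   return true;
}

// gotPuzzle is not a separate flag: it holds once all rows have arrived.
bool gotPuzzle(const PuzzleState& st)
{
   return st.gotSize && st.rowsRead == st.rows;
}

bool readyToSolve(const PuzzleState& st)
{
   return gotPuzzle(st) && st.gotWord;
}
```

After a puzzle has been solved and destroyed, the caller would reset the state with st = PuzzleState() so that the next size= line is not reported as an error.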
I guess you didn't get my idea... though I admit it's my fault, maybe my English is bad.

The code you posted above is not what I have in mind.
I got your idea, I just didn't think it would work, as the puzzle can contain any char except '.', whitespace and '#'... So that means it can include "=,&*()[]$@!^+-_{};:/\ a-z A-Z 0-9" etc.

2. If the next character is a valid character, add it to strbuf; otherwise output an error message.

This means the functions size= and searchfor= would get added to strbuf...

3. If the next character is a #, process what you have obtained in strbuf, then ignore the remaining characters and move on to the next line of the file.

I pretty much do this in my functions parseSize() & parseWord() and when getting lines for the puzzle. The output in the first post is actually valid (except lines 2 and 4, which are WRONG; they are not part of the puzzle). I had no problem ignoring comments; the hard part was figuring out the difference between functions and puzzle lines, as realistically my puzzles can contain the function names as chars... and my teacher, being as pedantic as he is, will catch each and every error and break our program in every way possible.


4. If the next character is a =, then whatever is to the left of it should be a function; do something about it.

This can however also be part of the puzzle.


Pretty much the only thing that can't be in the puzzle lines is size= and searchfor=, which I'm guessing is exactly what he will try in one of the test cases. But I can't see any way around this... I just have to accept that it cannot be avoided, and that the code is slightly flawed because of it.
this means functions size= and searchfor= would get added to strbuf...

Yes, but if you find '=' then pop the word 'size' or 'searchfor' from strbuf.

This can however also be part of the puzzle.

I didn't know it was part of the puzzle. In that case, if you find =, check whether the word 'size' or 'searchfor' is to its left.


So your problem is how to find the word 'apple'? Is that right?
Couldn't you just read each line, using getline, erase anything including and beyond a '#', and then process the remaining line? The approach above seems to want to step through each line from left to right, which requires it to take all possible scenarios into account.
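
A minimal sketch of that suggestion, assuming (as the assignment states) that '#' never appears inside a puzzle row. stripLine is a made-up name, and the whitespace trimming is an extra that the post above does not ask for:

```cpp
#include <string>

// Remove a trailing comment and surrounding whitespace from one input line.
std::string stripLine(std::string line)
{
   std::string::size_type hash = line.find('#');
   if (hash != std::string::npos)
      line.erase(hash);                       // drop '#' and everything after it

   const char* ws = " \t\r\n";
   std::string::size_type last = line.find_last_not_of(ws);
   if (last == std::string::npos)
      return "";                              // nothing but whitespace is left
   line.erase(last + 1);                      // trim the right-hand side
   line.erase(0, line.find_first_not_of(ws)); // trim the left-hand side
   return line;
}
```

After getline(), calling stripLine(line) turns comment-only and blank lines into an empty string, so the rest of the loop only has to deal with size=, searchfor= and puzzle rows.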
Yeah, there are a few ways to do this, I guess. I have pretty much figured it out now; on to the hard part... :s
So what's the main problem again?
Solving the puzzle in every direction.
What do you mean by solving it? What are the rules?
I didn't mean to say you were wrong earlier, simply that I couldn't see it working. But you're right, you could do it that way; you could also do it the way I have done, or the way moorecm mentioned.
I know how I'm going to do this next part. Basically the two functions are size and searchfor, and then we are given a puzzle. The size is the size of the char *; it must be a dynamic char *. We are not allowed to use the string or cstring classes to solve the puzzle, although we can when parsing the input file. The word is the word to find in the puzzle in every direction: north, NE, east, SE, south, SW, west, NW. The output must show the input puzzle, then the solved puzzle using dots for all non-word chars, e.g.:

# hello
size=6,5

five*f
liedwi
nmvwrv
q^iece
bpf+jk
seaRchFoR=five


five.f
.ie..i
..v..v
..ie.e
..f...
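
For anyone reading later, here is one way the eight-direction search could be sketched without string or cstring. It is not the poster's solution. It assumes the puzzle is stored row by row in a single flat char array of rows*cols characters, that solved has been pre-filled with '.', and that letters are matched case-sensitively. solvePuzzle is an invented name.

```cpp
// Copy every occurrence of 'word' from 'grid' into 'solved', searching
// north, NE, east, SE, south, SW, west and NW from every cell.
void solvePuzzle(const char* grid, char* solved, int rows, int cols,
                 const char* word)
{
   // Row/column steps for N, NE, E, SE, S, SW, W, NW.
   static const int dr[8] = { -1, -1, 0, 1, 1,  1,  0, -1 };
   static const int dc[8] = {  0,  1, 1, 1, 0, -1, -1, -1 };

   int len = 0;
   while (word[len] != '\0') ++len;           // length without <cstring>
   if (len == 0) return;

   for (int r = 0; r < rows; ++r)
      for (int c = 0; c < cols; ++c)
         for (int d = 0; d < 8; ++d)
         {
            // Skip directions where the last letter would fall off the grid.
            int endR = r + dr[d] * (len - 1);
            int endC = c + dc[d] * (len - 1);
            if (endR < 0 || endR >= rows || endC < 0 || endC >= cols)
               continue;

            int k = 0;
            while (k < len &&
                   grid[(r + dr[d] * k) * cols + (c + dc[d] * k)] == word[k])
               ++k;
            if (k < len) continue;            // mismatch somewhere along the way

            for (k = 0; k < len; ++k)         // copy the match into the result
            {
               int idx = (r + dr[d] * k) * cols + (c + dc[d] * k);
               solved[idx] = grid[idx];
            }
         }
}
```

With the example above (5 rows of 6 chars, word "five") this marks the east, south-east, south and north occurrences and leaves dots everywhere else. Both arrays can be allocated with new char[rows * cols] after size= is parsed and released with delete[] once the solved puzzle has been printed.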