Hi, I have this homework assignment (I don't want you to solve it for me, I just need some hints):
I'm given an ASCII file and I have to parse it, check it for validity, and then do some simple manipulation of the tokens.
The correct format is
|
teacher: subject_1, subject_2, subject_n
|
with an arbitrary amount of whitespace between the tokens and the separators. To make it a little more complicated, there are also comment lines, which have to start with a # sign (a comment line can't start with whitespace).
For instance, all these lines are incorrect:
|
john math, history
john: math hist ory
john:: math, history
john: math: history
john: math,,history
: john, math, history
  # comment
|
while for instance these lines are correct:
|
john : math ,history
#comment
# co mm ent
|
(All the names and subjects are of course completely arbitrary alphanumeric strings without whitespace.)
I have no problem with what I have to do with the data once it's parsed; it's the parsing and validity checking that I find hard. So far I read the file line by line with getline(ifstream, line) and then go through each line char by char in a for loop, keeping lots of auxiliary variables along the way. It's almost 200 lines of terrible spaghetti code, and the worst part is that it works correctly for only about 85% of inputs and it's very hard to find the bugs in it (because it's so terrible).
I know this is something I seriously need to improve at: I've had trouble with parsing and validity checking in some of the previous homework as well, and it's getting progressively more painful.
So what I really need are some hints. Are there any functions that could help? (I can only use the C++98 standard library.) Are there any algorithms that could be helpful? Any resources or pieces of good code I could read for inspiration?
Thank you.