C++ getting better at parsing more complicated text input

Hi, so I have this homework (and I don't want you to solve it for me, just need some hints):
I'm given an ASCII file and I have to parse it, check for validity and then make some simple manipulations with the tokens.
The correct format is
 
teacher: subject_1, subject_2, subject_n

with an arbitrary amount of whitespace between the tokens and separators. And to make it a little more complicated, there are also comment lines, which have to start with a # sign (comments can't start with whitespace).
For instance, all these lines are incorrect:
john math, history
john: math hist ory
john:: math, history
john: math: history
john: math,,history
: john, math, history
   # comment 

while for instance these lines are correct:
  john   :   math    ,history       
#comment
# co  mm ent 

(All the names and subjects are of course completely arbitrary alphanumeric strings without whitespace).

I have no problem with what I have to do with the data after I parse it; it's the parsing and checking for validity that I find hard. So far I read line by line with getline(ifstream, line) and then go through each line char by char with a for loop, keeping lots of auxiliary variables along the way. It's almost 200 lines of terrible spaghetti code, and the worst part is that it works correctly for only about 85% of inputs and it's very hard to find bugs in it (because it's so terrible).

I know this is something I seriously need to improve at - I've had some trouble with parsing and checking for validity in some of the previous homework as well, but it's getting progressively more painful.
So, what I really need are some hints - are there any functions that could help? (I can only use the standard library of C++98.) Are there any algorithms that could be helpful? Any resources or pieces of good code I could read for inspiration?

Thank you.
Write a function that splits up the line you have read.
Use a string stream (std::stringstream) and std::string to do the splitting. That way you don't have to work character by character, only on a vector (or similar container) holding a few words.
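
To make the hint concrete, here is a minimal C++98 sketch of that idea, not a finished solution. The names trim, split, isToken, parseLine and LineKind are made up for illustration. It also assumes two things the task description does not state: a teacher needs at least one subject, and blank lines count as invalid.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Remove leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Split s on delim, keeping empty fields so that "a,,b" yields 3 pieces.
std::vector<std::string> split(const std::string& s, char delim) {
    std::vector<std::string> parts;
    std::istringstream in(s);
    std::string piece;
    while (std::getline(in, piece, delim)) parts.push_back(piece);
    // getline drops a trailing empty field ("a," gives only "a"); restore it.
    if (!s.empty() && s[s.size() - 1] == delim) parts.push_back("");
    return parts;
}

// A token is a non-empty string of alphanumeric characters only.
bool isToken(const std::string& s) {
    if (s.empty()) return false;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (!std::isalnum(static_cast<unsigned char>(s[i]))) return false;
    return true;
}

enum LineKind { LINE_COMMENT, LINE_RECORD, LINE_INVALID };

// Classify one line; teacher and subjects are filled only for LINE_RECORD.
LineKind parseLine(const std::string& line, std::string& teacher,
                   std::vector<std::string>& subjects) {
    // A comment must have '#' as its very first character.
    if (!line.empty() && line[0] == '#') return LINE_COMMENT;

    // Exactly one ':' means exactly two pieces.
    std::vector<std::string> halves = split(line, ':');
    if (halves.size() != 2) return LINE_INVALID;

    std::string name = trim(halves[0]);
    if (!isToken(name)) return LINE_INVALID;

    // Assumption: at least one subject is required.
    std::vector<std::string> fields = split(halves[1], ',');
    if (fields.empty()) return LINE_INVALID;

    std::vector<std::string> result;
    for (std::vector<std::string>::size_type i = 0; i < fields.size(); ++i) {
        std::string subject = trim(fields[i]);
        if (!isToken(subject)) return LINE_INVALID;  // empty or contains spaces
        result.push_back(subject);
    }
    teacher = name;
    subjects = result;
    return LINE_RECORD;
}
```

You would call parseLine once for every line returned by std::getline on the file stream. Because each rule (one colon, non-empty tokens, no inner whitespace) is checked on small trimmed pieces, a failing input can be traced to a single check instead of a tangle of state flags.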

Is the C++98-only restriction imposed by someone else, or is it your own choice?
Most C++11-capable compilers are free of charge: the Express editions of Visual Studio on Windows, and GCC and Clang on UNIX-like OSs.