Lexical Analyzer

Apr 23, 2012 at 4:44pm
Hello all I am studying "Data structures and algorithms" these days at college. As per suggested by most senior fellows, I am thinking about proposing "lexical analyzer" as my term project. Although I am yet to study it in detail.

Before I propose my project can you guys please fill me up with all possible data structures which will be required to design this project.

And what is it's difficulty level at "Data structures and algorithms" level?

Also provide me with any useful resource/books if you got.
I'll do my research too, but it's always good to discuss with someone who knows the business inside out.

Looking forward to your response, many thanks.
Apr 23, 2012 at 6:30pm
How sophisticated do you intend to make this? Is it going to analyze a text file, or the syntax of something like code? This really all depends on how specific the parser would be and would in turn determine the data structures as well as difficulty level.

Are you wanting resources for how to design Lexical Analyzers in C++, or just general information about them?
Apr 23, 2012 at 6:48pm
all possible data structures


With something like a lexical analyzer, you can probably find useful applications for any of them.

You could look at the flex/flex++/flexc++ source code for inspiration.
Apr 23, 2012 at 7:30pm
cantide5ga I am looking forward to build a analyzer which analyzes C++ code for me but on efficient basis. I'll appreciate all sorts of resource regarding design as well as the general information thanks.

hanst99 Thanks for the reply, can you please mark it's difficulty level on a scale of 10. For a "Data structures and algorithms" student?
Apr 23, 2012 at 8:04pm
Lol. Good luck with that. To analyze C++ code? That would require a lot more knowledge than just data structures and algorithms.
Apr 23, 2012 at 8:09pm
Thanks for the reply, can you please mark it's difficulty level on a scale of 10. For a "Data structures and algorithms" student?


Nope, I can't. Building lexical analyzers for one specific task manually is usually a lot more busywork than actually being hard though.

Just to make sure - we are talking about the same kind of lexical analyzers here? The one that you feed some text and which then spits out a list of tokens?
Apr 23, 2012 at 8:13pm
@Vlykarye

Really? I mean, lots of symbols to parse, some with double meanings depending on use. Other then consuming a lot of time, I figured structs and algorithms is all there is to it.

How would you go about it?
Apr 23, 2012 at 8:26pm
I wouldn't go about it. I would have to agree with hanst here. I don't rightly know what a lexical analyzer is. But I bet he does, and from what he says, it sounds like a lot of work.
Apr 23, 2012 at 8:37pm
The OP is most likely speaking of parsing a document into tokens to check for syntactical correctness. It is tedious, but not necessarily difficult. Hopefully they get back.
Apr 23, 2012 at 8:37pm
http://en.wikipedia.org/wiki/Lexical_analysis

Essentially, a lexical analyzer would read something like this:

 
abc = 3 + i;


and spit out something like:


IDENTIFIER(abc), ASSIGNMENTOPERATOR, INTEGER(3), PLUSOPERATOR, IDENTIFIER(i)


(roughly. Of course it depends on what kind of tokens you actually want to generate).

EDIT: Just noticed, the example is redundant because the article has pretty much the same one.
Last edited on Apr 23, 2012 at 8:38pm
Apr 23, 2012 at 8:42pm
That's interesting. Are they used exclusively for documentation? Or do compilers use lexical analysis?

... perhaps that is a dumb question. Obviously compilers use the same idea, but what is the main difference between a lexical analyzer and syntax checker?
Last edited on Apr 23, 2012 at 8:43pm
Apr 23, 2012 at 8:44pm
You run a lexical analyzer to generate input for a parser (that eats a series of tokens and assigns meaning to it). So yep, lexical analysis is part of any compiler (or interpreter for that matter). Sometimes there is no strict distinction between the lexical analysis and the parsing, but I think in most larger systems it is made.
Last edited on Apr 23, 2012 at 8:45pm
Apr 23, 2012 at 8:46pm
Oh ok. So do people normally use this phrase when talking about parsers and compilers?
Apr 23, 2012 at 8:48pm
Ah. Alright. I have been pretty bad about remembering what terms are used for what. So I'm trying to break that habit. And now I know what this is. Sounds very useful. I feel ashamed on wanting to make a small language without knowing this stuff ><
Apr 23, 2012 at 8:52pm
Don't be. I personally have the habit of using 'lexer' and 'parser' very loosely myself.
Topic archived. No new replies allowed.