how to code a tokenizer (for shunting yard algorithm)

how to code a tokenizer? any tutorials that explain the nitty-gritty of making one?
The wiki article gives an example in pseudocode. It should be straightforward to implement in C++. You basically need to know about std::string and std::stack.
https://en.wikipedia.org/wiki/Shunting-yard_algorithm#Detailed_example
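For what it's worth, here is a minimal sketch of that pseudocode in C++ (using std::stack, assuming space-separated tokens, only the operators + - * / ^ plus parentheses, and no error handling), printing the expression in postfix order:

#include <iostream>
#include <sstream>
#include <stack>
#include <string>

// Higher number binds tighter; ^ is treated as right-associative below.
int precedence(const std::string& op)
{
    if (op == "^") return 3;
    if (op == "*" || op == "/") return 2;
    if (op == "+" || op == "-") return 1;
    return 0;
}

bool isOperator(const std::string& t)
{
    return t == "+" || t == "-" || t == "*" || t == "/" || t == "^";
}

int main()
{
    std::string line;
    std::getline(std::cin, line);          // e.g. "3 + 4 * ( 2 - 1 )"
    std::istringstream iss(line);
    std::stack<std::string> ops;
    std::string token;

    while (iss >> token)
    {
        if (isOperator(token))
        {
            // Pop operators of higher precedence (or equal, for left-associative ones).
            while (!ops.empty() && isOperator(ops.top()) &&
                   (precedence(ops.top()) > precedence(token) ||
                    (precedence(ops.top()) == precedence(token) && token != "^")))
            {
                std::cout << ops.top() << ' ';
                ops.pop();
            }
            ops.push(token);
        }
        else if (token == "(")
            ops.push(token);
        else if (token == ")")
        {
            while (!ops.empty() && ops.top() != "(")
            {
                std::cout << ops.top() << ' ';
                ops.pop();
            }
            if (!ops.empty()) ops.pop();    // discard the "("
        }
        else
            std::cout << token << ' ';      // a number goes straight to the output
    }

    while (!ops.empty())                    // flush the remaining operators
    {
        std::cout << ops.top() << ' ';
        ops.pop();
    }
    std::cout << '\n';
}

Feeding it 3 + 4 * ( 2 - 1 ) prints 3 4 2 1 - * +.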
@thmm not exactly what i want, i need the tokeniser not the algorithm
If you know the algorithm you can code it yourself.
If you're looking for a ready-made one, just ask Google.
i know the algorithm, not a way to turn a string into tokens
It depends on how the tokens are separated. Assuming they are separated by spaces, you can do something like this:
#include <iostream>
#include <string>
#include <sstream>

int main()
{
    std::string input; 
    std::getline(std::cin, input);
    std::istringstream iss(input);
    std::string token;
    while(iss >> token)
    {
      // do something with the token, e.g. convert it to an integer
      std::cout << token << "\n";
    }
}
Before you tokenise, you need to know the syntax. What are valid tokens? What (if any) white-space is allowed, and when? If tokens do not have to be separated by white-space (e.g. is 12+456 legal, as opposed to 12 + 456 where white-space is used as a delimiter?), then you typically use a loop that inspects each char in turn and processes it as needed. If white-space is mandated, then something as simple as thmm's code above might suffice. Are you just dealing with numbers, operators and brackets?

In terms of actual C++ code, there are loads of examples on the Internet, so I won't include any here. Do some research.

Why do you want to do this? As part of a course? For homework? For your own interest? What resources re parsing/expression evaluation etc. are you currently using? Are you studying Shunting Yard specifically, or infix expressions in general, as there is more than one way to evaluate an expression?
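For illustration, a minimal sketch of the character-by-character loop described above, assuming the only tokens are unsigned integers, single-character operators and brackets, and that any white-space is simply skipped:

#include <cctype>
#include <iostream>
#include <string>
#include <vector>

// Splits an expression such as "12+456*(7-3)" into tokens,
// without requiring any white-space between them.
std::vector<std::string> tokenise(const std::string& input)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < input.size())
    {
        char c = input[i];
        if (std::isspace(static_cast<unsigned char>(c)))
        {
            ++i;                                   // skip white-space
        }
        else if (std::isdigit(static_cast<unsigned char>(c)))
        {
            std::string number;                    // collect consecutive digits
            while (i < input.size() && std::isdigit(static_cast<unsigned char>(input[i])))
                number += input[i++];
            tokens.push_back(number);
        }
        else                                       // single-character operator or bracket
        {
            tokens.push_back(std::string(1, c));
            ++i;
        }
    }
    return tokens;
}

int main()
{
    std::string input;
    std::getline(std::cin, input);                 // e.g. "12+456*(7-3)"
    for (const std::string& t : tokenise(input))
        std::cout << t << '\n';                    // one token per line
}

The resulting tokens can then be fed straight into the shunting yard loop; handling decimals or multi-character operators just means growing the relevant branch of the scanner.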
@seeplus for fun, shunting yard (since it's the easiest)