Correct, it does not validate. The smarter you make it, and the more complicated your language, the more work you will need to do to validate and interpret it. Allowing () adds complexity.
My answer would be "do not allow ()" or any other paired symbols. If you insist, you need a counter: when you see ( do paren++, and when you see ) do paren--. If it is not zero at the end of the line, it's bad, and if it ever goes negative, it's bad. (If you allow constructs to span lines, like {} in C++, you will probably need something much more substantial than a one-pass, mindlessly simple letter crawler.)
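#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    string ex = "read(\"filename\")";
    vector<string> toks;
    bool delims[256]{false};
    delims['\"'] = delims['('] = delims[')'] = true;
    int parenctr{};
    string tmp;
    for (unsigned char c : ex) // unsigned, so c is always a valid index into delims
    {
        if (delims[c])
        {
            parenctr += (c == '(');
            parenctr -= (c == ')');
            if (parenctr < 0) { cout << "error"; return 0; } // a ')' with no matching '('
            if (tmp.length())
                toks.push_back(tmp);
            tmp = "";
        }
        else tmp.push_back(c);
    }
    if (tmp.length()) toks.push_back(tmp); // keep a token that runs to the end of the line
    if (parenctr) { cout << "error"; return 0; } // it should be zero here, if each line is self contained
    for (auto &a : toks)
        cout << a << endl;
}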
To which the observant coder might say, "well, now it accepts read("fubar);"
... so you need a quote counter too, if you care. Windows accepts a lone, unclosed quote in many commands; it is just a token that tells cmd that a multi-word string is coming.
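A minimal sketch of that quote check, bolted onto the paren counter. The function name balanced is made up for illustration, and it assumes quotes cannot be escaped inside a string. Since quotes do not nest, a true/false toggle does the job of a counter.

```cpp
#include <string>

// Hypothetical helper, not part of the code above: returns true when every
// '(' has a matching ')' and every '"' is closed by the end of the line.
// Assumption: there is no escape syntax such as \" inside a quoted string.
bool balanced(const std::string& line)
{
    int parens = 0;        // '(' seen so far that are not yet closed
    bool inQuote = false;  // true while we are between a pair of '"'
    for (char c : line)
    {
        if (c == '"') { inQuote = !inQuote; continue; }
        if (inQuote) continue;                       // parens inside a string are just text
        if (c == '(') ++parens;
        if (c == ')' && --parens < 0) return false;  // ')' arrived before its '('
    }
    return parens == 0 && !inQuote;
}
```

With this, read("fubar) is rejected because the quote never closes, while read("filename") still passes. It only checks balance, so it says nothing about whether the line makes sense.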
It also accepts derpy () pairs, like (read()()(filename())), which may be gibberish in your new language, or valid, but this overly simple parser will certainly take it.
If you want to get really fancy, this may be a place for regex.
If you want to just get it done, maybe you should use a compiler-compiler or similar tool.
If you want to do it yourself using string manipulation from the ground up, AND you want it smart and rich, you need to read up on better techniques. I use a lot of really dumb, really simple parsers like the one above because I am used to dealing with machine spew (e.g. a device that spits out JSON, XML, NMEA, etc.) where the data is very unlikely to contain 'coding errors', so a simple run-through-it parser works.
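For the regex route, here is a rough sketch using std::regex from the standard library. The grammar is an assumption made up for the example: a line is exactly one call of the form name("text"), with optional spaces around the pieces. The function name parseCall is hypothetical too.

```cpp
#include <regex>
#include <string>

// Hypothetical grammar for illustration: name("text") and nothing else on the line.
// On a match, name and arg receive the two captured pieces; otherwise they are untouched.
bool parseCall(const std::string& line, std::string& name, std::string& arg)
{
    // Raw string with a custom delimiter, because the pattern itself contains )"
    static const std::regex pattern(R"re(\s*([A-Za-z_]\w*)\s*\(\s*"([^"]*)"\s*\)\s*)re");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return false; // the whole line must match
    name = m[1].str();
    arg = m[2].str();
    return true;
}
```

This takes read("filename") and refuses both read("fubar) and (read()()(filename())), because the pattern spells out exactly one pair of parens and one pair of quotes. The price is that every new form your language allows means another pattern, or a more tangled one.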