To use regex or regular string functions/strtok?

Hello,

I have been researching this past week in my spare time for a pet project (I am a university student trying to learn on my own). I have a system in which I want to be able to type a text string and have my program decide what to do with it. Think typing "Computer, open Microsoft Word," and the computer will then open Microsoft Word.

All I need to do is take a text string (sometimes quite a long one), detect that I am addressing my system with the word "Computer", and pick out the relevant pieces. For example, "Computer turn the volume up to 99" should set my volume to 99%. I'd like to make it robust enough that I could feed it an entire essay, have it pick out what I'm looking for, and still run quickly.

SO, the real question is, should I use regular expressions? Or should I use things like strtok and tokenize it all and use a decision tree to do as I wish?

If regex, would you recommend Boost.Regex (regex++)? Or should I go with PCRE or GNU regex? Which is best suited? I have read about them all and nothing seems to give definitive pros and cons.

Any help is much appreciated!
Use regex if you're going to search for (non-trivial) patterns in strings.

But it sounds like you need to start off by tokenizing (and then use a decision tree). This should be easy enough to do, but I wouldn't use strtok as it's evil (it destroys the input data by overwriting the delimiters with null characters). Either write your own, or use Boost.Tokenizer.
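As a rough illustration of the "write your own" option, here is a minimal sketch of a tokenizer that leaves the input untouched. The function name tokenize and the rule "a token is a run of letters or digits" are assumptions made for this example, not something the thread specifies.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits `input` into runs of alphanumeric characters.
// The input is taken by const reference, so unlike strtok it is never modified.
std::vector<std::string> tokenize(const std::string& input) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : input) {
        if (std::isalnum(ch)) {
            current += static_cast<char>(ch);   // still inside a token
        } else if (!current.empty()) {
            tokens.push_back(current);          // a separator ends the token
            current.clear();
        }
    }
    if (!current.empty()) {
        tokens.push_back(current);              // flush the last token
    }
    return tokens;
}
```

With this rule, punctuation such as the comma in "Computer, open Microsoft Word" is treated as a separator, so the tokens come out as plain words.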
Thanks for the reply!

So you're saying it would be best to tokenize the input strings before I apply regex?

I may be misunderstanding: are you saying to try tokenizing first? From what I understood about regex, the input should be a string that isn't tokenized, correct?

Also, at what point would you say a pattern is non-trivial?

Thanks again!
Well, trivial cases are things like substrings. I would not use regex in this kind of case, just string::find, etc. (I have encountered code using regex for this purpose, but it is too costly to justify.)
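For the trivial case, a sketch along these lines is probably all that is needed. The helper names contains and starts_with are made up for this example; std::string::find and std::string::rfind are the standard library calls doing the work.

```cpp
#include <string>

// Hypothetical helper: true if `needle` occurs anywhere in `text` (case-sensitive).
bool contains(const std::string& text, const std::string& needle) {
    return text.find(needle) != std::string::npos;
}

// Hypothetical helper: true if `text` begins with `prefix`.
// rfind(prefix, 0) only considers a match starting at position 0.
bool starts_with(const std::string& text, const std::string& prefix) {
    return text.rfind(prefix, 0) == 0;
}
```

For instance, starts_with(input, "Computer") would be one cheap way to check whether the system is being addressed before doing any further work.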

I don't think regex should even figure in your code. First you tokenize. Then you look up the tokens (i.e. words) in your dictionary to see if they're commands, and if they are, you act accordingly. I would have thought string comparison would be what's needed here.
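To make the tokenize-then-look-up idea concrete, here is a minimal sketch. The Command enum, the dictionary contents and the function names are all assumptions chosen to match the examples in the question; a real program would act on the command rather than just return it.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical set of commands for this sketch.
enum class Command { None, Open, Volume };

// Lowercase a copy of the word so the lookup is case-insensitive.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns the first known command that appears after the word "computer",
// or Command::None if the system was not addressed or no command was found.
Command find_command(const std::string& sentence) {
    static const std::unordered_map<std::string, Command> dictionary = {
        {"open", Command::Open},
        {"volume", Command::Volume},
    };

    std::istringstream stream(sentence);   // splits on whitespace
    std::string word;
    bool addressed = false;
    while (stream >> word) {
        // Drop trailing punctuation, e.g. "Computer," becomes "Computer".
        while (!word.empty() && std::ispunct(static_cast<unsigned char>(word.back()))) {
            word.pop_back();
        }
        word = to_lower(word);
        if (word == "computer") {
            addressed = true;
            continue;
        }
        if (!addressed) {
            continue;                      // ignore text before "computer"
        }
        auto it = dictionary.find(word);   // plain string comparison, no regex
        if (it != dictionary.end()) {
            return it->second;
        }
    }
    return Command::None;
}
```

The lookup is a hash-table search per word, so even an essay-length input is a single pass over the text.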

Do you need to actually pattern match? If so, what pattern do you need to match?

There is no restriction on whether you feed a whole sentence or a single word/token to regex. It depends on what you're trying to match.
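If a real pattern does turn up, both usages look much the same. The sketch below uses std::regex from the C++11 standard library rather than any of the libraries named in the question, purely to keep the example dependency-free; the function names and the patterns are assumptions for illustration.

```cpp
#include <regex>
#include <string>

// Whole-sentence use: search anywhere in the sentence for "volume"
// followed (after any non-digits) by a number of up to three digits.
// Returns the number, or -1 if the pattern is not found.
int parse_volume(const std::string& sentence) {
    static const std::regex pattern(R"(volume\D*(\d{1,3}))", std::regex::icase);
    std::smatch match;
    if (std::regex_search(sentence, match, pattern)) {
        return std::stoi(match[1].str());
    }
    return -1;
}

// Single-token use: regex_match requires the entire token to fit the pattern.
bool is_number(const std::string& token) {
    static const std::regex digits(R"(\d+)");
    return std::regex_match(token, digits);
}
```

regex_search scans the whole input for a match, while regex_match succeeds only if the complete string fits, which is why the first suits sentences and the second suits individual tokens.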