Hi,
I was wondering when finding something in an STL map if there would be a large performance difference when using an int vs string as a key type?
Could it be a benefit maybe to hash the string and then use that as the lookup key? If so, what would be a good hash function to use?
It also depends on how long the strings are. operator== for string in GCC isn't terribly efficient because
it calls string::compare(), which works like strcmp(), so it has extra work to do to determine less-than and
greater-than.
Probably it is more efficient to hash the strings since you only have to hash them once but you have
to compare a lot. The question is: is it worth it? I think the answer is that you should try the simplest
algorithm first (no hashing) and if that proves to be a bottleneck then optimize. Architected properly,
it should be possible to make this change to your code without much widespread impact to your
code.
Thanks kbw and jsmith for your responses. I'll follow your advice and leave the optimizing until the end.
However if you had to implement a hashing function for this purpose, do you know of a generic one that could be used to make sure hashes are always unique for each string value? Would that be possible, at least with a very low probability of a hash reoccurring for different string values?