STL Map: string vs int as key type

Oct 16, 2009 at 10:11am
Hi,
I was wondering when finding something in an STL map if there would be a large performance difference when using an int vs string as a key type?
Could it be a benefit maybe to hash the string and then use that as the lookup key? If so, what would be a good hash function to use?
Oct 16, 2009 at 10:59am
It depends on the size of the map, and the number of comparisons you expect to make. It may or may not be worth the extra hasle.
Oct 16, 2009 at 11:34am
Worst case I'm thinking 10000 entries.
Oct 16, 2009 at 11:54am
I wouldn't worry about it yet. If it turns out to be a perfomance bottleneck then I'd take another look.
Oct 16, 2009 at 11:59am
It also depends on how long the strings are. operator== for string in GCC isn't terribly efficient because
it calls string::compare(), which works like strcmp(), so it has extra work to do to determine less-than and
greater-than.

Probably it is more efficient to hash the strings since you only have to hash them once but you have
to compare a lot. The question is: is it worth it? I think the answer is that you should try the simplest
algorithm first (no hashing) and if that proves to be a bottleneck then optimize. Architected properly,
it should be possible to make this change to your code without much widespread impact to your
code.
Oct 16, 2009 at 12:18pm
Thanks kbw and jsmith for your responses. I'll follow your advice and leave the optimizing until the end.
However if you had to implement a hashing function for this purpose, do you know of a generic one that could be used to make sure hashes are always unique for each string value? Would that be possible, at least with a very low probability of a hash reoccurring for different string values?
Oct 16, 2009 at 12:47pm
Oct 16, 2009 at 3:03pm
I think boost has a variety of CRC algorithms you could use.

Or you could try SHA-1 or even MD5 if security is not a concern.

None are "perfect".
Topic archived. No new replies allowed.