C++ map VS Java hash map

Nov 22, 2010 at 3:42pm
Hi, folks:

My friend and I have HUGE amount of text data to deal with.

Now we decide to read in the data and store them in map for easy access in the program (array or vector is really not an option, too slow).

My friend uses hash map in Java while I use map in C++. I guess Java's Hash map has time complexity of O(1), right? In C++, red-black tree based map has time complexity of O(logN).

A test shows my program runs significantly more slowly than his.

Is it possible that C++'s map is the reason?

If yes, is there something in C++ that can compete? Ideally, STL should provide support and any examples of use are greatly appreciated.
Nov 22, 2010 at 3:55pm
std::map is certainly not what you want since it sorts its content. But there is a std::hash_map

http://www.sgi.com/tech/stl/hash_map.html
Nov 22, 2010 at 7:16pm
.. except it is not standard.

Better use std::tr1::unordered_map from Boost. It is going to be included into the STL, when C++0x finally comes out.

Last edited on Nov 22, 2010 at 7:19pm
Nov 23, 2010 at 2:11am

A test shows my program runs significantly more slowly than his.


Actually I have the chance to test Java map vs C++ map with 1.2 million entries in them. The performance is similar. The difference is Java map consume much more memory than C++ map.

So your implementation for Java and C++ must be "corresponding" to make a clear judgement on their performance respectively.
Nov 23, 2010 at 3:59pm

Actually I have the chance to test Java map vs C++ map with 1.2 million entries in them. The performance is similar


Depends on how you test them. For such big collections memory locality plays a huge role, and they have much different memory locality characteristics.
Dec 6, 2010 at 3:52pm
To all: thank you guys, the problem has been solved. I switched to hash_map and the performance improved by almost 50%.

To sohguanh: Thanks for the testing. I haven't tried that, but it sounds interesting, since their implementations seem to have different theoretical time complexity.

To rapidcoder: I second the idea.
Topic archived. No new replies allowed.