Map iteration

Feb 10, 2014 at 3:17pm

Good evening, and thank you for your time,
I am very junior (college level), and I seem to have problems iterating through
a map that I have to use in my project(lecturer tells me he isn't allowed to help even though it is not covered in the material).

The map structure looks as follows:

map <string, map <string, aClass> > myMap;

aClass exists of a 2 strings and a double.

I was wondering what the iterator should look like for the inner loop (the second map <string, aClass>, and how exactly i would refrence the data in aClass.

Thank you again.

Feb 10, 2014 at 4:00pm

JLBorges (13770)

#include <iostream>
#include <string>
#include <map>

struct aClass { std::string a, b ; double d ; };

int main()
{
    std::map< std::string, std::map< std::string, aClass > > myMap = 
    {
        {
            "abc",
            {
                { "defg", { "one", "two", 1.2 } },
                { "hijk", { "three", "four", 3.4 } },
                { "lmno", { "five", "six", 5.6 } }
            }
        },
        
        {
            "pqr",
            {
                { "stuv", { "seven", "eight", 7.8 } },
                { "wxyz", { "nine", "zero", 9.0 } }
            }
        }
    };
    
    // iterate using range-based loop
    {
        for( const auto& sm_pair : myMap )
        {
            std::cout << sm_pair.first << '\n' ;
            for( const auto& sc_pair : sm_pair.second )
            {
                std::cout << "     " << sc_pair.first << '{' << sc_pair.second.a << ',' 
                          << sc_pair.second.b << ',' << sc_pair.second.d << "}\n" ;    
            }
        }
    }
    
    std::cout << "----------------------\n" ;
    
    // iterate C++98 style
    {
        typedef std::map< std::string, std::map< std::string, aClass > >::iterator outer_iterator ;
        typedef std::map< std::string, aClass >::iterator inner_iterator ;
        
        for( outer_iterator outer = myMap.begin() ; outer != myMap.end() ; ++outer )
        {
            std::cout << outer->first << '\n' ;
            for( inner_iterator inner = outer->second.begin() ; inner != outer->second.end() ; ++inner )
            {
                std::cout << "     " << inner->first << '{' << inner->second.a << ',' 
                          << inner->second.b << ',' << inner->second.d << "}\n" ;    
            }
        }
    }
}

Edit & run on cpp.sh

http://coliru.stacked-crooked.com/a/3f83c8a31fdea1ca

Feb 10, 2014 at 4:04pm

coder777 (8447)

The iterator data type is pair. See:

http://www.cplusplus.com/reference/utility/pair/

example:

myMap["foo"]["bar"] = aClass();

...

map <string, map <string, aClass> >::iterator it = myMap.begin();
if(it != myMap.end())
{
  if(it->first == "foo")
  {
    map <string, aClass>::iterator it2 = it->second.begin();
    if(it2 != it->second.end())
    {
      if(it2->first == "bar")
        it2->second.FunctionCallOfaClass();
    }
  }
}

Feb 10, 2014 at 6:52pm

th3c0t0nb4ll (2)

Thank you guys for the time and effort, I appreciate it greatly.

Feb 10, 2014 at 7:09pm

IWishIKnew (1364)

A map is a list that can have a user-defined key type, and user defined data type, as opposed to an integer type and a user-defined data type.

Example:

//you could make the key_type a string, vector<>, etc... if you wanted to, it would work the same.
typedef unsigned int key_type;

typedef std::string data_type;
map<key_type, data_type> a_new_map;

You reference it using the key type. In your usual arrays and vectors, you use an integer, or unsigned integer. But with a map, you can use a string, vector, or any other data type with the appropriate operator (in this case, operator==() and operator=()) member function.

Example:

struct user_info{
//arbitrary user information...  just as an example
};

map<std::string, user_info> database({});
user_info user;
//set user to whatever...
database["John"] = user; //add the name and the information

//now we want to retrieve information...
if(database.find("John") != database.end())
{
    //display information
    user = database["John"];
}

This is useful for finding information quickly. I did a comparison between maps and iterating through a vector. The difference is very noticable: vector of 200,000 strings took 20 minutes to complete iteration, while the map only took 5-7 minutes. BIIIGG difference!

This is no substitute for hashing though, so keep in mind that if you require a reliable and effective hash, don't use maps!

So, how do we iterate through a STL object? Well, like we do all STL objects: we declare it's iterator. Objects that I commonly use iterators for are std::strings, std::vectors, and std::maps. There are many others, but these I use most often.

How do we declare an iterator though? Well, an iterator is declared by calling it through it's containing object.

Example:

std::string temps("HELLO WORLD!");
for(std::string::iterator it = temps.begin(); it != temps.end(); it++)
{
    cout<< *it;
}

If you iterate through an object, you should use the .end() and .begin() member functions of that object. These functions return the beginning iterator, and end iterator respectively. The begin iterator returns an iterator pointing to the first valid index. The end iterator usually points to nothing (or NULL), and is returned when .find() does not find any matches.

For templated objects, you will have to give the template type. Lists, maps and vectors are templated objects.

For example:

std::vector<int> alist({});
for(short x = 0; x < 100; x++) alist.push_back(x);

for(std::vector<int>::iterator it = alist.begin(); it != alist.end(); it++)
{
    cout<< *it<< endl;
}

Iterators are objects too. You can declare them anywhere, but they have to be of the type you declare them:

std::string s("Hello World");
std::vector<char> vec({'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd'});

std::vector<char>::iterator v_it(vec.begin());

/* WRONG WAY */
for(; v_it != s.end(); it++); //v_it is a vector iterator, not a string iterator!

/* Right way: */
for(; v_it != vec.end(); it++)
{
    cout<< *it;
}

So, why use iterators? Because they provide consistency and provide a base for some algorithms. For some STL objects they are also more efficient than indices! It depends on the object you use the iterator for, and what you do with the iterator.

Best of luck.

Last edited on Feb 11, 2014 at 10:41pm

Feb 10, 2014 at 10:53pm

naraku9333 (2163)

@IWishIKnew
Sorry to nit pick, but std::map does not (AFAIK) do any hashing. std::unordered_[map|set] and std::unorded_multi[map|set] are hashing containers. std::map and std::set are (usually) implemented as binary trees.

The difference is very noticable: vector of 200,000 strings took 20 minutes to complete iteration, while the map only took 5-7 minutes.

I'm curious how you tested this, I can't see either container taking that long to iterate 200,000 elements.

Feb 11, 2014 at 1:44am

JLBorges (13770)

> The difference is very noticable: vector of 200,000 strings took 20 minutes to complete iteration,
> while the map only took 5-7 minutes.

>> I'm curious how you tested this, I can't see either container taking that long to iterate 200,000 elements.

I had some time to kill, so I decided to actually measure it on coliru.

200,000 is laughably small, so tests were done with 32 million random strings on a std::vector<std::string> and 2 million random strings on a std::set<std::string>. In both cases, average string size was 12 characters.

Summary: the vector is about 16 times faster to iterate over compared to a set (locality of reference)

vector size: 32000000
processor time taken (iterate): 0.25 seconds

set size: 2000000
processor time taken (iterate): 0.26 seconds

Note: iterating over std::map<std::string,X> would be slightly slower than iterating over a std::set<std::string>

The code and complete output:
vector http://coliru.stacked-crooked.com/a/3f7ffd28ea18cf5b
set http://coliru.stacked-crooked.com/a/e097f290bb7b5ae2

Feb 11, 2014 at 4:54pm

IWishIKnew (1364)

@Naraku
I never said maps were hashes. To a novice, they appear to be very fast, almost like hashes. That's why I said they were no substitute for hashes.

-------------------------------------

On the note of how I tested the 200,000 iterations:

I did not just iterate through. This was a test to see the difference in efficiency between iterators and the map<>.find() function.

To do this, I had 2 vectors of 200,000 strings, and 2 maps of 200,000 strings. For both the maps and the vectors, I iterated through all of the elements, "searching" for each one in it's counterpart. Basically, I compared the two containers.

Since vectors don't have a .find() function, I had to iterate through each element to find somthing.

With maps, it was a simple matter of .find().

example:

//vectors:
vector<string> list1(make_list()), list2(make_list());
for(vector<string>::const_iterator it = list1.begin(); it != list1.end(); ++it)
{
     for(vector<string>::const_iterator it2 = list2.begin(); it2 != list2.end(); ++it2)
     {
          if(*it == *it2)
          {
               //do somthing, whatever
               it2 = list2.end(); //break.  We found the element
          }
     }
}

The best efficiency would be factorial(200,000) (because you can't assume both lists are the same, or that they're in any order). That's horrendous!.

//maps
map<string, unsigned int> list1(make_list()), list2(make_list());
for(map<string, unsigned int>::const_iterator it = list1.begin(); it != list1.end(); ++it)
{
     if(list2.find(it->first))
     {
          //do somthing, whatever
     }
}

In this setup, maps performed significantly better than vectors, because we don't have to iterate to find the specified element.

Last edited on Feb 11, 2014 at 5:00pm

Feb 11, 2014 at 7:27pm

naraku9333 (2163)

I never said maps were hashes.

A map is a hashed list ...

Since vectors don't have a .find() function, I had to iterate through each element to find somthing.

I think using std::find on the vector would be more of a fair test, or even better std::sort and std::binary_search.

Feb 11, 2014 at 7:40pm

JLBorges (13770)

Test: Populate a container with 200,000 strings.
And then search for every single one of of those 200,00 strings.
Average string length: 12 characters.

First the results:

vector
---------------------------------
          #found: 200000 strings
time to populate: 0.110 secs
    time to find: 0.180 secs
      time total: 0.290 secs


set
---------------------------------
          #found: 200000 strings
time to populate: 0.250 secs
    time to find: 0.200 secs
      time total: 0.450 secs


unordered_set
---------------------------------
          #found: 200000 strings
time to populate: 0.150 secs
    time to find: 0.070 secs
      time total: 0.220 secs

Then, the code:

#include <iostream>
#include <vector>
#include <set>
#include <unordered_set>
#include <cstdlib>
#include <ctime>
#include <string>
#include <algorithm>

void print_results( std::size_t nfound, std::clock_t begin_populate, std::clock_t end_populate, std::clock_t end_find )
{
    constexpr double clocks_per_sec = CLOCKS_PER_SEC ;
    std::cout << "---------------------------------\n"
              << "          #found: " << nfound << " strings\n"
              << "time to populate: " << ( end_populate - begin_populate ) / clocks_per_sec << " secs\n"
              << "    time to find: " << ( end_find - end_populate ) / clocks_per_sec << " secs\n"
              << "      time total: " << ( end_find - begin_populate ) / clocks_per_sec << " secs\n\n\n" ;
}

int main()
{
    constexpr char alphabet[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                "1234567890!@#$%^&*()_+`~[]{};':/?,.<>" ;
    constexpr std::size_t N = sizeof(alphabet) - 1 ;

    constexpr std::size_t MINSZ = 4 ;
    constexpr std::size_t MAXSZ = 20 ;
    constexpr std::size_t TESTSZ = 1000 * 200 ;

    std::srand( std::time(0) ) ;
    std::cout << std::fixed ;
    std::cout.precision(3) ;

    // prepare test data set of 200,000 strings (average size: 12 characters)
    std::vector<std::string> test_data_set ;
    test_data_set.reserve(TESTSZ) ;
    std::string str ;
    while( test_data_set.size() < TESTSZ )
    {
        str.resize( std::rand() % (MAXSZ-MINSZ+1) + MINSZ ) ;
        for( char& c : str ) c = alphabet[ std::rand() % N ] ;
        test_data_set.push_back( std::move(str) ) ;
    }

    // insert into vector, and then find every string
    {
        std::vector<std::string> vector ;
        vector.reserve( test_data_set.size() ) ;

        auto begin_populate = clock() ;
        for( const std::string& s : test_data_set ) vector.push_back(s) ;
        std::sort( vector.begin(), vector.end() ) ;
        auto end_populate = std::clock() ;

        std::size_t nfound = 0 ;
        for( const std::string& s : test_data_set )
        {
            auto iter = std::lower_bound( vector.begin(), vector.end(), s ) ;
            // check if we actually found the string by an equality comparison
            if( iter != vector.end() && *iter == s ) ++nfound ;
        }
        auto end_find = std::clock() ;

        std::cout << "vector\n" ;
        print_results( nfound, begin_populate, end_populate, end_find ) ; 
    }

    // insert into set, and then find every string
    {
        std::set<std::string> set ;

        auto begin_populate = clock() ;
        for( const std::string& s : test_data_set ) set.insert(s) ;
        auto end_populate = std::clock() ;

        std::size_t nfound = 0 ;
        for( const std::string& s : test_data_set )
        {
            auto iter = set.find(s) ;
            if( iter != set.end() ) ++nfound ;
        }
        auto end_find = std::clock() ;

        std::cout << "set\n" ;
        print_results( nfound, begin_populate, end_populate, end_find ) ; 
    }

    // insert into unordered set, and then find every string
    {
        std::unordered_set<std::string> set ;

        auto begin_populate = clock() ;
        for( const std::string& s : test_data_set ) set.insert(s) ;
        auto end_populate = std::clock() ;

        std::size_t nfound = 0 ;
        for( const std::string& s : test_data_set )
        {
            auto iter = set.find(s) ;
            if( iter != set.end() ) ++nfound ;
        }
        auto end_find = std::clock() ;

        std::cout << "unordered_set\n" ;
        print_results( nfound, begin_populate, end_populate, end_find ) ; 
    }
}

Edit & run on cpp.sh

Finally the link: http://coliru.stacked-crooked.com/a/d8541bbc089c0569

Feb 11, 2014 at 10:39pm

IWishIKnew (1364)

Oh, mabey i did call it a hashed list..

Anyway,I didn't know that std::vector had a find function. In any case, .find is still quicker than iterating through each element to find something.

Last edited on Feb 11, 2014 at 10:40pm

Feb 12, 2014 at 12:36am

naraku9333 (2163)

Anyway,I didn't know that std::vector had a find function.

It doesn't, I was referring to http://www.cplusplus.com/reference/algorithm/find/

In any case, .find is still quicker than iterating through each element to find something.

http://ideone.com/ZChzNG

Feb 12, 2014 at 4:45am

JLBorges (13770)

std::binary_search() does not have functionality equivalent to std::find() or the find() member of an associative container. std::find() locates the searched for element and returns an iterator that can be used to access the element that was found. std::binary_search() merely reports if the searched for element exists; it returns either true or false.

The binary search algorithm which gives equivalent functionality in logarithmic time is std::lower_bound()

Feb 12, 2014 at 5:05am

naraku9333 (2163)

@JLBorges
You are right, I should demonstrate equivalent functionality. Here is my previous example edited to use std::lower_bound and the size of the data increased to 2,000,000 strings. http://ideone.com/dCBnL3

Feb 12, 2014 at 6:16am

JLBorges (13770)

Essentially, this is the difference:

1
2

auto iter = set.find(s) ;
if( iter != set.end() ) { /* found; use iter to access the element */ }

vs.

1
2
3

auto iter = std::lower_bound( vector.begin(), vector.end(), s ) ;
// check if we actually found the string by an equality comparison
if( iter != vector.end() && *iter == s ) { /* found; use iter  to access the element*/ }

Topic archived. No new replies allowed.