How do I find 10 most popular names from HUGE string array, it has like 850 names???this is a school project

void insertInMap(map<string, int>& m, string s){
if(m.find(s) == m.end()){
m[s] = 1;
}
else{
m[s]++;
}
}

void checkAns(map<string, int> &m, string s, int &maxi, string &ans){
if(m[s] > maxi){
maxi = m[s];
ans = s;
}
else if(m[s] == maxi){
if(s < ans){
maxi = m[s];
ans = s;
}
}
}

string Answer(string ans, map<string, int> m, int &rTotal, string Names[CMax])
{
int maxi = 0;

ans = "";


for(int i = 0; i < rTotal; i++)
{
insertInMap(m, *Names);
checkAns(m, *Names, maxi, ans);
}


return ans;
}

void MostPopularNames(string ans, map<string, int> m, string popularNames[10], string Names[CMax])
{
int sk = CMax;
string temporary;
int k = 0;
string* names = new string[sk];

for(int i = 0; i < sk; i++)
{
names[i] = Names[i];
//cout << names[i] << endl;
}

for(int i = 0; i < 10; i++)
{
popularNames[i] = Answer(ans, m, sk, names);

for(int j = 0; j < CMax; j++)
{
if(names[j] == popularNames[i])
{
temporary = names[j];
names[j] = names[sk-1];
//unimportantNames[k] = temporary;
k++;
/*-----------------------------------*/

int* resize_arr = new int[sk + 1];
memcpy(unimportantNames, names, sk);
sk--;
names = resize_arr;
delete[] resize_arr;
/*-----------------------------------*/
}
}

cout << i+1 << ". " << popularNames[i] << endl;
}
}
Last edited on
First, please use code tags when you post code. See http://www.cplusplus.com/articles/jEywvCM9/

What is the question?


PS. 850 is not HUGE. 10 terabytes of input data would be on the large side.
i mean for me its huge cuz its the first time im working with that kind of array size
:)
Plus the question is how to find 10 most popular names from string array
Last edited on
Please follow what keskiverto said in his first sentence by reading the link. You can edit your first post to add them.

Plus the question is how to find 10 most popular names from string array
You appear to be putting this information into a std::map, so I assume you want to then get the 10 highest names out of the std::map, compared through the number of occurrences for the name. What makes this difficult computationally is that an std::map is sorted by its key, not its value, so it would be more efficient to compute the top-10 list as you populate the map. Nevertheless, the following will do this, although it isn't the most efficient way, probably:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
// Example program
#include <iostream>
#include <string>
#include <map>
#include <vector>
#include <utility>
#include <algorithm>
using std::string;
using std::map;
using std::vector;
using std::pair;

struct NamePopularityComparer
{
    // Implements logic for "greater than" for your std::map,
    // ordering it from greatest (most popular) to least (least popular).
    bool operator()(const pair<string, int>& name_A,
                    const pair<string, int>& name_B)
    {
        return (name_A.second > name_B.second);
    }
};

vector<pair<string,int>> top_names(const map<string, int>& name_popularity, const int N)
{
    vector<pair<string, int>> top_n_names(N);
    
    // https://en.cppreference.com/w/cpp/algorithm/partial_sort_copy
    // http://www.cplusplus.com/reference/map/map/begin/
    std::partial_sort_copy(name_popularity.begin(), name_popularity.end(), // source
                           top_n_names.begin(), top_n_names.end(), // destination
                           NamePopularityComparer()); // compare method
                           
    return top_n_names;
}

int main()
{
    map<string, int> name_popularity = {
        { "Alice", 11 },
        { "Bob",  5 },
        { "Eve", 1 },
        { "Frank", 6 },
        { "Charlie", 7 }
    };
    
    // selects and sorts 3 names
    vector<pair<string,int>> top_3_names = top_names(name_popularity, 3);
    for (int i = 0; i < 3; i++)
    {
        std::cout << top_3_names[i].first << " " << top_3_names[i].second  << '\n';   
    }
}
(The iterator to a map<T, U> points to its corresponding pair<T, U> object)

Output:
Alice 11
Charlie 7
Frank 6


___________________________________________

If you don't want to use std::partial_sort_copy, then you would do the following pseudo-code:
given map<string, int>
copy elements into vector<pair<string, int>> or equivalent
sort elements by the pair.second (the int) using std::sort with custom comparer
print top-N elements after sorting
Last edited on
{ "Alice", 11 },
{ "Jack",  5 },
{ "Eve", 1 },
{ "Frank", 5 },
{ "Charlie", 7 }

What is the "top 3" in the above?
Is it {Alice, Charlie, Frank},
{Alice, Charlie, Jack},
{Alice, Charlie, Frank, Jack}?,
or what?
Those are all valid answers for different reasons. The giver of the task should speficy what they expect in such special case, and then you code to make it happen.
Here's what trips people up with this:
- when reading the data, you want to look it up by string to increment the count.
- when sorting the data, you want to sort by count.

With UNIX command line tools, it's a one-liner:
sort | uniq -c | sort -n | tail -10


A simple way to do it in C++ would be:
- read the data into a map<string,int> like you are now to get the counts.
- go through the map and insert into a multi_map<int, string>.
- print the last 10 entries in the multi_map.
Im not using the map because i wanna. Im using map because i tried to apply an example of given code to my program. Okay i will explain what i need to do. Teacher gave me a text file with whole school's students names and he asked me to make a program finds top 10 most frequently used student's names in school. And I dunno how to do it. Ganado, i think that might be a good solution if I just edited your code a bit, but i dont know why your given code of declaration of map in main() is not working. But thank you for your answer. Dhayden, could you explain your way of doing a solution a little bit more in detail. Thanks you all! :) I'm new in cplusplus forums so sorry if I did something wrong :)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#include <iostream>
#include <fstream>
#include <map>
#include <string>
#include <algorithm>
using namespace std;

using PR = pair<string,int>;

int main()
{
   map<string,int> M;
   ifstream in( "names.dat" );
   for ( string name; in >> name; ) M[name]++;

   for ( int i = 1; i <= 10 && !M.empty(); i++ )
   {
      auto it = max_element( M.begin(), M.end(), []( PR a, PR b ){ return a.second < b.second || ( a.second == b.second && a.first > b.first ); } );  // deliberate >
      cout << (*it).first << ' ' << (*it).second << '\n';
      M.erase( it );
   }
}

but i dont know why your given code of declaration of map in main() is not working.
I dunno, probably need to compile in at least C++11, if not C++14. How old is your compiler? (If it's GCC, try adding -std=c++11 or -std=c++14 to your compile flags.)

You can change the code to just call ["name"] = count a few times instead. It's just an example to fill the map will information, which you already did in your actual code by the looks of it.
Last edited on
Thank you all for help! You all really helped me! :)
Here's what I meant.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#include <iostream>
#include <map>
#include <string>

using std::map;
using std::multimap;
using std::string;
using std::cin;
using std::cout;

int main()
{
    map<string, unsigned> name2Count;
    multimap<unsigned, string> count2Name;
    string name;

    // Read the names and increment the counts
    while (cin >> name) {
	++name2Count[name];	// inserts name in map if not already there
    }
    
    // Now go through the name->count map and insert into the
    // count->name map. This is just to get them sorted by count
    for (auto & item : name2Count) {
	multimap<unsigned, string>::value_type val(item.second, item.first);
	count2Name.insert(val);
    }

    // Pull off the last 10 items. Notice that I'm using a reverse iterator
    // (rbegin() and rend()) to start at the end and go backwards.
    // Also the loop can exit if you run out of items or if you print 10
    unsigned count = 0;
    for (auto iter = count2Name.rbegin(); iter != count2Name.rend(); ++iter) {
	cout << iter->second << '\t' << iter->first << '\n';
	if (++count >= 10) break;
    }
}

Topic archived. No new replies allowed.