Finding the most common character within a string

Oct 24, 2013 at 3:43pm
Hey guys, I have searched high and low for a solution to my problem and it is a little more than frustrating. I am trying to take a string that is within the main function, and write a void function that gives me the most common alpha character used inside the string.

I don't even know where to begin. I am not really asking for someone to write the function, although it would be nice... I am mostly looking for some insight and an idea of where to start.

I keep reading about everyone using arrays but I am not sure how to mix a string and an array together like that as I am not too familiar with arrays yet.

Thanks in advance, any help is much appreciated!
Oct 24, 2013 at 3:51pm
Create an array of counters, one for every possible character, then run through the string and count each occurrence. When that is finished, run through the array and select the character whose counter is largest.

Here is a pair of links that could help:

the concept of array of counters:
http://codeabbey.com/index/task_view/array-counters

selecting max of array:
http://codeabbey.com/index/task_view/maximum-of-array
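
A minimal sketch of the array-of-counters idea described above, not a definitive solution. It assumes only the letters a to z matter, that upper and lower case count as the same letter, and that the character set is ASCII-like (letters are contiguous). The names mostCommonLetter and printMostCommonLetter are made up for illustration.

```cpp
#include <cctype>
#include <iostream>
#include <string>

// Hypothetical helper: returns the most common letter in text (lowercased),
// or '\0' if text contains no letters. A tie goes to the letter nearest 'a'.
char mostCommonLetter(const std::string& text)
{
    int counts[26] = {0};  // counts[0] is for 'a', counts[25] is for 'z'

    // Step 1: run through the string and bump the counter for each letter.
    for (std::string::size_type i = 0; i < text.size(); i++)
    {
        int lower = std::tolower(static_cast<unsigned char>(text[i]));
        if (lower >= 'a' && lower <= 'z')  // assumes contiguous letters (ASCII)
            counts[lower - 'a']++;
    }

    // Step 2: run through the counters and remember the largest one.
    int best = -1;
    for (int i = 0; i < 26; i++)
    {
        if (counts[i] > 0 && (best < 0 || counts[i] > counts[best]))
            best = i;
    }

    return best < 0 ? '\0' : static_cast<char>('a' + best);
}

// The void function the question asks for: it prints instead of returning.
void printMostCommonLetter(const std::string& text)
{
    char c = mostCommonLetter(text);
    if (c == '\0')
        std::cout << "No letters found\n";
    else
        std::cout << "Most common letter: " << c << '\n';
}
```

From main, a string read with std::getline could be passed straight to printMostCommonLetter. Splitting the counting from the printing is only a design choice that makes the counting part easier to check on its own.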
Oct 24, 2013 at 4:31pm
Thanks for the answer rodiongork. I am not too familiar with arrays, but those links help in understanding how they work.

The string is going to be a user input string and it is for a project I am working on.

I am so lost on where to even start because of my noob level when it comes to arrays... Maybe my brain is just on overload right now and I'll figure it out down the road? LOL
Oct 24, 2013 at 7:55pm
The idea is to have a list of how many times each character appears.

Given the string "hello world", each character appears:

a b c d e f g h i j k l m n o p q r s t u v w x y z
      1 1     1       1     1     1         1
                      2     2
                      3

Since an (unsigned) character is a value in 0..255, you only need a table of 256 values:

 
unsigned int counts[ 256 ] = { 0 };

Now, for each (unsigned) character value in the string, just bump the count at that character's index:

for (size_t n = 0; n < s.size(); n++)
  counts[ (unsigned char)s[ n ] ]++;

Now all you need to do is find the largest value in counts[]. The index of that value is the same as the character that appears most often in s.

Good luck!

[edit] BTW, the tutorial on arrays is a good place to start:
http://www.cplusplus.com/doc/tutorial/arrays/
Last edited on Oct 24, 2013 at 7:56pm
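
Putting the two steps from the post above together might look like the following. This is a sketch under stated assumptions: char is 8 bits wide, an empty string should yield '\0', and a tie goes to the character with the lowest value. The name mostFrequentByTable is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the most frequent character in s,
// or '\0' if s is empty. A tie goes to the lowest character value.
char mostFrequentByTable(const std::string& s)
{
    unsigned int counts[256] = {0};  // one counter per possible 8-bit value

    // Count every character; the cast keeps the index in 0..255.
    for (std::size_t n = 0; n < s.size(); n++)
        counts[static_cast<unsigned char>(s[n])]++;

    // Find the index holding the largest count.
    std::size_t best = 0;
    for (std::size_t i = 1; i < 256; i++)
    {
        if (counts[i] > counts[best])
            best = i;
    }

    // The index itself is the character value.
    if (counts[best] == 0)
        return '\0';
    return static_cast<char>(static_cast<unsigned char>(best));
}
```

The string is scanned only once here, and the table scan is a fixed 256 steps, so the cost does not grow with the number of candidate characters.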
Oct 27, 2013 at 2:41pm
Hey guys... After a long night and hard research with a classmate, the solution has been found!

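#include <string>
using std::string;

char mostFrequent(const string& text)
{
    int max = 0;
    char maxCharacter = '\0';  // returned as-is if text has no printable characters

    // Try every printable ASCII character, from ' ' (32) to '~' (126).
    for (char q = ' '; q <= '~'; q++)
    {
        // Count how many times q occurs in text.
        int count = 0;
        for (size_t i = 0; i < text.length(); i++)
        {
            if (text[i] == q)
                count++;
        }

        // Strictly greater, so on a tie the earlier character is kept.
        if (count > max)
        {
            max = count;
            maxCharacter = q;
        }
    }

    return maxCharacter;
}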


This works great, and it covers the full printable ASCII range too (space through '~').

My last question would have to be, how would I get it to display multiple characters? If more than one character is tied for most frequent, it only displays the one closest to the beginning of the ASCII table.
Oct 27, 2013 at 3:41pm
Reconsider my answer.
Nov 7, 2013 at 2:47am
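#include <string>
using std::string;

string mostFrequent(const string& text)
{
    int max = 0;
    string maxCharacter;  // stays empty if text has no printable characters

    // Try every printable ASCII character, from ' ' (32) to '~' (126).
    for (char q = ' '; q <= '~'; q++)
    {
        // Count how many times q occurs in text.
        int count = 0;
        for (size_t i = 0; i < text.length(); i++)
        {
            if (text[i] == q)
                count++;
        }

        if (count > max)
        {
            // New leader: discard anything collected so far.
            max = count;
            maxCharacter = q;
        }
        else if (count > 0 && count == max)
        {
            // Tie with the current leader: keep this character as well.
            // The count > 0 check stops absent characters from being collected.
            maxCharacter += q;
        }
    }

    return maxCharacter;
}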


This displays the most frequent character, and if there is a tie, it displays every tied character.

Thanks for the help!
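
For comparison, the same tie handling can be done with the counting table suggested earlier in the thread. The following is only a sketch: it assumes an 8-bit char and a C++11 compiler, it counts every character value rather than just printable ASCII, and the name mostFrequentAll is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns every character tied for the highest count,
// in order of character value. Returns an empty string for empty input.
std::string mostFrequentAll(const std::string& text)
{
    unsigned int counts[256] = {0};
    unsigned int max = 0;

    // One pass over the text: count each character and track the top count.
    for (char c : text)
    {
        unsigned int n = ++counts[static_cast<unsigned char>(c)];
        if (n > max)
            max = n;
    }

    std::string result;
    if (max == 0)
        return result;  // nothing was counted

    // Collect every character whose count equals the top count.
    for (std::size_t i = 0; i < 256; i++)
    {
        if (counts[i] == max)
            result += static_cast<char>(static_cast<unsigned char>(i));
    }
    return result;
}
```

This version reads the text once instead of once per candidate character, which matters only for long inputs.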
Nov 12, 2013 at 12:48am
This was really helpful for me but I guess I don't understand it fully. I've tried to manipulate it so that it would give me another frequency (2nd most frequent character, 3rd, etc). How would you edit this code to give you something other than the most frequent character? For this example let's say the 2nd most frequent character.
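
One possible way to get the 2nd or 3rd most frequent character is to count everything first and then sort the characters by count. The sketch below assumes an 8-bit char and a C++11 compiler, and it assumes ties are broken by character value so that each rank is exactly one character. The name nthMostFrequent and its rank parameter are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the character with the rank-th highest count
// (rank 1 is the most frequent), or '\0' if rank is 0 or text has fewer
// than rank distinct characters.
char nthMostFrequent(const std::string& text, std::size_t rank)
{
    typedef std::pair<unsigned int, unsigned char> Entry;  // (count, character)

    unsigned int counts[256] = {0};
    for (char c : text)
        counts[static_cast<unsigned char>(c)]++;

    // Keep only the characters that actually occur.
    std::vector<Entry> seen;
    for (std::size_t i = 0; i < 256; i++)
    {
        if (counts[i] > 0)
            seen.push_back(Entry(counts[i], static_cast<unsigned char>(i)));
    }

    // Highest count first; equal counts are ordered by character value.
    std::sort(seen.begin(), seen.end(),
              [](const Entry& a, const Entry& b) {
                  if (a.first != b.first)
                      return a.first > b.first;
                  return a.second < b.second;
              });

    if (rank == 0 || rank > seen.size())
        return '\0';
    return static_cast<char>(seen[rank - 1].second);
}
```

With "hello world" as input, rank 1 gives 'l' (3 occurrences) and rank 2 gives 'o' (2 occurrences). If tied characters should share a rank instead, the sorted list would need to be grouped by count, which this sketch does not do.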
Nov 12, 2013 at 1:42am