counting common char


rguy2001 (2)

#include <fstream>
#include <iostream>
#include <string>
#include <cctype>

using namespace std;

void initialize(int&, int[]);
void lineRead(ifstream&, char&, int[]);
void count(char, int[]);
void print(int, int[]);

void initialize(int& loc, int list[])
{
loc = 0;
for (int i = 0; i < 26; i++)
list[i] = 0;

}

void lineRead(ifstream& in, char& ch, int list[])
{
while (ch != '\n')
{
count(ch, list);
in.get(ch);
}
}

void count(char ch, int list[])
{
ch = toupper(ch);
int index = static_cast<int>(ch)-static_cast<int>('A');
if (0 <= index && index < 26)
list[index]++;
}

int main()
{
int letter[26];
char ch;
int line;

fstream f("C:\letter_count.txt", fstream::in);

string mostFrequent(string text)
{
int max = 0;
int count = 0;
string maxCharacter;
for (char q = ' '; q <= '~'; q++)

{
count = 0;
for (int i = 0; i<text.length(); i++)
{
if (text[i] == q)
count++;
}

if (count == max)
{
maxCharacter += q;
}

if (count>max)
{
max = count;
maxCharacter = q;
}
}

return maxCharacter;
}
cout <<"The most common letter is %c with = %d occurrences", maxChar, maxCount);

}

I need help finding the most common character in a text file. My code produces a lot of errors and I can't figure out why. Please help.

Duthomhas (13212)

If your code is confusing you then it is wrong. I recommend you start over.

The "most common character" is the one that appears more than any other character. The only way you can know that is by counting it.

The thing you use to count the number of times things appear is called a histogram. A histogram is simply a list of things and the number of times they appear.

For example, in:

"Hello everyone!"

We get the following histogram:
' '  1
'!'  1
'H'  1
'e'  4
'l'  2
'n'  1
'o'  2
'r'  1
'v'  1
'y'  1

The largest count is 4, so 'e' is the most common character in the given string.
You can also see that there may be ties -- there might not be only one most common character.

The hard part is actually maintaining a histogram for arbitrary data. For a character, you could easily use an array of 128 elements (one for each ASCII character), all initialized to zero. Then the character itself is an index into the histogram.

int histogram[ 128 ] = { 0 };

// s is the std::string being examined
for (unsigned char c : s)
{
  if (c < 128)          // ignore anything outside the ASCII range
    histogram[c] += 1;
}
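
To make the counting loop above concrete, here is a minimal sketch that also handles ties. The function name most_common_chars is made up for illustration, and the table size of 256 assumes an 8-bit char, which is not guaranteed everywhere.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): returns every character that
// shares the highest count in `text`, in ascending character-code order.
std::string most_common_chars(const std::string& text)
{
    int histogram[256] = {0};   // one counter per byte value (assumes 8-bit char)
    int max_count = 0;

    for (char c : text)
    {
        // char may be signed, so convert before using it as an index
        const unsigned char index = static_cast<unsigned char>(c);
        histogram[index] += 1;
        if (histogram[index] > max_count)
            max_count = histogram[index];
    }

    std::string winners;
    if (max_count == 0)
        return winners;         // empty input: nothing to report

    // Collect every character whose count equals the maximum (ties included).
    for (std::size_t i = 0; i < 256; ++i)
        if (histogram[i] == max_count)
            winners += static_cast<char>(i);

    return winners;
}
```

Calling most_common_chars("Hello everyone!") should give back a string holding just 'e', and a tied input such as "aabb" should give back both letters.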

Another way to do it is to simply sort your string. For example, our string above sorts to:

" !Heeeellnoorvy"

We can see that the longest run of equal characters is 'e': there are four in a row. Finding the longest run is very much like your 'find the minimum or maximum of an array' homework. You might find this method a little more difficult than the simple array histogram, so I recommend you use the first method, but I present it here so that you can see that there is more than one way to do it.
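
For anyone curious what the sorting approach might look like, here is a rough sketch. The name most_common_by_sorting is invented for this example, and returning '\0' for an empty string is just one possible convention.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: sorts a copy of `text` and returns the character with
// the longest run of equal neighbours. On a tie, the character that comes
// first in sorted order wins. Returns '\0' for an empty string.
char most_common_by_sorting(std::string text)
{
    if (text.empty())
        return '\0';

    std::sort(text.begin(), text.end());

    char best = text[0];
    std::size_t best_run = 1;   // longest run seen so far
    std::size_t run = 1;        // length of the run ending at position i

    for (std::size_t i = 1; i < text.size(); ++i)
    {
        run = (text[i] == text[i - 1]) ? run + 1 : 1;
        if (run > best_run)
        {
            best_run = run;
            best = text[i];
        }
    }
    return best;
}
```

The parameter is taken by value on purpose, so the caller's string is left unsorted.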

Hope this helps.

rguy2001 (2)

I have not used the method mentioned above, but I need the program to read a text file, consider all characters except spaces, and find the most common character in the text.

sample output

the most frequent character is: I

Duthomhas (13212)

Well, unless you use a histogram of some kind (such as an array indexed by char as I suggested, or a std::map, etc., or indirectly as the sort would do) then you are not actually counting the data. And unless you count your data you cannot know which is most frequent.

There really is no way around it.
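
As a sketch of the std::map variant mentioned above (count_chars is a made-up name, not something from the standard library):

```cpp
#include <map>
#include <string>

// Hypothetical helper: builds a character -> count table with std::map.
// Only characters that actually occur get an entry.
std::map<char, int> count_chars(const std::string& text)
{
    std::map<char, int> histogram;
    for (char c : text)
        ++histogram[c];   // operator[] value-initializes a missing count to 0
    return histogram;
}
```

A map avoids having to pick an array size up front, at the cost of a lookup per character. Finding the most frequent entry is then a single pass over the map looking for the largest count.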

Dealing with a file is no different than dealing with a string; the characters are simply coming from a file instead of a string:

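int histogram[ 128 ] = { 0 };

// f is an open std::ifstream
char c;
while (f.get(c))
{
  const unsigned char u = static_cast<unsigned char>(c);
  if (u < 128)          // ignore anything outside the ASCII range
    histogram[u] += 1;
}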
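
Putting the pieces together for the stated requirement (read the text, skip spaces, report the most frequent character), one possible sketch follows. The function names are invented for illustration, the 256-entry table assumes an 8-bit char, and ties are resolved in favour of the lowest character code, which is an arbitrary choice.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads every character from `in`, skips whitespace,
// and returns the most frequent character (lowest code wins a tie).
// Returns '\0' if the stream holds no non-whitespace characters.
char most_frequent_in_stream(std::istream& in)
{
    int histogram[256] = {0};   // assumes 8-bit char
    char c;
    while (in.get(c))
    {
        const unsigned char u = static_cast<unsigned char>(c);
        if (std::isspace(u))
            continue;           // spaces, tabs and newlines are not counted
        ++histogram[u];
    }

    int best = 0;
    for (int i = 1; i < 256; ++i)
        if (histogram[i] > histogram[best])
            best = i;

    return histogram[best] > 0 ? static_cast<char>(best) : '\0';
}

// Convenience wrapper for in-memory text, handy for trying the function out.
char most_frequent_in_text(const std::string& text)
{
    std::istringstream in(text);
    return most_frequent_in_stream(in);
}
```

Because the first function takes a std::istream&, an opened std::ifstream can be passed to it directly.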

Oh, and JSYK, some loser is reporting all my posts, so you can safely ignore that.

closed account (48T7M4Gy)

I've been meaning to write a program along these lines for a long time, as it has a lot of applications. It is close to, though not exactly, what you want.

#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>

using std::ifstream;
using std::string;
using std::cout;
using std::endl;
using std::getline;
using std::setw;

int main()
{
    string line = "";
    int frequency[128] = {0};
    
    ifstream source( "count.txt" );
    
    if (source.is_open())
    {
        while ( getline( source, line) )
        {
            cout << line << endl;
            
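            for (string::size_type i = 0; i < line.length(); i++)
            {
                // char may be signed: convert before indexing, skip non-ASCII bytes
                const unsigned char uc = static_cast<unsigned char>(line[i]);
                if (uc < 128) frequency[uc]++;
            }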
        }
        
        for (int i = 0; i < 128; i++)
        {
            string bar(frequency[i]/30, '*');
            cout << setw(4) << i << setw(3) << (char)i << setw(4) << frequency[i] << ' ' << bar << endl;
        }
        
        source.close();
    }
    
    return 0;
}

JLBorges (13770)

#include <iostream>
#include <fstream>
#include <limits>
#include <string>
#include <algorithm>

int main ()
{
    const char* const path = __FILE__ ; // modify as required
    const char bar_chart_character = '*' ; // modify as required
    const int bar_chart_width = 80 ; // modify as required

    // unsigned char: It is implementation-defined whether a char object can hold negative values - IS
    constexpr std::size_t NCHARS = std::numeric_limits<unsigned char>::max() + 1 ;
    int frequency[NCHARS] {} ;

    {
        std::ifstream file(path) ;
        char c ;
        while( file >> c ) // all characters excluding (white) space
        {
            /*
                We rely on the following guarantees provided by the IS:
                    A char, a signed char, and an unsigned char have the same object representation.
                    For narrow character types, all bits of the object representation participate in the value representation.
                    For unsigned narrow character types, each possible bit pattern of the value representation represents a distinct number.

                    For each value i of type unsigned char in the range 0 to 255 inclusive, there exists a value j of type char
                    such that the result of an integral conversion from i to char is j,
                    and the result of an integral conversion from j to unsigned char is i.
            */
            static_assert( NCHARS < 257, "unsupported narrow character type" ) ;
            const unsigned char u = c ; // a char object may hold negative values
            ++frequency[u] ;
        }
    }

    const int* iter_max_element = std::max_element( std::begin(frequency), std::end(frequency) ) ;
    const auto max_frequency = *iter_max_element ;
    std::cout << "the most common letters with " << max_frequency << " occurrences each are: " ;
    for( std::size_t i = 0 ; i < NCHARS ; ++i ) if( frequency[i] == max_frequency )
        std::cout << '\'' << char(i) << "' " ;
    std::cout << "\n\n" ;

    const auto bar_chart_unit = max_frequency / (bar_chart_width+1) + 1 ;
    if( bar_chart_unit > 1 ) std::cout << '\'' << bar_chart_character << "' == " << bar_chart_unit << " occurrences\n\n" ;

    for( std::size_t i = 0 ; i < NCHARS ; ++i )
    {
        const auto nchars = ( frequency[i] + bar_chart_unit/2 ) / bar_chart_unit ;
        if(nchars) std::cout << char(i) << "  " << std::string( nchars, bar_chart_character ) << '\n' ;
        // TODO: if nchars > 0 && std::isprint( char(i), std::cout.getloc() ) is false, print the escape sequence
    }
}

// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // 75
// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ // 80
// $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
// $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
// @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
// @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

http://coliru.stacked-crooked.com/a/3f439215a26afa94
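
The comment in the code above about a char holding negative values is worth a tiny illustration. This is only a sketch, and histogram_index is a hypothetical name: the point is that converting to unsigned char first always yields a non-negative array index.

```cpp
#include <cstddef>

// Hypothetical helper: maps any char to a valid, non-negative array index.
// Indexing with a plain char directly could produce a negative subscript
// on platforms where char is signed.
std::size_t histogram_index(char c)
{
    return static_cast<unsigned char>(c);
}
```

On a typical platform with 8-bit char, a byte such as 0xE9 stored in a char maps to index 233 rather than to a negative number.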
