Letter Frequency

This program analyzes a text file named by the user and reports the frequency of each letter in the file. The program is not case-sensitive: the uppercase and lowercase forms of a letter are counted together. A sample run could look similar to the following.

Greetings!  My name is Brent and I will be analyzing a text file for you.      
Please enter the name of the file:  dictionary.txt
Here is a report on the frequency of each alphabetical character in the file "dictionary.txt".
Letter  Frequency
A          55    
B          31
...
Z           2 
Thank you for using this program.  Goodbye.


Here is my code so far:

#include <fstream>
#include <iostream>
 
int main()
{
	std::ifstream input("filename.txt", std::ios_base::binary);
	if (!input)
	{
		std::cerr << "error: can't open file\n";
		return -1;
	}
 
	size_t count[256];
	std::fill_n(count, 256, 0);
 
	for (char c; input.get(c); ++count[uint8_t(c)]) // process input file
		; // empty loop body
 
	for (size_t i = 0; i < 256; ++i)
	{
		if (count[i] && isgraph(i)) // non-zero counts of printable characters
		{
			std::cout << char(i) << " = " << count[i] << '\n';
		}
	}
}
First, please use code tags (and logical indentation) when posting code. See https://www.cplusplus.com/articles/jEywvCM9/

Second: Is there a question?
I wanted to know if this code is correct.
No, it does not appear to be correct. But part of computer science is learning how to form a hypothesis and run an experiment to test that hypothesis. For example, given a file that contains "A,a" (uppercase A, comma, lowercase A), what do you want the output of your program to be?

Based on your description, it should print
A 2

(It's not clear if it should print 0 for all the other letters.)

isgraph is found in the <cctype> header, so it should be #include'd as well. Perhaps you should be using isalpha instead.

Other things:
- Your sample text expects the user to enter a filename, but your program does not do this.
- Your sample text counts letters A-Z; your program appears to track every printable character, and is case sensitive.
- To avoid case-sensitivity, you could convert every character to uppercase with toupper.
I have another program that finds a word in a file, but I don't know how to convert it to count how many of each letter there are in the file.

#include <bits/stdc++.h>
using namespace std;

ifstream inFile;

string upCase(string strIn){
	string strOut;
	for(int s = 0; s < strIn.length(); s++){
		strOut += toupper(strIn[s]);
	}
	return strOut;
}

int main(){
	
	int wordCount;
	string word;
	string target;
	
	inFile.open("dictionary.txt");
	if(!inFile){
		cout << "Error: Unable to open file.\n";
		exit(0);
	}
	
	inFile >> wordCount;
	
	string strArray [wordCount];
	for(int w=0;w < wordCount; w++){
		inFile >> word;
		word = upCase(word);
		strArray[w] = word;
	}
	
	cout << "Enter a word: ";
	cin >> target;
	target = upCase(target);
	cout << "word as uppercase: " << target << endl;
	bool wordFound = false;
	for(int p = 0; p < wordCount; p++){
		if(target == strArray[p]){
			cout << "Word found at " << p << endl;
			wordFound = true;
			break;
		}
	}
	
	if(!wordFound){
		cout << "Word not found.\n";
	}
	
	return 0;
}
The idea you had of storing the frequencies in an array is fine; you just need to ignore characters that aren't A-Z and convert every letter to uppercase.

For an array, you want 'a' and 'A' to convert to 0, 'b' and 'B' --> 1, ... 'z' and 'Z' --> 25.
Assuming ASCII, you can add or subtract 'A' to achieve this.
('A' - 'A' == 0, 'B' - 'A' == 1, ..., 'Z' - 'A' == 25)

// Example program
#include <iostream>
#include <string>
#include <cstddef>
#include <cctype>

// https://en.cppreference.com/w/cpp/string/byte/toupper
char my_toupper(char ch)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
}

int main()
{
    const int NumLetters = 26;
    size_t frequencies[NumLetters] {}; // 0-init'd

    std::string phrase = "The quick brown fox jumps over the lazy dog";
    
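    for (char ch : phrase)
    {
        // Skip spaces and anything else that is not a letter; otherwise the
        // index below would fall outside the 26-element array.
        if (!std::isalpha(static_cast<unsigned char>(ch)))
            continue;

        int index = my_toupper(ch) - 'A';
        frequencies[index]++;
    }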

    for (int i = 0; i < NumLetters; i++)
    {
        std::cout << (char)('A' + i) << " : " << frequencies[i] << '\n';
    }
}
How would you get this to read a file and output the letters?
Also, is there another way of doing the toupper?
Consider:

#include <iostream>
#include <fstream>
#include <cctype>

char my_toupper(char ch)
{
	return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
}

int main()
{
	const int NumLetters {26};
	size_t frequencies[NumLetters] {};
	std::ifstream ifs("test.txt");

	if (!ifs)
		return (std::cout << "Cannot open file\n"), 1;

	for (char ch {}; ifs.get(ch); )
		if (std::isalpha(static_cast<unsigned char>(ch))) // cast avoids undefined behaviour for negative char values
			++frequencies[my_toupper(ch) - 'A'];

	for (int i = 0; i < NumLetters; ++i)
		std::cout << (char)('A' + i) << " : " << frequencies[i] << '\n';
}
