After almost 2 years without touching any C or C++, now I need to do it again, finally :) even though I've forgotten a lot things.
I've created a program that calculates the frequencies of characters read from the standard input. This program should at the end output (from the most frequent to the least one) all characters followed by their respective frequencies.
My idea was to create an array first where the frequencies of the characters will be put.
After that I calculate how many different characters have been read (n). I create a struct (Char) which will contain both the character and its corresponding frequency, and then I allocate a buffer for n Char. Then I use qsort to sort the buffer of Char. And finally I print the characters followed by their frequencies.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
|
#include<stdio.h> // getchar, EOF
#include<stdlib.h> // qsort, calloc
#include<ctype.h> // isprint
typedef struct{
int freq;
char val;
} Char; // alias
int comp_char(const void* a, const void* b){
return (((Char*)b)->freq) - (((Char*)a)->freq);
}
void read_stdin(int* f){
int c;
while((c = getchar()) != EOF) ++f[c];
}
int main(){
int f[256] = {0};
read_stdin(f);
// Calculating the number of characters read from the standard input.
int n = 0;
for(short i=0; i<255; ++i)
if(f[i] != 0) ++n;
// Allocating a buffer of size n for Char objects.
Char* chars = calloc(n, sizeof(Char));
// Setting the val and freq fields of Chars
for(short i=0, j=0; i<255; ++i){
if(f[i] != 0) {
chars[j].freq = f[i];
chars[j++].val = (char)i;
}
}
// Sort Char elements by their freq field.
qsort(chars, n, sizeof(Char), comp_char);
// Print Char's val followed by the respective Char's freq.
for(short i=0; i<n; ++i){
if(isprint(chars[i].val))
printf("'%c' %i\n", chars[i].val, chars[i].freq);
else
printf("%3o %i\n", chars[i].val, chars[i].freq);
}
// Freeing memory allocated for chars.
free(chars);
return 0;
}
|
My first question is:
1. Is the program correct?
2. Are there parts where the risk of undefined behaviour is greater than 0?
3. Is it correct to cast i to a char using (char)i? In this case, I'm assuming that the characters in the file will only be ASCII ones. But what if they weren't? How can I change my program to deal with this cases?
4. In general, can I improve my program? If yes, where?
Note:
1. This is actually a C program
2. I would like that both the file can be passed to the program as ./name_of_this_file < file_to_count.txt or simply ./name_of_this_file and typing words, and after pressing control-d to reach EOF.
3. The program should be as efficient as possible.
Thanks a lot for all the help!