I do not need a whole program written for me. I would just like to know how to get started, especially with reading in the input files.
You will create a program that determines the language of a given input file based on the words in that file. The structure of the input files will be one word per line (all lowercase letters). You should read these words in one LETTER at a time.
Each input file will contain 100000 letters, and be in a different language.
The way you are able to tell the languages apart is as follows:
English: The top letter frequencies are e, i, and a respectively.
Danish: The top letter frequencies are e, r, and n respectively.
Italian: The letters j, x, y do not exist in Italian (other than proper nouns, which are not present in the input files).
Your program MUST include 5 functions in addition to your main() function:
A void function that will take an array of size 100000, where each element is a letter from the input file, and another array of size 26 (size of the alphabet), where each element is a number indicating the frequency of occurrence (how often the letter shows up in the first array). This function will be used to fill the second (occurrence) array. The occurrence should be a percentage. HINT: The following code will be helpful when trying to determine which array position in the second array to increment:
temp_char = a[i]; // array a is the array of size 100000, with all letters
b[temp_char - 'a']++; // b is the array of size 26
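As a starting point, the occurrence function described above could look roughly like the sketch below. The name countOccurrences and the extra count parameter (how many letters were actually read) are my own assumptions; the assignment only says the function takes the two arrays.

```cpp
const int ALPHABET = 26; // size of the occurrence array

// Fills freq[] with the percentage of each letter 'a'..'z' in letters[].
// `count` is an assumed extra parameter: the number of letters stored.
void countOccurrences(const char letters[], int count, double freq[]) {
    for (int i = 0; i < ALPHABET; i++) {
        freq[i] = 0.0;                       // start every counter at zero
    }
    if (count <= 0) {
        return;                              // nothing to count, avoid dividing by zero
    }
    for (int i = 0; i < count; i++) {
        char temp_char = letters[i];
        if (temp_char >= 'a' && temp_char <= 'z') {
            freq[temp_char - 'a']++;         // same indexing trick as the hint
        }
    }
    for (int i = 0; i < ALPHABET; i++) {
        freq[i] = freq[i] * 100.0 / count;   // convert raw counts to percentages
    }
}
```

Using double for the occurrence array is a choice made here so the percentages are not truncated to whole numbers.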
A void function that will initialize an array of characters of size 26 so that element 0 = ‘a’, and so on until element 25 = ‘z’. HINT: The following code will be helpful:
alpha[i] = (char)(i + 97); // where alpha is an array of type char and i is the element index
// this uses type-casting to a char; 97 is the ASCII decimal code for 'a'
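A minimal sketch of the alphabet function, built around the hint. The name initAlphabet is made up for illustration, and 'a' + i is used in place of i + 97, which gives the same result on ASCII systems.

```cpp
// Fills alpha[0..25] with 'a'..'z'.
void initAlphabet(char alpha[]) {
    for (int i = 0; i < 26; i++) {
        alpha[i] = static_cast<char>('a' + i); // 'a' is 97 in ASCII
    }
}
```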
A void function that sorts two arrays in parallel, both of size 26. One array will have the occurrence of letters, the other will be the char array of the alphabet. You want to sort in decreasing order so that the highest frequency is the first element (in one array) and its corresponding letter is also the first element (in the other array).
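One way to sort the two arrays in parallel is a selection sort that swaps the same positions in both arrays. This is only a sketch: the assignment does not name a sorting algorithm, and the function name sortParallel and the size parameter are assumptions.

```cpp
#include <utility> // std::swap

// Sorts freq[] into decreasing order and applies the same swaps to alpha[],
// so each letter stays lined up with its own frequency.
void sortParallel(double freq[], char alpha[], int size) {
    for (int i = 0; i < size - 1; i++) {
        int maxIdx = i;                      // position of the largest value seen so far
        for (int j = i + 1; j < size; j++) {
            if (freq[j] > freq[maxIdx]) {
                maxIdx = j;
            }
        }
        std::swap(freq[i], freq[maxIdx]);    // move the largest value to the front...
        std::swap(alpha[i], alpha[maxIdx]);  // ...and move its letter with it
    }
}
```

For the assignment you would call it with size 26.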
A value-returning function that returns the percentage of occurrence of a specific letter that is passed into the function. You will use this function to determine if the occurrence of j, x, y is zero. You may call it in other locations/situations if you wish.
(Programming Project 5, CMPSC 201 - Spring 2013)
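A possible shape for the value-returning function is sketched below. It searches the alphabet array for the letter instead of indexing with letter - 'a', so it keeps working after the arrays have been sorted. The name letterPercentage and the parameter list are assumptions, not part of the assignment.

```cpp
// Returns the percentage stored for `letter`, or 0.0 if the letter is not
// found in alpha[]. Works whether or not the arrays have been sorted.
double letterPercentage(char letter, const double freq[], const char alpha[], int size) {
    for (int i = 0; i < size; i++) {
        if (alpha[i] == letter) {
            return freq[i];
        }
    }
    return 0.0;
}
```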
A void function that takes in the occurrence array and the alphabet array and determines which language the file contains, based on the above assumptions of the languages. It will print out to standard output the language that was determined; if it cannot determine the input to be one of the above languages then it should just print out “the language cannot be determined”.
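The decision function could be sketched as follows, assuming the arrays have already been sorted in decreasing order. Two things here are my own assumptions: the Italian test (j, x, y all at zero) is checked before the top-three tests, since the assignment does not say which rule wins if more than one matches, and the logic is split into a hypothetical helper classifyLanguage so it can be tested without capturing output.

```cpp
#include <iostream>
#include <string>

const int ALPHABET = 26;

// Returns the percentage stored for `letter`, or 0.0 if it is not listed.
double letterPercentage(char letter, const double freq[], const char alpha[]) {
    for (int i = 0; i < ALPHABET; i++) {
        if (alpha[i] == letter) return freq[i];
    }
    return 0.0;
}

// Hypothetical helper: returns the language name instead of printing it.
// Expects freq[] and alpha[] to be sorted in decreasing order of frequency.
std::string classifyLanguage(const double freq[], const char alpha[]) {
    bool noJXY = letterPercentage('j', freq, alpha) == 0.0 &&
                 letterPercentage('x', freq, alpha) == 0.0 &&
                 letterPercentage('y', freq, alpha) == 0.0;
    if (noJXY) return "Italian";             // assumed to take priority
    if (alpha[0] == 'e' && alpha[1] == 'i' && alpha[2] == 'a') return "English";
    if (alpha[0] == 'e' && alpha[1] == 'r' && alpha[2] == 'n') return "Danish";
    return "the language cannot be determined";
}

// The void function the assignment asks for: prints the result.
void determineLanguage(const double freq[], const char alpha[]) {
    std::cout << classifyLanguage(freq, alpha) << std::endl;
}
```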
Your main function will open an input file and read the letters into an array. Call the functions above to first get the occurrence as a percent for each letter, and then determine the language used.
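For the file input part, here is a rough sketch of reading the file one letter at a time with an ifstream. The function name readLetters and the idea of returning the number of letters read are assumptions; the assignment only says main opens the file and fills the array. The >> operator skips whitespace when reading a char, so the newline after each word is not stored.

```cpp
#include <fstream>
#include <string>

// Reads up to maxLetters characters from the file into letters[].
// Returns how many letters were stored, or -1 if the file could not be opened.
int readLetters(const std::string& filename, char letters[], int maxLetters) {
    std::ifstream in(filename);   // pre-C++11 compilers need filename.c_str()
    if (!in) {
        return -1;                // file missing or unreadable
    }
    int count = 0;
    char ch;
    // operator>> skips spaces and newlines, so only the letters are kept
    while (count < maxLetters && in >> ch) {
        letters[count] = ch;
        count++;
    }
    return count;
}
```

In main you would declare the array of size 100000, call something like readLetters("english.txt", letters, 100000) (the file name is just a placeholder), and check that the result is not -1 before calling the other functions.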
Also, include the following for testing:
An if block based on a boolean flag
If the bool is set to true it will print out the occurrence of each letter (sorted) using 2 columns: letters and frequency
If the bool is set to false, this printing will not happen
You will be setting the bool (manually, for each run).
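The testing output could be sketched like this. The names frequencyTable and printIfEnabled are made up, and building the table as a string first is just a choice that keeps the printing step to a single if block on the bool.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Builds a two-column table (letter, frequency) from the sorted arrays.
std::string frequencyTable(const char alpha[], const double freq[], int size) {
    std::ostringstream table;
    table << "Letter\tFrequency (%)\n";
    for (int i = 0; i < size; i++) {
        table << alpha[i] << '\t' << freq[i] << '\n';
    }
    return table.str();
}

// Prints the table only when the flag is true.
void printIfEnabled(bool showTable, const char alpha[], const double freq[], int size) {
    if (showTable) {
        std::cout << frequencyTable(alpha, freq, size);
    }
}
```

In main, the flag would be an ordinary bool variable that you change by hand between runs and pass as the first argument.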