This is a pretty significant homework assignment.
Yes, you need to keep a pairing of (word, number of occurrences). That's why you need two arrays.
I have just glanced at your code, but here are some obvious issues:
(1) main()
You have a number of functions to do individual things, each one reading the file to the end.
However, in your main function, you have a loop that tries to pass the same file, multiple times, to each function in succession. This won't work. Remember, after line 139, the file is at EOF. Lines 140 through 143 have no hope of working.
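To see why, here is a small sketch that uses an in-memory std::istringstream as a stand-in for your file. CountTokens() and SecondPassCount() are made-up names for illustration, not anything from your code. Once the first pass has read to the end, the stream stays in a failed state, so the second pass reads nothing.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-separated tokens until a read fails.
int CountTokens(std::istream& in)
{
    std::string token;
    int n = 0;
    while (in >> token)
        ++n;
    return n;
}

// Runs two passes over the same stream and returns what the second pass saw.
// The in-memory stream is an assumption standing in for the input file.
int SecondPassCount(const char* text)
{
    std::istringstream in(text);
    CountTokens(in);         // first pass consumes everything
    return CountTokens(in);  // second pass: the stream has already failed
}
```

SecondPassCount("one two three") returns 0, which is the situation your later function calls are in.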
(2) Looping on EOF
    // Bad! Don't do this!
    while (!file.eof())            // While not at EOF
    {
        file >> whatever;          // Try to read something
        do_something( whatever );  // Do something with it
    }
Do you see the problem? The read (file >> whatever) may fail, because you tried to read something and found EOF. Except you ignore that fact and go ahead to do_something(), which uses a stale or garbage whatever. That'll mess up your counts.
You should be reading the input file like this:
    char line[ 101 ];  // 100 characters in input line plus null terminator!
    while (inputFile.getline( line, 101 ))
    {
        // do something with 'line'
    }
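If you want to see the difference for yourself, here is a rough sketch that counts how many times each loop shape "processes" a word. Both function names are invented for this example, and it reads words with >> from any std::istream so you can try it on an in-memory string instead of a file.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// The loop shape warned against in (2): tests eof() before reading.
int CountWithEofLoop(std::istream& in)
{
    int processed = 0;
    std::string word;
    while (!in.eof())
    {
        in >> word;    // may fail at the end of the input...
        ++processed;   // ...but the stale 'word' still gets "processed"
    }
    return processed;
}

// The read itself is the loop condition, so a failed read ends the loop.
int CountWithReadInCondition(std::istream& in)
{
    int processed = 0;
    std::string word;
    while (in >> word)
        ++processed;
    return processed;
}
```

For the text "one two three\n" (note the trailing newline, which most text files have), the first function returns 4 and the second returns 3.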
Dang. I've got to go for a bit. I'll come back and edit this post to finish posting the help you need.
[edit] Mmm. Ice cream with the kids!
(3) Newline baloney
It is easy enough to ignore blank lines. The instant you try to strtok() anything out of it you'll get a NULL, so it doesn't cost you anything. So ignore all the stuff about newlines and impress your professor.
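A quick sketch of that point, assuming words are separated by spaces and tabs (your assignment may specify other delimiters). CountWordsInLine() is a hypothetical helper, not one of the functions you were asked to write.

```cpp
#include <cstring>

// Counts the words strtok() finds in one line.
// Assumption: space and tab are the only delimiters.
// Note that strtok() modifies the buffer it scans.
int CountWordsInLine(char* line)
{
    int n = 0;
    for (char* word = std::strtok(line, " \t");
         word != NULL;
         word = std::strtok(NULL, " \t"))
    {
        ++n;
    }
    return n;
}
```

For an empty line (or one containing only spaces) the very first strtok() call returns NULL, the loop body never runs, and the result is 0. No special case needed.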
(4) Functions
It is a good idea to break a problem into smaller problems, but what you have done is broken a big problem into a bunch of separate problems. The problem with that is that the subproblems are not necessarily disjoint.
Let's list the things you have been asked to do:
1 - count the total number of words (ambiguous!)
2 - count the number of unique words
3 - count the number of times each word appears
4 - count the number of lines
5 - find the longest word
6 - find the shortest word
The first item is actually ambiguous. Does your professor mean the total number of individual words found in the file? Or does he mean the total number of different words found in the file? You can ask him, or just give both answers in your output.
You have also been told that you need to keep
- A list of every word you find in the file
- A list that counts the number of times each word occurs
That helps a lot with the first few items.
1 (total number of individual words) - print the sum of all elements in counts list (but don't do that. There's a simpler way)
1 (total number of different words) - print the number of words in your list of words
2 - print the number of words in your list with a matching count of 1
3 - print each word in the list and its count
4 - no help
5 - find and print the longest word in the list
6 - find and print the shortest word in the list
So, it looks like your primary problem is to construct the list of words and the matching counts. Along the way you should also be counting the number of lines in the file and number of times strtok() gives you a word.
(Remember, if you add a word to the words list, you must also add a count of 1 to the counts list at the same index.)
So, here are the functions I suggest you need:
int FindWordInList( const char* word, const char words[][16], int size );
Returns the index in the list of the word if the word is in the list.
Returns -1 (or the current number of words in the list, your choice) if the word is not in the list.
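One way this might look, as a sketch only. It assumes an exact, case-sensitive match via strcmp() and uses the "return -1" option.

```cpp
#include <cstring>

// Linear search through the word list.
// Assumptions: comparison is case-sensitive, and -1 means "not found".
int FindWordInList(const char* word, const char words[][16], int size)
{
    for (int i = 0; i < size; ++i)
    {
        if (std::strcmp(word, words[i]) == 0)
            return i;
    }
    return -1;
}
```

Whether "The" and "the" should count as the same word is something your assignment has to answer. This sketch treats them as different.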
int CountNumberOfUniqueWords( const int counts[], int size );
Unique words have a count of 1. It doesn't matter here what the word actually is -- we only care if it exists with a count of 1.
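A minimal sketch of that idea. It only looks at the counts list, exactly as described.

```cpp
// Counts how many entries in counts[] are exactly 1,
// i.e. how many words appeared only once in the file.
int CountNumberOfUniqueWords(const int counts[], int size)
{
    int unique = 0;
    for (int i = 0; i < size; ++i)
    {
        if (counts[i] == 1)
            ++unique;
    }
    return unique;
}
```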
void PrintWordsAndCounts( const char words[][16], const int counts[], int size );
Does what it says.
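A possible sketch. The output format (word, space, count, one pair per line) is my guess, and the extra defaulted std::ostream parameter is an addition of mine so the function can be tried against a string stream. Calling it with three arguments still prints to cout.

```cpp
#include <iostream>
#include <sstream>   // std::ostringstream, handy for checking the output

// Prints each word next to its count, one pair per line.
// Assumptions: the "word count" layout and the optional 'out' parameter.
void PrintWordsAndCounts(const char words[][16], const int counts[], int size,
                         std::ostream& out = std::cout)
{
    for (int i = 0; i < size; ++i)
        out << words[i] << ' ' << counts[i] << '\n';
}
```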
int FindLongestWord( const char words[][16], int size );
Returns the index of the longest word.
int FindShortestWord( const char words[][16], int size );
Returns the index of the shortest word.
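Both of these are the same loop with the comparison flipped. A sketch, with two choices of my own that the assignment may or may not want: ties go to the word that appears first in the list, and an empty list returns -1.

```cpp
#include <cstring>

// Returns the index of the longest word, or -1 if the list is empty.
// Assumption: on a tie, the earliest word in the list wins.
int FindLongestWord(const char words[][16], int size)
{
    if (size <= 0)
        return -1;
    int best = 0;
    for (int i = 1; i < size; ++i)
    {
        if (std::strlen(words[i]) > std::strlen(words[best]))
            best = i;
    }
    return best;
}

// Returns the index of the shortest word, or -1 if the list is empty.
// Same tie rule as above.
int FindShortestWord(const char words[][16], int size)
{
    if (size <= 0)
        return -1;
    int best = 0;
    for (int i = 1; i < size; ++i)
    {
        if (std::strlen(words[i]) < std::strlen(words[best]))
            best = i;
    }
    return best;
}
```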
(5) main(), redux
Now, your main function should be doing this:
    ifstream inputFile( ... );
    if (!inputFile) ...;

    number of lines = 0
    number of words in list = 0
    number of words in file = 0

    while (inputFile.getline( s, 101 ))
    {
        increment number of lines

        start strtok()ing
        while (strtok()'s result is not NULL)
        {
            increment number of words in file

            if word is found in list:
                update counts[]
            otherwise:
                append word to words[]
                append 1 to counts[]
                increment number of words in list

            print stuff you have been asked to print here
        }
    }

    PrintWordsAndCounts(...);

    print statistics:
        print number of words in file
        print number of words in words list
        print CountNumberOfUniqueWords(...)
        print longest word
        print shortest word
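To make the single-pass loop concrete, here is a rough sketch of just the list-building part, written against any std::istream so it can be tried without a file. Everything named here beyond FindWordInList() is an assumption of mine: the Totals struct, the BuildWordList() name, the MAX_WORDS capacity of 1000, the space/tab delimiters, and the choice to cut words down to 15 characters so they fit a words[][16] slot.

```cpp
#include <cstring>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file

const int MAX_WORDS = 1000;   // assumed capacity of the word list
const int MAX_WORD_LEN = 16;  // matches the words[][16] parameter type
const int MAX_LINE = 101;     // 100 characters plus the null terminator

struct Totals  // hypothetical bundle of the three running counters
{
    int lines;
    int wordsInFile;
    int wordsInList;
};

int FindWordInList(const char* word, const char words[][MAX_WORD_LEN], int size)
{
    for (int i = 0; i < size; ++i)
        if (std::strcmp(word, words[i]) == 0)
            return i;
    return -1;
}

// One pass over the input: counts lines and words, and fills the
// parallel arrays words[] and counts[] (same index, same word).
Totals BuildWordList(std::istream& in, char words[][MAX_WORD_LEN], int counts[])
{
    Totals t = { 0, 0, 0 };
    char line[MAX_LINE];

    // Note: a line longer than 100 characters makes getline() fail,
    // which ends this loop early.
    while (in.getline(line, MAX_LINE))
    {
        ++t.lines;
        for (char* token = std::strtok(line, " \t");
             token != NULL;
             token = std::strtok(NULL, " \t"))
        {
            ++t.wordsInFile;

            // Truncate first so lookup and storage agree on the key.
            char key[MAX_WORD_LEN];
            std::strncpy(key, token, MAX_WORD_LEN - 1);
            key[MAX_WORD_LEN - 1] = '\0';

            int index = FindWordInList(key, words, t.wordsInList);
            if (index >= 0)
            {
                ++counts[index];
            }
            else if (t.wordsInList < MAX_WORDS)
            {
                std::strcpy(words[t.wordsInList], key);
                counts[t.wordsInList] = 1;
                ++t.wordsInList;
            }
        }
    }
    return t;
}
```

In your program you would pass inputFile, your two arrays, and then print the statistics from the returned totals and the helper functions. For the text "the cat\n\nthe dog sat\n" this gives 3 lines, 5 words in the file, and 4 words in the list, with "the" counted twice.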
(6) Misc
You don't need line 9. The getline() function you should be using is a member function of the ifstream class.
You don't need lines 11 through 15. Prototypes tell other functions what your function looks like before your function is defined. There is no point in both prototyping and defining your function before it is used.
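    #include <iostream>
    using namespace std;

    int foo();  // prototype: superfluous, the definition follows immediately
    int foo() { cout << "fooey!\n"; return -7; }

    int main()
    {
        cout << foo() << endl;
    }
See how the prototype (int foo();) is totally superfluous?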
Well, that's all you're getting out of me for now. Hope this helps.