This is the first step: read the data.
|
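#include <iostream>
using namespace std;

const unsigned int MB = 1048577; // 1 MB plus 1 for the '#' / end of string

int main()
{
    static char big[MB]; // static: 1 MB may not fit on the default stack
    cin.getline(big, MB, '#'); // read everything up to the first '#'
    int fullsize = cin.gcount(); // also counts the '#' itself when one was read
    // echo the input back, stopping at the '\0' that getline wrote
    for (int i = 0; i < fullsize && big[i] != '\0'; i++)
        cout << big[i];
}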
|
From here you can slice it up on whitespace (spaces and ends of lines). You can replace each whitespace character with '\0' and keep a new array of pointers to the start of each word, which gives you a word-by-word list. Next you need to re-create strcmp so you can sort the data, and if you can't use sort() you need to write a sort routine that uses your compare. Once the list is sorted, THEN you can check for duplicates with a simple, dumb counter loop that asks 'is this one the same as the next one?'
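Here is a rough sketch of the splitting step, just to show the idea. The names is_space and split_words are made up for this example, and it assumes the buffer already has a '\0' at buf[len], which cin.getline gives you.

```cpp
#include <cstddef>

// Hypothetical helper: treats space, tab, CR and LF as whitespace.
bool is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Overwrites every whitespace character in buf[0..len) with '\0' and stores
// a pointer to the start of each word in words[].
// Returns the number of words stored (never more than max_words).
// Assumes buf[len] is already '\0', so the last word is terminated too.
std::size_t split_words(char* buf, std::size_t len,
                        char* words[], std::size_t max_words)
{
    std::size_t count = 0;
    bool in_word = false;
    for (std::size_t i = 0; i < len; i++) {
        if (is_space(buf[i])) {
            buf[i] = '\0';          // end the current word (if any)
            in_word = false;
        } else if (!in_word) {
            if (count == max_words)
                break;              // pointer array is full, stop here
            words[count++] = buf + i; // first letter of a new word
            in_word = true;
        }
    }
    return count;
}
```

You would call it with the buffer you read, the number of characters stored in it, and a pointer array big enough for the worst case. The text itself is never copied, only chopped up where it sits.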
So counting the words and the duplicates is the easy part. Splitting it up will be new to you but is not too hard, and sorting, if you have to write your own, is the hardest piece. You don't have to use pointers to split it; you can use array indexes if you prefer, it's the same end result.
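To make the compare, sort, and count steps concrete, here is one possible sketch, assuming you already have an array of pointers to null-terminated words. The names my_strcmp, sort_words and count_duplicates are made up for this example, and insertion sort is just the simplest routine to write by hand, not necessarily the one you should hand in.

```cpp
#include <cstddef>

// Hand-written stand-in for strcmp: returns a negative, zero or positive
// value when a sorts before, equal to, or after b.
int my_strcmp(const char* a, const char* b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

// Insertion sort on the pointer array. Only the pointers move,
// the text they point at stays where it is.
void sort_words(char* words[], std::size_t count)
{
    for (std::size_t i = 1; i < count; i++) {
        char* key = words[i];
        std::size_t j = i;
        while (j > 0 && my_strcmp(words[j - 1], key) > 0) {
            words[j] = words[j - 1]; // shift larger words one slot right
            j--;
        }
        words[j] = key;
    }
}

// Counts every word that repeats the word just before it.
// Only meaningful on a sorted array, where equal words sit side by side.
std::size_t count_duplicates(char* const words[], std::size_t count)
{
    std::size_t dups = 0;
    for (std::size_t i = 0; i + 1 < count; i++)
        if (my_strcmp(words[i], words[i + 1]) == 0)
            dups++;
    return dups;
}
```

Insertion sort is quadratic, so it will crawl if a really large file gets redirected in. If that matters, the same compare function drops into a merge sort or quicksort unchanged.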
If any of that does not make sense I can go into more detail later. But I won't write it all. Get some code down, try to do these things, post the code and ask a question if you get stuck. Doing it my way, you only need one string.h routine, strcmp(), which you can re-create easily and even improve so that it does not care about the case of the letters.
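For the "does not care about case" idea, a minimal sketch could look like this. The names to_lower_ascii and compare_no_case are made up for this example, and it assumes plain ASCII text.

```cpp
// Hypothetical helper: maps 'A'..'Z' to 'a'..'z' and leaves everything
// else alone. Assumes ASCII.
char to_lower_ascii(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

// Same contract as strcmp (negative, zero, positive), except that
// upper and lower case letters compare as equal.
int compare_no_case(const char* a, const char* b)
{
    while (*a != '\0' && to_lower_ascii(*a) == to_lower_ascii(*b)) {
        a++;
        b++;
    }
    return (unsigned char)to_lower_ascii(*a) - (unsigned char)to_lower_ascii(*b);
}
```

With a compare like this, "The" and "the" count as the same word, both when sorting and when counting duplicates.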
--?? requirement?? I added +1 to MB to account for the # symbol / end of string, just in case. I can't see someone typing a MB at the console, but your prof may redirect a large text file into your program to test it.
The for loop just echoes what was typed so you can play with it a bit.