Word counting (multiple word occurrence)

Hi folks,

I'm trying to write a function that takes an input string entered by the user and counts the occurrences of each word. For example, if the input string is "first this is a test, yes it is a test last.", the output should read:
The word 'first' occurs 1 time.
The word 'this' occurs 1 time.
The word 'is' occurs 2 times.
...
The word 'test' occurs 2 times.
And so on.
I've managed to get it to count the words, but it counts words that occur more than once as different words. I figured the way around this was to create a vector<string> containing every word, and another vector<string> to which I would add each word only once. Then I'd run through both, check whether a word is already present in the one-instance vector<string>, and if so, not add it again. This is where I am having the problem.
for (int i = 0; i < WScount + 1; i++)
{
    for (int j = 1; j < WScount + 1; j++)
    {
        if (allword.at(i) == oneword.at(j))
        {
            // don't put in
            break;
        }
        else
        {
            oneword.push_back(allword.at(i));
            break;
        }
    }
}

This is my current attempt, but it either crashes the program, or the vector<string> that should contain only one of each word just ends up containing all the words from the original vector<string>.
WScount is the whitespace count of the string, so WScount + 1 is the number of words.


Any help would be appreciated. Or, if I'm going in the wrong direction to solve this problem, please point me in the right one.
Many thanks.
Consider using a std::map where the keys are the words and the values are the number of times
each word occurs.
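
A minimal sketch of that suggestion, assuming words are split on whitespace and punctuation such as the comma in "test," is simply dropped. The helper names clean and count_words are made up for illustration and are not from the original post:

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: keeps only letters and digits and lower-cases them,
// so "test," and "Test" both become "test".
std::string clean(const std::string& token) {
    std::string out;
    for (unsigned char c : token) {
        if (std::isalnum(c)) {
            out += static_cast<char>(std::tolower(c));
        }
    }
    return out;
}

// Hypothetical helper: maps each word to the number of times it occurs.
std::map<std::string, int> count_words(const std::string& text) {
    std::map<std::string, int> counts;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {            // operator>> splits on whitespace
        const std::string word = clean(token);
        if (!word.empty()) {
            ++counts[word];          // a missing key starts at 0
        }
    }
    return counts;
}
```

Note that iterating over a std::map visits the keys in sorted order, not in the order the words first appeared in the input.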
Thanks for the reply.

I don't really know much about std::map. I could look it up, of course, but would I not still need some way to determine whether a particular word is already present before putting it in the map?
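
On the "is it already present" question, here is a small sketch of two ways to add a word to a std::map<std::string, int>. The function names are hypothetical. The first does the lookup by hand with find(); the second relies on operator[], which inserts a value-initialized int (that is, 0) when the key is missing, so no separate check is needed:

```cpp
#include <map>
#include <string>

// Explicit version: look the word up first, then insert or increment.
void add_word_explicit(std::map<std::string, int>& counts, const std::string& word) {
    auto it = counts.find(word);
    if (it == counts.end()) {
        counts.emplace(word, 1);     // first time we see this word
    } else {
        ++it->second;                // seen before: bump its count
    }
}

// Short version: operator[] creates the entry with value 0 if it is missing,
// then the increment makes it 1.
void add_word_short(std::map<std::string, int>& counts, const std::string& word) {
    ++counts[word];
}
```

Both functions should leave the map in the same state for the same sequence of words; the short one is just less typing.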
Try doing this:
Make a function that accepts a pointer to the first letter of the string (in your case, a pointer to 'f').
Then take out word by word, put each word in a temp string, and compare it to the strings you already have in your vector<string>; if it is not there, push_back it. Then use a loop to send every string from the vector to a counter function, which again gets the input string and counts how many times that word occurs in it. That function can also print the data.

I hope you understand what I am saying, and that it helps you.
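
A rough sketch of that idea, with one swap: it uses std::istringstream to pull out the words instead of walking a raw pointer through the string. The function names are made up for illustration, and punctuation is not handled here, so "test," and "test" would count as different words:

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits the input on whitespace.
std::vector<std::string> split_words(const std::string& text) {
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}

// Hypothetical helper: keeps the first occurrence of each word, in order.
std::vector<std::string> unique_words(const std::vector<std::string>& all) {
    std::vector<std::string> once;
    for (const std::string& word : all) {
        if (std::find(once.begin(), once.end(), word) == once.end()) {
            once.push_back(word);    // not stored yet, so add it
        }
    }
    return once;
}

// The "counter function": how many times does word occur in all?
int count_occurrences(const std::vector<std::string>& all, const std::string& word) {
    return static_cast<int>(std::count(all.begin(), all.end(), word));
}
```

Calling count_occurrences once for each entry of unique_words rescans the whole word list every time, which is fine for a line of user input but slow for large texts.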

I would recommend using two vectors.

Via a for loop, check each word (a word is a piece of text surrounded by whitespace).

If a word is encountered that hasn't been stored in vector 1, add it to vector 1 using push_back, and then add a counter to vector 2 that will indicate the number of times the word in the same position in vector 1 has been found.

If a word that is already in vector 1 has been found, increment the respective integer in vector 2 by one.

When you get to the end, read out both vectors using a for loop.

-Albatross
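
The two-vector approach above could be sketched roughly like this. The struct WordTally and the functions tally and format_line are hypothetical names, and the sketch assumes the input has already been split into words:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Two parallel vectors: counts[i] belongs to words[i].
struct WordTally {
    std::vector<std::string> words;  // "vector 1": each word stored once
    std::vector<int> counts;         // "vector 2": how often it was found
};

// Records one occurrence of word.
void tally(WordTally& t, const std::string& word) {
    for (std::size_t i = 0; i < t.words.size(); ++i) {
        if (t.words[i] == word) {
            ++t.counts[i];           // already stored: increment its counter
            return;
        }
    }
    t.words.push_back(word);         // not stored yet: add word and counter
    t.counts.push_back(1);
}

// Builds one output line in the format from the first post.
std::string format_line(const std::string& word, int n) {
    return "The word '" + word + "' occurs " + std::to_string(n) +
           (n == 1 ? " time." : " times.");
}
```

Because new words are only ever appended, reading both vectors from index 0 prints the words in the order they first appeared.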
I think I understand what you're saying, and I think I've already tried something similar, but it did not work. This is what I tried:
// This is where the call is made.
if (IsnotPresentinVector(eachword[i], oneword))
{
    oneword.push_back(eachword[i]);
}

// This is the testing function. It accepts str, one word taken from the
// vector<string> eachword, and the vector<string> oneword, which is meant
// to hold only one instance of each word.
bool IsnotPresentinVector(string str, vector<string> &oneword)
{
    for (int i = 0; i < oneword.capacity(); i++)
    {
        if (str == oneword.at(i))
        {
            return false; // Word found
        }

        return true; // Word not found
    }
}


This did not work; the vector<string> oneword got filled with all the words from eachword anyway.
Your return true; is inside the for loop, after your if statement. Oops.

-Albatross
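
For reference, a sketch of the function with return true; moved after the loop. Two further changes here are assumptions on my part and were not part of the comment above: the loop runs to size() rather than capacity(), and both parameters are taken by const reference:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true only if str is not found anywhere in oneword.
bool IsnotPresentinVector(const std::string& str,
                          const std::vector<std::string>& oneword) {
    for (std::size_t i = 0; i < oneword.size(); ++i) {
        if (str == oneword[i]) {
            return false;            // word found
        }
    }
    return true;                     // checked every element: word not found
}
```

With the original placement, the function returned during the first pass through the loop, so only element 0 was ever compared.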
Yeah, that's what I've been doing. The problem is the part about adding a word to vector 1 with push_back only if it hasn't been stored there yet, which I can't seem to get to work.
Ah, indeed, stupid me; fixed that now. However, the fixed function and the code that calls it are now causing a runtime error... eek.
Runtime errors are much more annoying... By the way, where is your second vector?

And what is the runtime error?

-Albatross

Em, there are two vectors: vector<string> eachword and vector<string> oneword.
If you mean in the function IsnotPresentinVector(), then the string str is each individual word taken from the vector<string> eachword.

As for the error, I don't know; all I'm getting is "runtime error, application terminated". Is there a way to find out more?
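
One way to get more detail, sketched under the assumption that the failure comes from vector::at(): at() throws std::out_of_range for a bad index, and catching that exception lets the program report it instead of just terminating. The helper name print_checked is made up for illustration:

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: prints words[0] .. words[limit - 1] and reports the
// first out-of-range access instead of letting the program terminate.
// Returns true if every access succeeded.
bool print_checked(const std::vector<std::string>& words, std::size_t limit) {
    try {
        for (std::size_t i = 0; i < limit; ++i) {
            std::cout << words.at(i) << '\n';   // at() checks the index
        }
    } catch (const std::out_of_range& e) {
        // The text returned by what() depends on the standard library.
        std::cerr << "out_of_range: " << e.what() << '\n';
        return false;
    }
    return true;
}
```

Running the program under a debugger would also show the line where the exception was thrown.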
OK, for some reason that God only knows, I was passing oneword in by reference. I guess that can happen when you've tried a million and one ways to fix a problem, lol. Anyway, the function appears to work now. However, I'm still getting a runtime error; I'm assuming that somewhere I'm trying to access an element of the vector that doesn't exist?
Hmm, further on, after the function call, I have this, just for testing purposes:
for (int i = 0; i < oneword.capacity(); i++)
{
    cout << oneword.at(i);
}

It would appear this is causing the runtime error. Any ideas? It works as intended and outputs oneword, but then causes the error.
Never mind, it should be oneword.size(); now it works. I just have to figure out a way to get the counts in the right order and that should be it. Thanks a bunch.
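
A short sketch of why size() is the right bound: size() is the number of elements actually stored, while capacity() is how many the vector has room for before it must reallocate, and it can be larger. The function names below are hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins the stored words with spaces, using size() as the loop bound.
std::string join_words(const std::vector<std::string>& words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) {
            out += ' ';
        }
        out += words.at(i);          // always a valid index here
    }
    return out;
}

// Hypothetical demo: reserve() raises capacity() without adding elements,
// so indices from size() up to capacity() - 1 do not refer to real elements.
bool capacity_can_exceed_size() {
    std::vector<std::string> words;
    words.reserve(8);
    words.push_back("test");
    return words.size() == 1 && words.capacity() >= 8;
}
```

A range-based for loop (for (const std::string& w : words)) would avoid the index bound altogether.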