Frequency in array of strings

I have read a bunch of words from a file into an array of strings, and now I am trying to determine the frequency of each word in the array.

For example, if the input file was:
 
hello bye bye hello blue hello


I now have an array of strings that is:
A[0] = hello
A[1] = bye
A[2] = bye
A[3] = hello
A[4] = blue
A[5] = hello


What I am trying to do, basically, is to achieve this output:
Frequency of words
#1. "hello" with 3 occurrences
#2. "bye" with 2 occurrences
#3. "blue" with 1 occurrences 


I'm not asking for the code to do this. I basically want to know if it is even possible, or if I stored the words in a bad way in order to accomplish this task.

Any help is appreciated.
Hint: map<string, int>
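A minimal sketch of what the hint points at, assuming container classes are allowed. The function name `countWords` is made up for illustration; the word is the map key and its count is the value.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: count how often each word occurs in an array.
std::map<std::string, int> countWords(const std::string words[], std::size_t n)
{
    std::map<std::string, int> freq;
    for (std::size_t i = 0; i < n; ++i)
        ++freq[words[i]];   // operator[] inserts a missing word with count 0 first
    return freq;
}
```

Iterating over the returned map visits the words in alphabetical order, not in order of frequency, so listing the most frequent words would still need a separate step.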
I have this:

#include <iostream>
#include <string>

using namespace std;

int main()
{
  string words[5] = { "hello", "bye", "hello", "blue", "hello" };
  int num[5] = { 0, 0, 0, 0, 0 };

  // For each word, count how many entries in the array match it.
  // Duplicates are counted again at every position where they appear.
  for(int j = 0; j < 5; j++)
  {
    for(int i = 0; i < 5; i++)
      if(words[j] == words[i])
        num[j]++;
  }

  // Print every entry with its count, duplicates included.
  for(int i = 0; i < 5; i++)
    cout << words[i] << " " << num[i] << endl;

  return 0;
}


and it produces
hello 3
bye 1
hello 3
blue 1
hello 3


Which is something along the lines of what I am looking for... but not quite, as I need to give the 3 words that occurred the most, and how many times they occurred.
Pan, I didn't mention this, but I am not allowed to use container classes for this project.
What is it you are supposed to learn from this project?
To get experience with text processing techniques. Like I said though, I'm not looking for the code, just a little assistance.
One thing you can do is to break out of the counting loop as soon as you have found the word and counted it. Then just output all words and counts where the count is > 0.
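One possible reading of this suggestion, as a rough sketch: credit every occurrence to the first position where the word appears, and stop searching as soon as it has been counted. The name `countAtFirstOccurrence` is hypothetical, and the caller is assumed to supply a `counts` array of the same length as `words`.

```cpp
#include <cstddef>
#include <string>

// For each word, find its first occurrence and bump the count stored there.
// Later duplicates keep a count of 0, so they can be skipped on output.
void countAtFirstOccurrence(const std::string words[], int counts[], std::size_t n)
{
    for (std::size_t j = 0; j < n; ++j)
        counts[j] = 0;

    for (std::size_t j = 0; j < n; ++j)
    {
        for (std::size_t i = 0; i <= j; ++i)
        {
            if (words[i] == words[j])
            {
                ++counts[i];
                break;      // counted once, stop looking
            }
        }
    }
}
```

For the five-word array posted earlier this leaves the counts 3, 1, 0, 1, 0, so printing only the entries with a count above 0 lists each word once.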
Can you use pointers and new? You could make a dynamic array that adds a new element to its end every time it encounters a new word.

Or you could make your own simple container class that holds a string and an int, then add a new instance every time you meet a word that is not already stored in one.
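A rough sketch of the dynamic-array idea using only pointers and new[]. The helper name `appendWord` is invented for illustration. It assumes the caller replaces its pointer with the returned one and eventually calls delete[] on it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a new array one element longer than `old`,
// with `word` appended. The old array is deleted here, so the caller must
// keep only the returned pointer and delete[] it when done.
std::string* appendWord(std::string* old, std::size_t& size, const std::string& word)
{
    std::string* grown = new std::string[size + 1];
    for (std::size_t i = 0; i < size; ++i)
        grown[i] = old[i];          // copy the existing words
    grown[size] = word;             // place the new word at the end
    delete[] old;                   // deleting a null pointer is a no-op
    ++size;
    return grown;
}
```

Starting from a null pointer and a size of 0, each call grows the array by one. Growing by one element at a time copies the whole array on every new word, which is acceptable for a small exercise but slow for large inputs.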
Normally I wouldn't type up this code, but to see if it was a workable example with pointers and dynamic new arrays, I wrote some test code.

It's messier than what I'd normally make.
#snip
Thanks for doing that wolfgang, however I got it working before seeing this post (of course that would happen). I had basically done the same thing that you have done here: separating it into two separate arrays, checking whether the word was already stored, and incrementing a counter upon new/repeat words.

But thanks for your time. Yours is a little better than mine too lol.

Edit: spelling
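The actual code from this exchange was snipped, so the following is only a guess at the general shape of a two-array approach, with a simple way to pull out the most frequent words added on top. The names `tally` and `moveTopToFront` are hypothetical, and both output arrays are assumed to have room for n entries.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Fill `unique` and `counts` from `words`; returns the number of distinct words.
std::size_t tally(const std::string words[], std::size_t n,
                  std::string unique[], int counts[])
{
    std::size_t distinct = 0;
    for (std::size_t j = 0; j < n; ++j)
    {
        std::size_t k = 0;
        while (k < distinct && unique[k] != words[j])
            ++k;                        // look for the word among those seen so far
        if (k == distinct)              // not found: store it as a new word
        {
            unique[distinct] = words[j];
            counts[distinct] = 0;
            ++distinct;
        }
        ++counts[k];
    }
    return distinct;
}

// Move the `top` most frequent words to the front (partial selection sort).
void moveTopToFront(std::string unique[], int counts[],
                    std::size_t distinct, std::size_t top)
{
    for (std::size_t i = 0; i < top && i < distinct; ++i)
    {
        std::size_t best = i;
        for (std::size_t k = i + 1; k < distinct; ++k)
            if (counts[k] > counts[best])
                best = k;
        std::swap(unique[i], unique[best]);
        std::swap(counts[i], counts[best]);
    }
}
```

Calling `tally` and then `moveTopToFront(unique, counts, distinct, 3)` leaves the three most frequent words in the first three slots, ready to print.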

That's a ton of ownership passing you're doing there, wolfgang.

=x
Disch wrote:
That's a ton of ownership passing you're doing there, wolfgang.

=x


Hmm? You mean giving out the code? I will probably cut it out. I'm not too bothered as I was just making sure I could get it to not explode with pointers (especially if the OP wanted to use said method then asked me how to get it done. I wouldn't know how to answer otherwise.)

Code being snipped now.
I think he meant stuff like "return (some pointer pointing to new'd data)".
Everything I did was passed by reference. I passed in an int* by reference and simply gave it the address of a new allocation after deleting the old one. It probably isn't the best way and comes with a gratuitous amount of overhead in the long run.
Yeah I didn't have a problem with the fact that you posted code.

Like firedraco suggested, I meant you're handing off responsibility for a dynamically allocated buffer to a portion of the program that didn't allocate it. That's what I meant by "passing ownership". Doing that makes code really hard to manage and maintain, and leads to memory leaks that are next to impossible to track down.
Ahh I see. Can you offer some ideas on how to make the allocations correctly?
Avoiding dynamic allocation is the best way. :-)
What PanGalactic said.

In places where it's necessary, objectify it. Dynamically allocated memory should have a clear owner -- and that owner should be responsible for all cleanup.
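As a sketch of what "objectify it" could look like, here is a hypothetical `WordList` class (not code from this thread) that allocates, grows, and frees its own array, so no other part of the program ever has to call delete[]. It assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class that is the single owner of a growing array of words.
class WordList
{
public:
    WordList() : data_(nullptr), size_(0) {}
    ~WordList() { delete[] data_; }               // the owner cleans up

    WordList(const WordList&) = delete;            // no copies, so exactly one owner
    WordList& operator=(const WordList&) = delete;

    void add(const std::string& word)
    {
        // Sketch only: if a copy below throws, `grown` leaks.
        std::string* grown = new std::string[size_ + 1];
        for (std::size_t i = 0; i < size_; ++i)
            grown[i] = data_[i];
        grown[size_] = word;
        delete[] data_;
        data_ = grown;
        ++size_;
    }

    std::size_t size() const { return size_; }
    const std::string& operator[](std::size_t i) const { return data_[i]; }

private:
    std::string* data_;
    std::size_t size_;
};
```

Callers only use `add`, `size`, and `operator[]`; the raw pointer never leaves the class, so the memory is released when the object goes out of scope.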

This is particularly easy with STL, as it offers such containers already (like std::vector).
So mainly you recommend that if I need to use dynamic memory, it should be handled inside objects. Okay. That's nice and simple enough. I was going to write a class for the code I made, but as I said, when I made it I was making it quick and dirty. (REALLY DIRTY). I was more interested in the actual working of the idea, not the means.

Whenever I make something for my college course, I always deal with objects.
I prefer malloc to new[]. Why? Because of realloc. When making a dynamic array, new[] forces you to copy the array every time you resize it. That's crazy. Until they add renew[] or whatever, I'll (and I believe you should) only use new, never new[].
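For illustration, a small sketch of growing a buffer with realloc. The helper name `growInts` is made up, and `newCount` is assumed to be greater than 0. Note that realloc moves raw bytes, so this is only suitable for trivially copyable types such as int; it must not be used for an array of std::string.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resize a malloc'd int buffer to hold `newCount` elements.
// Returns true on success; on failure the original buffer is left untouched.
// The caller still owns the buffer and must release it with std::free.
bool growInts(int*& buffer, std::size_t newCount)
{
    void* p = std::realloc(buffer, newCount * sizeof(int));
    if (p == nullptr)
        return false;
    buffer = static_cast<int*>(p);
    return true;
}
```

Passing a null pointer as `buffer` makes realloc behave like malloc, so the same helper covers the first allocation. realloc may still move and copy the data when it cannot extend the block in place.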