How can I count words in a BinarySearchTree

Hello
I have a Binary Search Tree that reads words from a text file. I want to make a method that counts every same words and prints something like this:
word: 1
anotherword : 5
etc.

I hope someone can help me
Thanks

BinarySearchTree.h
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#ifndef BINARYSEARCHTREE_H
#define BINARYSEARCHTREE_H

#include <string>
#include <stdexcept>
using namespace std;

#include "AbstractWordCounter.h"

class BinarySearchTree: public AbstractWordCounter {
private:
	struct tree_node {
		tree_node* left; // left subtree has smaller elements
		tree_node* right; // right subtree has larger elements
		string data;
	};
	tree_node* root;
	void print_inorder(tree_node*);
public:
	BinarySearchTree() {
		root = NULL;
	}
	bool isEmpty() const {
		return root == NULL;
	}
	virtual void insert(string);
	virtual void print();
};
#endif 


BinarySearchTree.cpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include <iostream>
#include <cstdlib>
#include <sstream>
#include <string>

using namespace std;

#include "BinarySearchTree.h"

// Smaller elements go left
// larger elements go right
void BinarySearchTree::insert(string d) {
	tree_node* t = new tree_node;
	tree_node* parent;
	t->data = d;
	t->left = NULL;
	t->right = NULL;
	parent = NULL;
	// is this a new tree?
	if (isEmpty())
		root = t;
	else {
		//Note: ALL insertions are as leaf nodes
		tree_node* curr;
		curr = root;
		// Find the Node's parent
		while (curr) {
			parent = curr;
			if (t->data > curr->data)
				curr = curr->right;
			else
				curr = curr->left;
		}
		if (t->data < parent->data)
			parent->left = t;
		else
			parent->right = t;
	}
}

void BinarySearchTree::print() {
	print_inorder(root);
}

void BinarySearchTree::print_inorder(tree_node* p) {
	if (p != NULL) {
			print_inorder(p->left); //print left subtree
			cout << p->data << ": " << endl; // print this node
			print_inorder(p->right);  // print right subtree
	}
	else
		return;
}


Last edited on
(i) find the first node with <word>, set counter to 1
(ii) traverse tree until you the data is no longer <word>, incrementing the counter with every node
(iii) counter-1 is then the result
Since you have a node-oriented tree ordered inorder, you will have to traverse the tree node-by-node, inorder.

The canonical way (i.e., how the STL does it) would be to write an iterator and a find_first/find_last method, and then calling std::distance(find_first(word), find_last(word)). But that's perhaps a bit too much overhead (the underlaying idea is the same: you visit each element between the first and last occurrence of <word>, incrementing a counter with each visited element).
Could you please write some of the codes?

I have tried a lot but I have not succeeded
An easier method.

Make a map<string, int> mWordCount; and go through the tree, on every node do mWordCount[currentWord]++; and once finished iterate through the map and print out all of the entries.
Could you please write some of the codes?

No. (Well, technically, yes, but I won't)
I have tried a lot but I have not succeeded

What have you tried, and what went wrong?

@Zaita: The O(n log n) version of yours is perhaps a little bit suboptimal, don't you think? Especially since it can be done in O(log n + m) with a m<<n in most cases (and the fact that he uses trees in the first place somehow suggests that he (or his supervisor) is interested in efficient algorithms...).
@Exception: We don't know if the tree has been sorted or not.
My solution requires 1 iteration through the entire tree keeping score along the way. Yours is effectively the same, but assumes the tree has been sorted. Your's would fail if the tree was un-sorted or sorted unconventionally.

Since it is a binary *search* tree, I assumed it was sorted (of course, the key and the data could be different, but this is not the case with the implementation provided). Furthermore, you iterate over the tree, performing O(log n) operations (insertion/access of a map element) in each step, thus introducing an additional log n factor, which isn't there if you only perform the iteration and counting (even if m=n in my post above, thus requiring O(n) time, or the tree is indeed not sorted. In this special case your algorithm would be as fast (since the map has size 1), but that isn't the average case).

That being said, it doesn't seem that the original poster is interested anymore (or ever was in anything but a copy&paste solution), and I doubt that you *really* are interested in the complexity analysis of basic tree operations.
Topic archived. No new replies allowed.