Getting the wrong output after constructing a Red-Black tree from csv file

I'm tasked with opening a csv file, reading from the two columns, and mapping the values (name, number) to a red-black tree. We are to test adding the first 5 elements and performing a traversal to ensure the tree was constructed correctly according to the rules of a r-b-tree. Then we add the next 5 elements and perform the same traversal to ensure again that the 10 element r-b-tree is in accordance with it's rules. For these first two tests, the traversals are correct. The issue occurs when I add the rest of the elements, which are over 3000. The reason I know there is an issue is because calling the four find() in main() after inserting all the elements does not match with the values from the first four find() calls. I believe my bug has to be either in the checkBalance() method or find() method. (had to post the code in pieces due to formatting)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
#pragma once
#include <vector>
#include <iostream>
#include <string>
#include <memory>
#include <stdexcept>
using namespace std;

template <typename TKey, typename TValue>
class RedBlackTreeMap
{
private:
	class Node
	{
	public:
		TKey mKey;
		TValue mValue;
		Node* mParent;
		Node* mLeft;
		Node* mRight;
		bool mIsRed;

		Node(const TKey &key, const TValue &value, bool isRed)
			: mKey(key), mValue(value), mIsRed(isRed), mParent(nullptr),
			mLeft(&NIL_NODE), mRight(&NIL_NODE)
		{
		}
	};

	Node *mRoot;
	int mCount;
	static Node NIL_NODE;

	Node* bstFind(const TKey &key, Node *currentNode) const
	{
		// TODO: write this method. Given a key to find and a node to start at,
		// proceed left/right from the current node until finding a node whose
		// key is equal to the given key.
		// Use < and > to compare the keys of nodes to determine left/right.
		if (currentNode->mKey == key) { return currentNode; }
		else {
			if (currentNode->mKey < key) {
				// key is larger; go right.
				if (currentNode->mRight != &NIL_NODE) // node to right is not a NIL leaf
					return bstFind(key, currentNode->mRight); // recursively move down tree
				else
					return nullptr;
			}
			else { // currentNode->mKey > key
				// newNode is smaller; go left.
				if (currentNode->mLeft != &NIL_NODE) // node to left is not a NIL leaf
					return bstFind(key, currentNode->mLeft); // recursively move down tree
				else
					return nullptr;
			}
		}
	}

	Node* getGrandparent(Node *n) {
		// TODO: return the grandparent of n
		Node* p = n->mParent;
		if (p == nullptr) // n has no parent
			return NULL;
		else if (p->mParent == nullptr) // n has no grandparent
			return NULL;
		else 
			return p->mParent;
	}

	// Gets the uncle (parent's sibling) of n.
	Node* getUncle(Node *n) {
		// TODO: return the uncle of n
		Node* p = n->mParent,
			* u = NULL;
		if (getGrandparent(n) == NULL)
			return NULL;
		else {
			if (p == getGrandparent(n)->mRight) // parent is right child
				u = getGrandparent(n)->mLeft; // uncle will be left child
			else // parent is left child
				u = getGrandparent(n)->mRight; // uncle will be right child
			return u;
		}
	}

	// Rotate the tree right at the given node.
	void singleRotateRight(Node* n) {
		Node*l = n->mLeft, // prior to rotate, n's child l is to left of n
			*lr = l->mRight, // prior to rotate, l's child lr is to right of l
			*p = n->mParent; // if n has a parent, l will become its new child after rotate

		n->mLeft = lr; // n adopts lr after rotate
		lr->mParent = n; // symmetry

		l->mRight = n; // n becomes l's right child after rotate
		if (p == nullptr) { // no parent above n exist
			mRoot = l;
		}
		else if (p->mLeft == n) { // n used to be its parent's left child
			p->mLeft = l;
		}
		else { // n used to be its parent's right child
			p->mRight = l;
		}
		n->mParent = l; // l becomes n's parent after rotate
		l->mParent = p; // l becomes p's child (if p exist) after rotate
	}

	// Rotate the tree left at the given node.
	// Mirror of singleRotateRight
	void singleRotateLeft(Node *n) {
		// AVL tree calls this a "rr" rotation at n.
		Node*r = n->mRight, // prior to rotate, r's child l is to right of n
			*rl = r->mLeft, // prior to rotate, r's child rl is to left of r
			*p = n->mParent; // if n has a parent, r will become its new child after rotate

		n->mRight = rl; // n adopts r after rotate
		rl->mParent = n; // symmetry

		r->mLeft = n; // n becomes r's left child after rotate
		if (p == nullptr) { // no parent above n exist
			mRoot = r;
		}
		else if (p->mLeft == n) { // n used to be its parent's left child
			p->mLeft = r;
		}
		else { // n used to be its parent's right child
			p->mRight = r;
		}
		n->mParent = r; // r becomes n's parent after rotate
		r->mParent = p; // r becomes p's child (if p exist) after rotate
	}

	// This method is used by insert. It is complete.
	// Inserts the key/value into the BST, and returns true if the key wasn't
	// previously in the tree.
	bool bstInsert(Node* newNode, Node* currentNode) {
		if (mRoot == nullptr) {
			// case 1
			mRoot = newNode;
			mCount++;
			return true;
		}
		else {
			if (currentNode->mKey < newNode->mKey) {
				// newNode is larger; go right.
				if (currentNode->mRight != &NIL_NODE) // node to right is not a NIL leaf
					return bstInsert(newNode, currentNode->mRight); // recursively move down tree
				else {
					currentNode->mRight = newNode;
					newNode->mParent = currentNode;
					mCount++;
					return true;
				}
			}
			else if (currentNode->mKey > newNode->mKey) {
				// newNode is smaller; go left.
				if (currentNode->mLeft != &NIL_NODE) // node to left is not a NIL leaf
					return bstInsert(newNode, currentNode->mLeft); // recursively move down tree
				else {
					currentNode->mLeft = newNode;
					newNode->mParent = currentNode;
					mCount++;
					return true;
				}
			}
			else {
				// found a node with the given key; update value.
				currentNode->mValue = newNode->mValue;
				return false; // did NOT insert a new node.
			}
		}
	}
Last edited on
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
// Applies rules 1-5 to check the balance of a tree with newly inserted
	// node n.
	void checkBalance(Node *n)
	{
		Node* G = getGrandparent(n);
		if (n == mRoot) {
			// case 1: n is the root
			n->mIsRed = false; // root cannot be red, must be changed to black
		}
		// handle additional insert cases here.
		// Note that each case is mutually exclusive, EXCEPT THAT case 4 must flow
		// into case 5.
		// Also note that case 3 recursively examines the grandparent after the
		// recoloring.
		else if (n->mParent->mIsRed == false) // parent P is black
			// case 2: arbitrary parent node is black and one of it's black NIL children was replaced by a red node after 
			// insert(). Red node n also has two black NIL children.
			return; // new node already assigned to be red in insert()

		else { // P is red for all remaining cases
			if (getUncle(n) != NULL && getUncle(n)->mIsRed == true) { // uncle exist and is red
				// Case 3: n's uncle is red, so we need to recolor n's parent and uncle to be black and n's grandparent 
				// to be black (could potentially violate rule that root cannot be red or that red nodes have two black children. Must
				// recursively call "fix" or getBalance procedure.
				n->mParent->mIsRed = false; // recolor parent to black
				getUncle(n)->mIsRed = false; // recolor uncle to black
				getGrandparent(n)->mIsRed = true; // recolor grandparent to be red
				checkBalance(getGrandparent(n)); // recursively fix if root is not red or a red node does not have two black children
			}
			else if (getUncle(n) != NULL) { // uncle exist but is black
				// case 4: assume P is left child of G (if not, do the mirror of following procedure). n is left-right or right-left
				// grandchild of G. Call single rotate at P (won't fix issue, but allows us to go to case 5).
				if ((n->mKey < getGrandparent(n)->mKey) && (n->mKey > n->mParent->mKey)) // parent to left of Grandparent and n to right of parent
					singleRotateLeft(n->mParent); // single rotate left @ parent node
				else if ((n->mKey > getGrandparent(n)->mKey) && (n->mKey < n->mParent->mKey)) // mirror above process
					singleRotateRight(n->mParent); // single rotate right @ parent node

				// case 5: Arrive here if cases 1-3 are false. All case 4's will lead into case 5. Need to do a single rotate on G 
				// and switch colors of P and G: P turns black, G turns red.
				if (n->mKey < G->mKey) { // rotate right @ G: fix left-left imbalance after case 4
					singleRotateRight(G);
					if (n == mRoot) { // can occur coming from case 4
						n->mIsRed = false; // recolor root to black
						n->mRight->mIsRed = true; // recolor right child to be red
					}
					else {
						n->mParent->mIsRed = false; // recolor parent to black
						G->mIsRed = true; // recolor grandparent to be red
					}
				}
				else { // rotate left @ G: fix right-right imbalance after case 4
					singleRotateLeft(G);
					if (n == mRoot) {
						n->mIsRed = false; // recolor root to black
						n->mLeft->mIsRed = true; // recolor left child to be red
					}
					else {
						n->mParent->mIsRed = false; // recolor parent to black
						G->mIsRed = true; // recolor grandparent to be red
					}
				}
			}
		}
	}

public:
	RedBlackTreeMap() : mRoot(nullptr), mCount(0) {
	}

	// Return number of elements in the tree 
	inline int count() const {
		return mCount;
	}

	// Inserts a key/value pair into the tree, updating the red/black balance
	// of nodes as necessary. Starts with a normal BST insert, then adjusts.
	void insert(const TKey &key, const TValue &value) {
		Node* n = new Node(key, value, true); // the node starts red, with null parent, left, and right.

		// normal BST insert; n will be placed into its initial position.
		// returns false if an existing node was updated (no rebalancing needed)
		bool insertedNew = bstInsert(n, mRoot);
		if (!insertedNew)
			return;

		// check cases 1-5 for balance.
		checkBalance(n);
	}

	TValue &find(const TKey &key) {
		Node *n = bstFind(key, mRoot);
		if (n == nullptr || n == &NIL_NODE)
			cout << "Key not found";
		else {
			if (n == mRoot) cout << "This is a root: ";
			else if (n->mRight == &NIL_NODE && n->mLeft == &NIL_NODE)
				cout << "This is a leaf: ";
			else if ((n->mRight == &NIL_NODE && n->mLeft != &NIL_NODE) || (n->mRight != &NIL_NODE && n->mLeft == &NIL_NODE))
				cout << "This node has one child: ";
			else if (n->mIsRed == true) cout << "This node is a red: ";
			return n->mValue;
		}
	}

	// Returns true if the given key is in the tree.
	bool containsKey(const TKey &key) const {
		// TODO: using at most three lines of code, finish this method.
		// HINT: write the bstFind method first.
		if (bstFind(key, mRoot) != nullptr || bstFind(key, mRoot) != &NIL_NODE) return true;
		else return false;
	}

	// Prints a pre-order traversal of the tree's nodes, printing the key, value,
	// and color of each node.
	void printStructure(Node* n) {
		string color;
		// TODO: a pre-order traversal. Will need recursion.
		if (n == NULL || n == &NIL_NODE)
			return;
		else {
			// first print data of node
			if (n->mIsRed == true) color = "Red";
			else color = "Black";
			cout << n->mKey << " : " << n->mValue << " (" << color << ") " << endl;

			// recur on left subtree
			printStructure(n->mLeft);
			// recur on right subtree
			printStructure(n->mRight);
		}
	}
	void printStructureHelper() {
		printStructure(mRoot);
	}
};
template <typename TK, typename TV>
typename RedBlackTreeMap<TK, TV>::Node RedBlackTreeMap<TK, TV>::NIL_NODE{ TK(), TV(), false };

main()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
#include "redBlackTreeMap.h"
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main() {
	int x = 0;

	ifstream ip("players_homeruns.csv");
	if (!ip.is_open()) std::cout << "Error opening file" << endl;

	string name;
	string hits;
	
	RedBlackTreeMap<string, int> obj1;
	
	// Build tree with first 5 
	for (int i = 0; i < 5; i++) { 
	//while (ip.good()) {
		getline(ip, name, ',');
		getline(ip, hits,'\n');

		stringstream st(hits);
		st >> x;
		// cout << "Name and hits:" << name << ' ' << hits << endl; // test that file opens
		obj1.insert(name, x);
	}
	cout << "Pre-Ordered Traversal with first 5 players: " << endl;
	obj1.printStructureHelper();
	
	// Build tree with first 10 
	for (int i = 5; i < 10; i++) {
		getline(ip, name, ',');
		getline(ip, hits, '\n');

		stringstream st(hits);
		st >> x;
		obj1.insert(name, x);
	}
	cout << endl;
	cout << "Pre-Ordered Traversal with first 10 players: " << endl;
	obj1.printStructureHelper();
	
	cout << obj1.find("Babe Ruth") << endl; // one key that is a leaf
	cout << obj1.find("Honus Wagner") << endl; // one key that is a root
	cout << obj1.find("Rogers Hornsby") << endl; // one key that has only one non-NIL child 
	cout << obj1.find("Ted Williams") << endl; // one key that is a red node
	
	// Finish building tree
	for (int i = 10; i < 3836; i++) {
		//cout << i << endl;
		getline(ip, name, ',');
		getline(ip, hits, '\n');

		stringstream st(hits);
		st >> x;
		obj1.insert(name, x);
	}
	
	cout << endl;
	cout << "Pre-Ordered Traversal with all players: " << endl;
	obj1.printStructureHelper();

	cout << obj1.find("Babe Ruth") << endl; // one key that is a leaf
	cout << obj1.find("Honus Wagner") << endl; // one key that is a root
	cout << obj1.find("Rogers Hornsby") << endl; // one key that has only one non-NIL child 
	cout << obj1.find("Ted Williams") << endl; // one key that is a red node
	
	ip.close();
	exit(0);
}
Last edited on
What happens if the find fails to find the name?

Also your file input looks a little wonky to me. What happens if your file doesn't contain enough lines for the for() statements? Will your insert insert "blank" entries? Why are you using the stringstream instead of just retrieving the numbers?

Edit:
Also IMO, your indentation needs some work.
Last edited on
When find() fails to find the name, it outputs "Key not Found" along with a number. Also, I used stringstream because the file contains both strings and numerical values. I have opened the file and that is how I know how many lines to iterate through in the for-loop. In parsing through the 3000+ lines, I did not see one that was blank
Also, I used stringstream because the file contains both strings and numerical values.

So? Don't you realize that you can retrieve numbers with the extraction operator from the file stream?

I have opened the file and that is how I know how many lines to iterate through in the for-loop.

But what happens if the file is different?

And why do your later loops start with something other than zero? And why the magic numbers?

Since your tree is not array based you don't need to keep track of the index provided by the for loop. IMO, a while loop (possibly inside a function to reduce code duplication) would probably be better.

For the last loop something more like:

1
2
3
4
5
6
7
8
    std::string name;
    int hits;
    
    // Using the std::ws manipulator to discard any leading whitespace.
    while(getline(ip >> std::ws, name, ',') && ip >> hits)
    {
	obj1.insert(name, hits);
    }

Last edited on
We were tasked with building the tree from the first 5 elements, then test the traversal. Then we add an additional 5 elements to the tree (so build the tree from the first 10 elements in the file). So my logic in starting the loops at different numbers is to continue building the tree from where the previous for-loop stopped. I know my logic in doing this is correct because starting the second for-loop at 0 and incrementing to 10 gave a traversal with 15 elements. That is why I didn't use a while loop to begin with.
Last edited on
When find() fails to find the name, it outputs "Key not Found" along with a number
Yes, but what does find() return? It's supposed to return a reference to a value. How can it do that if there is no matching value?

I would combine find() and containsKey() into:
1
2
3
// Find "key". If you find it, then return true and point "found" at the corresponding value.
// Otherwise return false and leave "found" unchanged.
bool find(const TKey &key, TValue *found);


As for the problem of 30,000 items vs. 10, try picking 10 items at random for the test. Run that a few times. If this doesn't reveal the bug then write a function to validate the structure of the tree and call it after each insertion. Eventually you'll find the case that causes the problem.
I know my logic in doing this is correct because starting the second for-loop at 0 and incrementing to 10 gave a traversal with 15 elements.

But why would you increment to 10 if you only want to add 5 elements? Do you realize that unless you rewind the file stream after reading the first 5 elements the stream will be "pointing" at the 6th element so you would only need to iterate 5 times to get the "next" 5 elements?

Topic archived. No new replies allowed.