Struggling to read in a complex file with parentheses and many lines and spacing

I'm struggling with my current assignment, and I've been trying for a while to find a way to read in this complex file. I'm supposed to read it into a tree. Preferably, I'd like to insert the shows alphabetically, and if there's a way to do that, I'd appreciate some help. For now, though, I'm just trying to get the data into the nodes.

I looked up the documentation on using "file >>", but it doesn't really spell out some of the finer details, such as exactly which characters or lines get read in. How it treats spaces, newlines, etc. is a mystery to me. Here is what the file looks like.

Matlock (1986-1995)
Mystery
http://www.imdb.com/title/tt0090481/
Andy Griffith
Nancy Stafford
Julie Sommars
Clarence Gilyard Jr.
Kene Holliday

For the algorithms I need to write for searching the tree, I need to somehow get the first full word into a string. The start year and the end year each need to go into an int, but how can I navigate around the parentheses? Then I have to go to the next line to get the genre, skip an entire line (I don't need the IMDB link), and store each of the actors in their own string. Each show has a different number of actors, so I'm going to need some sort of list that the tree node can read back from, so I can get "actor1, actor2," and so on. Those are the main parts I'm fairly clueless about.

Here is my code for both a list class that I'm hoping to use to read out specific nodes later (and maybe I can use it to add actor names?) as well as the tree class.


TreeNode.h

#pragma once
#include <iostream>
#include <string>

using namespace std;

struct BSTNode
{
	string showName;
	int startYear;
	int endYear;
	string genre;
	string actorName;

	BSTNode* left;
	BSTNode* right;
};



LinkedList.h

#pragma once

#include <iostream>
#include <string>
#include "TreeNode.h"

using namespace std;

class LinkedList
{
	struct node
	{
		BSTNode linkedNode;
	};
};
In your file format, how do you know where one show ends and the next one begins?
Can you show an example of your file that shows more than one show?

The text in your file is not whitespace-delimited, so using operator >> probably isn't the best way.
I would loop on getline here: http://www.cplusplus.com/reference/string/string/getline/

Here is one way to extract the start and end years, assuming you have "Show Name (1986-1995)" as a string already, from a call to getline().
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
using namespace std;

// e.g. show == "Show Name (1984-2020)"
// assumes :
// - last token will always be "(YR#1-YR#2)" with no inner whitespace
// - and each year is 4 digits long. (could be made more flexible if necessary)
void parse_show_years(const string& show, int& startYear, int& endYear)
{
    // convert text back into stream
    istringstream iss(show);
    
    // read whitespace-delimited tokens until extraction fails; years keeps the last one
    string years;
    while (iss >> years) { }  // empty body on purpose: the read in the condition does the work
    
    cout << "years (unparsed) = " << years << '\n';
    
    // extract years as ints
    const int NumDigitsInYear = 4;
    string start_year_str = years.substr(1, NumDigitsInYear);
    string end_year_str = years.substr(1 + NumDigitsInYear + 1, NumDigitsInYear);
    
    cout << "start_year_str = " << start_year_str << '\n';
    cout << "end_year_str =   " << end_year_str << '\n';
    
    // convert individual year strings to ints
    startYear = stoi(start_year_str);
    endYear = stoi(end_year_str);
}

int main()
{
    string show_line = "Show Name (1984-2020)";
    int startYear;
    int endYear;
    
    parse_show_years(show_line, startYear, endYear);
    
    cout << "start year as int: " << startYear << '\n';
    cout << "end year as int:   " << endYear << '\n';  
}

years (unparsed) = (1984-2020)
start_year_str = 1984
end_year_str =   2020
start year as int: 1984
end year as int:   2020
Ganado, here is an example of two shows, one after the other. Each show is separated from the next by a single empty line in the text file. I didn't think about reading a whole line in as a string and then breaking it apart. In fact, I didn't even know that was an option! I'll have to do some research into getline() and parsing and see what I can come up with. I really appreciate the help. Hopefully, I'll be able to get some of it sorted out by tomorrow and get back to you with my progress.

Matlock (1986-1995)
Mystery
http://www.imdb.com/title/tt0090481/
Andy Griffith
Nancy Stafford
Julie Sommars
Clarence Gilyard Jr.
Kene Holliday

Northern Exposure (1990-1995)
Comedy
http://www.imdb.com/title/tt0098878/
Barry Corbin
Janine Turner
John Cullum
Darren E. Burrows
John Corbett
Cynthia Geary
Elaine Miles
Peg Phillips
Rob Morrow
Ganado, I went ahead and programmed all of that in. I was able to use getline to read the individual lines of my file, but I'm unsure how to pass values to the ParseShowYears function that I have. I'm still not sure what passing by reference (the & in the parameter list) does, and I don't have three separate variables to pass to ParseShowYears, which your function seems to expect.

EDIT: I think I understand the pass by reference. Is it so that when the function stores a value in "startYear" and "endYear", the value also gets updated in the variable inside the tree node? I also had a question about your code. What exactly is line 18, the while (iss >> years) loop, doing? Normally a while loop has a condition followed by a body that does the work. Yours doesn't seem to have a body at all, and the condition is written in a way I've never seen before.

Here is my updated code:

Main.cpp

#include <iostream>
#include <string>
#include <fstream>
#include "LinkedList.h"

using namespace std;

int main()
{
	string line;

	ifstream myFile;
	myFile.open("tvDB.txt");
	if (!myFile.is_open())
	{
		// Stop here: reading from a stream that failed to open yields nothing.
		cout << "ERROR - File has not opened." << endl;
		return 1;
	}

	cout << "File has successfully opened." << endl;

	while (getline(myFile, line))
	{
		cout << line << "\n";
	}

	//BinarySearchTree::ParseShowYears(const string & show, int& startYear, int& endYear);
}



FuncDef.cpp

#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include "LinkedList.h"

using namespace std;


// Parse the new text so that the years are the proper variables
// Could this be used to get the alphabetical order as well?
void ParseShowYears(const string& show, int& startYear, int& endYear)
{
	// Convert text back into "stream"?
	istringstream iss(show);

	string years;
	while (iss >> years) {};

	// Extract years as ints
	const int NumDigitsInYear = 4;
	string startYearString = years.substr(1, NumDigitsInYear);
	string endYearString = years.substr(1 + NumDigitsInYear + 1, NumDigitsInYear);
	
	cout << "startYearString = " << startYearString << "\n";
	cout << "endYearString = " << endYearString << "\n";

	// Converts individual year strings to ints
	startYear = stoi(startYearString);
	endYear = stoi(endYearString);
}


// t is the new node?
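void BinarySearchTree::AddNodeR(TreePtr &t, char newShowName)
{
	if (t == NULL)
	{
		// Give every field a value, then attach the node through the reference t.
		TreePtr newPtr = new BSTNode;
		newPtr->showName = newShowName;	// only one character, because the parameter is a char
		newPtr->startYear = 0;
		newPtr->endYear = 0;
		newPtr->genre = "";
		newPtr->actorName = "";
		newPtr->left = NULL;
		newPtr->right = NULL;
		t = newPtr;
	}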

	// How can I do the less than if it's a string? It needs the first character.
	//else if (newShowName <= t->showName)
	//	AddNodeR(t->left, newShowName);
	//else
	//	AddNodeR(t->right, newShowName);
}



TreeNode.h

#pragma once
#include <iostream>
#include <string>

using namespace std;

class BinarySearchTree
{
private:
	struct BSTNode
	{
		string showName;
		int startYear;
		int endYear;
		string genre;
		string actorName;

		BSTNode* left;
		BSTNode* right;
	};

	typedef BSTNode* TreePtr;

	TreePtr rootPtr;

	void InitBST()
	{
		rootPtr = NULL;
	}

	void AddNodeR(TreePtr& t, char newShowName);

public:
	BinarySearchTree()
	{
		InitBST();
	}

	void ParseShowYears(const string& show, int& startYear, int& endYear);

	bool IsEmpty()
	{
		return (rootPtr == NULL);
	}

	void AddNode(int newData);
	void AddNodeR(int newData);

	void SearchNode(string searchKey);
	void DeleteNode(int val);
};



LinkedList.h (This one I'm still not sure how to integrate)

#pragma once

#include <iostream>
#include <string>
#include "TreeNode.h"

using namespace std;

class LinkedList
{
	struct node
	{

	};
};