Really serious problem C++ even hard to explain

Nov 23, 2013 at 9:21pm
I'm having a headache with this ugly bug. I need to count how many times the longest words are repeated in both text files (CD1 and CD2). I know it sounds easy, but I don't understand what's wrong! Please help me or give me some advice on how to do this. Thanks

Deividas

Here are my text files. CD1:
Be who you are and say what you feel, because those who mind don't matter, and those who matter don't mind

CD2:
Be who you are and say only what you really feel, because those who mind don't matter


And here are my code fragments (sorry, I'm from Lithuania, so English is not my native language):

void Read(const char CD[], string & eil, string & skyr){
	ifstream fd(CD1); // Count CD2 !!!
	ofstream rf(RF);
	ofstream rf2(RF2);
	string max[10]; int ind = 0;
	rf << string(35, '-') << "Pradiniai duomenys" << string(35, '-') << endl; // "Pradiniai duomenys" = "Initial data"
	while(!fd.eof()){
		getline(fd, eil);
		rf << eil << endl;
		Less(eil);
		AnalyzeEil(eil, skyr, ind, max);
	}
	for(int j=0; j<ind; j++){
		if(Find(max[j]) == true){
		cout << max[j] << " " << max[j].length() << endl;
		}
	}
	fd.close();
	rf.close();
	rf2.close();
}

void Read2(const char CD2[], string & eil2, string & skyr){
	ifstream fd(CD2);
	ofstream rf(RF, ios::app);
	ofstream rf2(RF2, ios::app);
	rf << endl << string(70, '-') << endl;
	while(!fd.eof()){
		getline(fd, eil2);
		rf << eil2 << endl;
		Less(eil2);
	}
	fd.close();
	rf.close();
	rf2.close();
}

void Less(string & eil){
	for(int i=0; i<eil.length(); i++){
		eil[i] = tolower(eil[i]);
	}
}

void AnalyzeEil(string eil, string skyr, int & ind, string max[]){
string zodis;
    int zpr = 0, zpb = 0;
    while ((zpr = eil.find_first_not_of(skyr, zpb)) != string::npos){
        zpb = eil.find_first_of(skyr, zpr);
        zodis = eil.substr(zpr, zpb - zpr);
	    if(ind!=10){
			max[ind++] = zodis;
			//cout << Count(max[ind]) << endl;
		}else{
			for(int i=0; i<ind; i++){
				if(max[i].length() < zodis.length()){
					max[i] = zodis;
					//cout << Count(max[i]) << endl;
					break;
				}
			}
		}
	}
}

bool Find(string f){
	ifstream fd(CD2);
	string eil;
	while(!fd.eof()){
		getline(fd, eil);
		if(tolower(eil.find(f) != -1)){
			return true;
		}
	}
	fd.close(); 
	return false;
}

int Count(string k){
	ifstream fd2(CD2);
	string eil2;
	int kiek = 0;
	while(!fd2.eof()){
		getline(fd2, eil2);
		if(eil2.find(k) != -1){
			kiek++;
		}
	}
	return kiek;
}


Please, someone help me. I think that my function Count is not correct!
Last edited on Nov 23, 2013 at 9:23pm
Nov 23, 2013 at 9:51pm
> Please, someone help me. I think that my function Count is not correct!


No, it's not.

http://www.cplusplus.com/reference/string/string/find/
Nov 23, 2013 at 9:59pm
Function Count counts each line of text that contains at least one occurrence of k. If there is more than one match per line, the extras are ignored, because find is only called once per getline.
Nov 23, 2013 at 10:21pm
Do one thing, and do it well

Read()
a ¿why is it opening two files for writing?
b The first argument is unused
c Don't loop on eof, use the reading operation instead while( getline(fd, eil) )
d `eil' does not make sense as an argument, but as a local variable
e If you don't intend to modify `skyr', pass it as const std::string &
f Lines 13-17 (the for loop that prints max[j]): ¿output? that shouldn't be here.
g Lines 18-20 (the close() calls): the destructors would take care of that.

Read2()
shares problems a, c and g with Read()
¿how is this different from `Read()'? (¿why should it be different?)
You are overwriting the variable. At the end you'll only have the last line.

AnalyzeEil()
¿what is the purpose of this function?
10-18: ¿what is the purpose of the `else' block?

Find()
problem c
if(tolower(eil.find(f) != -1)){ ¿what? (tolower() is applied to the result of the comparison, not to the text)
you shouldn't need to be reopening the file over and over again.

Count()
problem c
.find() returns std::string::npos if it doesn't find the string. Compare against that instead of -1: the comparison with -1 only works through an implicit conversion to an unsigned type.
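
A minimal sketch of the npos comparison, assuming a hypothetical helper named `Contains` that is not in the original code:

```cpp
#include <string>

// Hypothetical helper: true if `needle` occurs anywhere in `line`.
bool Contains(const std::string& line, const std::string& needle) {
    // find() returns a position of type std::string::size_type,
    // or std::string::npos when there is no match.
    return line.find(needle) != std::string::npos;
}
```

Note that this is a substring test, so `Contains("reminder", "mind")` is also true; matching whole words needs extra work.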



> I need to count how many words (which are longest in txt files)
> in both txt files (CD1 and CD2) repeated...
I don't understand your description
¿what would be the desired output?
Nov 24, 2013 at 10:00am
Thank you all for the replies. I need to find the longest words that are repeated in both text files and then count how many times they occur. For example, with my text files

Be who you are and say what you feel, because those who mind don't matter, and those who matter don't mind


Be who you are and say only what you really feel, because those who mind don't matter

So my output should be (words repeated in both files):
because 1 time
matter 1 time
who 2 times
and so on.
Last edited on Nov 24, 2013 at 10:01am
Nov 24, 2013 at 11:22am
To JockX, thanks for the reply. Is it possible to change my Count function somehow? I'm really confused because I have no idea how I need to change it. Generally speaking, is it possible to count words with my algorithm?
Thanks

Deividas
Nov 24, 2013 at 1:05pm
To fix your function, make sure you repeat the eil2.find() as long as it finds something. But the next find() should start searching eil2 from the position where the previous k was found, so you may use the other version of find, that takes two arguments:
while(!fd2.eof()){            // Your loop
    getline(fd2, eil2);       // read the next line, as in your code
    size_t startAt = 0;
    while (true){             // My loop
        startAt = eil2.find(k, startAt); // start searching at position startAt
        if( startAt != string::npos){ // if hit something
            startAt += k.length(); // move startAt to the position just after k
            kiek ++;
            continue;
        }
        break;
    } // my loop
} // your loop

It would be even easier if the function accepted the entire content of the file as one string instead of processing it line by line, which adds unnecessary complexity.
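
As a rough sketch of that idea: the helper names `ReadAll` and `CountOccurrences` below are made up for illustration, and the whole input is assumed to fit in memory.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads everything remaining in `in` into one string.
std::string ReadAll(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Counts non-overlapping occurrences of `k` in `text`.
// This matches substrings, so "mind" is also found inside "reminder".
int CountOccurrences(const std::string& text, const std::string& k) {
    if (k.empty()) return 0;     // an empty pattern would never advance the search
    int kiek = 0;
    std::string::size_type startAt = 0;
    while ((startAt = text.find(k, startAt)) != std::string::npos) {
        startAt += k.length();   // continue just after the previous match
        ++kiek;
    }
    return kiek;
}
```

Usage would be along the lines of `CountOccurrences(ReadAll(fd2), k)`, where `fd2` is an already opened input stream.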
Nov 24, 2013 at 1:56pm
Thanks for the reply.
> It would be even easier, if the function accepted entire content of file as one string, instead of processing it line by line, which unnecessarily adds complexity.

But what if the text document has a lot of lines? My teacher doesn't let us do that, because the program could break. I don't know why, but when I change my function I get an endless loop.
int Count(string k){
	ifstream fd2(CD2);
	string eil2;
	int kiek = 0;
	while(!fd2.eof()){
		size_t startAt = 0;
		getline(fd2, eil2);
		while(true){
			startAt = eil2.find(k, startAt);
			if(startAt != string::npos){
				startAt = startAt + k.length();
				kiek++;
				continue;
			}
			break;
		}
	}
	return kiek;
}
Nov 24, 2013 at 3:26pm
Any ideas what's wrong?
Last edited on Nov 24, 2013 at 3:27pm
Nov 24, 2013 at 4:19pm
> I get endless loop.
run through a debugger
interrupt your program
check the variables

(k may be empty)
Nov 24, 2013 at 7:16pm
I have realised why I get the endless loop. Unfortunately, my program still doesn't work :(. When I call Count(eil) in my Read function, I get the results 0 0.
Nov 24, 2013 at 8:12pm
Read() should not count. Read() should read, period.
I would suggest putting all the words in a container (like std::vector) and passing that to the other functions.

Update your code
Also, you can test each function individually.
Nov 24, 2013 at 8:31pm
To ne555, thanks for the reply, but my teacher said that putting all the words in a container or array is not a good idea, because my text file could have millions of lines, so the program could break. So I need a better idea for how to solve this ugly bug.
Nov 24, 2013 at 9:04pm
You don't need a `Read()' function then (or change its name to something more meaningful)
You could simply work word by word instead of reading lines. That would simplify your `Count()' function to a sane level.


Edit: I still don't understand the `longest words repeated' part
In your example `who' has length 3, ¿why is it considered "longest" ?
Also `matter' appears two times in the first sentence, and one in the second. ¿why is your output 1?
¿do you want to simply count each word in the other file?
while read word from input1
   count=0
   while read aux from input2
      if word==aux
         ++count
   print word "appeared" count "times"
Last edited on Nov 24, 2013 at 9:10pm
Nov 25, 2013 at 3:19pm
ne555, thanks again for the quick reply. Here is my full code: http://pastebin.com/5TpWJiFs As you can see, I try to separate my line (eil) in the Max function.
Nov 25, 2013 at 3:26pm
To ne555: I need to find the 10 longest words in BOTH FILES. I need to find the longest words from text file 1 and then from text file 2, and if they are the same I need to count them.
Nov 25, 2013 at 7:38pm
The 10 longest in each file, and then count the mutual ones
Or, take all the mutual ones, and then count the 10 longest


I suppose that `Max()' should obtain the 10 longest words, but it is incorrect.
You simply overwrite any word that has less length than the one being tested.

Suppose a simplified situation where you are interested in the 2 longest words. So far you have a = "1234" and b = "12".
Then a test string test = "123456" arrives; your algorithm will do
a = "123456";
b = "12";
but the correct result would be
a = "123456";
b = "1234";


You need to keep the array sorted: each time you test a new word, you `insert' it at the place that keeps the array sorted (like insertion sort). You could also avoid repeated elements here.
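
A minimal sketch of that insertion step, assuming a hypothetical function `InsertLongest` and a plain array like the `max[10]` in the earlier code (neither the name nor the signature comes from the posted program):

```cpp
#include <string>

// Hypothetical helper: keeps longest[0..n) sorted from longest to shortest,
// stores at most `capacity` distinct words, and inserts `word` if it qualifies.
void InsertLongest(std::string longest[], int& n, int capacity, const std::string& word) {
    // Skip words that are already stored.
    for (int i = 0; i < n; ++i) {
        if (longest[i] == word) return;
    }
    // Find the first slot whose word is strictly shorter than the new one.
    int pos = 0;
    while (pos < n && longest[pos].length() >= word.length()) ++pos;
    if (pos == capacity) return;   // the array is full and the word is too short
    if (n < capacity) ++n;         // grow until the array is full
    // Shift shorter words one slot down; the last one falls off when full.
    for (int i = n - 1; i > pos; --i) longest[i] = longest[i - 1];
    longest[pos] = word;
}
```

With the two-word example above, starting from {"1234", "12"} and inserting "123456" leaves {"123456", "1234"}.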


I don't see you using `Count()' anywhere.
Again, simplify your functions so they do just one thing and test them separately.
Nov 26, 2013 at 7:46pm
ne555, thanks again. Here is my new algorithm, which really counts the mutual words:
void Mutual(string longestWords[], int & n, const char CD1[], const char CD2[], int longest){
	string eil;
	string zodziai[CMAX];
	while (longest != 0 || n < 10){
		ifstream fd(CD1);
		while (!fd.eof() && longest != 0 && n < 10){
			int kiek = 0;
			getline(fd, eil);
			Less(eil);
			AnalyseEil(eil, zodziai, kiek);
			for(int i = 0; i < kiek; i++){
				// NeraIrasytas: the word is not stored yet
				if(zodziai[i].length() == longest && NeraIrasytas(longestWords, n, zodziai[i]) && n < 10){
					// YraAntrameTekste: the word is in the second text
					if(YraAntrameTekste(CD2, zodziai[i])){
						longestWords[n++] = zodziai[i];
						Kiek(longestWords);
					}
				}
			}
		}
		fd.close();
		longest--;
	}
}


But I still can't count how many times the words occur in the files :( How do I do that?
Nov 27, 2013 at 8:29am
?