Search for a specific word pair/triple in paragraphs

Hi all, I have a function that reads paragraphs from a file and checks whether a given word, or a pair/triple of words, appears in each paragraph. It works well if you pass a single word, for example "John". The problem appears when I type something like "John Doe" or "John Doe Doll" in the console: it always finds only "John". I want to find pairs, triples, or more words, and the search must be case-sensitive, since "John Doe Doll" is not the same as "john doe doll". It must be an exact match. Here is the function I have:
string findParagraphs( istream &in ) {
	string searchWord;
	string result, line, resultMATCH;
	string::size_type position, found;
	const string LF("\n");
	const string CR("\r");
	const string DOT(".");
	const string SPACE(" ");
	const string NUL("\0");
	string whitespaces ("\r\n\v\f\t");
	bool IsParagraph;
	int numline=0, paragraph=0, paragraphMATCH=0;
	string::size_type place; // size_type, not unsigned int, so the comparison with string::npos is reliable
	
	searchWord = wordsToFind(); // Read the words to search for

	while (getline( in, line )) {
			IsParagraph = false;
			if (line.empty()) {
				line.clear();
			}
			found=line.find_last_not_of(whitespaces);
			if (found!=string::npos)
				line.erase(found+1);
			else {
				line.clear(); 
			}
				
			position = line.find_last_of(DOT);
			//cout << "DOT found in position " << position << endl; // Debug check
			if ((position != string::npos && line.substr(position+1)==SPACE && line.substr(position+2)==NUL) ||
				(position != string::npos && line.substr(position+1)==SPACE && line.substr(position+2)==CR)) {
				
				paragraph++;
				
				IsParagraph = true;
			} 
				
			if (IsParagraph){
				// Store the Paragraph in new lines "LF"
				result += line +LF;
				place = result.find(searchWord); // HERE THE PROBLEM!!!...
				if (place != string::npos) {
					paragraphMATCH++;
					//Store only the paragraphs that match the search words 
					resultMATCH += result; 
				}
				// Clean result to avoid duplicates
				result.clear();
			} else {
				// Concatenate lines that have no end of paragraph
				result += line +SPACE;
				result.erase( result.length() -1 );
			}
		numline++;
	}
		cout << " ===================================== " << endl;
		cout << "+Total number of lines analyzed: " << numline << endl;
		cout << "+Total number of paragraphs: " << paragraph << endl;
		cout << "+Paragraphs that contain the search: " << paragraphMATCH << endl;
		cout << " ===================================== " << endl;
	return resultMATCH;
}

The function finds the paragraphs, but it only uses the first word entered, "John".

For example:
John - found in 122 paragraphs
John Doe - found in 122 paragraphs
John Doe Doll - found in 122 paragraphs
Doe Doll - found in 233 paragraphs

But the words "John Doe Doll" actually appear together in 80 paragraphs. :/

I'd appreciate any help. Thanks in advance!

Mac
Try using strstr instead of find and the others...


Hi sandeepdas, thanks, but I have the same problem with strstr:

if (strstr(result.c_str(), searchWord.c_str())) {
	paragraphMATCH++;
	//Store only the paragraphs that match the search words
	resultMATCH += result; 
}
			


"John" - found in 122 paragraphs
"John Doe" - found in 122 paragraphs
"John Doe Doll" - found in 122 paragraphs
"Doe Doll" - found in 233 paragraphs

It doesn't search for word pairs/triples. Any suggestions?

Cheers!!...

Mac

> The function finds the paragraphs, but it only uses the first word entered, "John".

How have you implemented wordsToFind()?

Like this: std::string str ; std::cin >> str ;

Or like this: std::string str ; std::getline( std::cin >> std::ws, str ) ;?
Hi JLBorges. The function wordsToFind() is:

string wordsToFind() {
	string search;
	cout<< "Words to find: ";
	cin >> search;
	return search;
}

Change cin >> search; to getline( cin >> ws, search );

Your program will then make an exact search (char by char including spaces) for a phrase.
For example John Doe Doll would match John Doe Doll, but not John.Doe.Doll or John Doe, Doll.
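To illustrate the difference, here is a minimal sketch of the two ways of reading the input. The helper names readFirstWord and readPhrase are made up for this example, and they take any std::istream so they can be tried with a std::istringstream instead of std::cin.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token,
// which is what `cin >> search;` does.
std::string readFirstWord(std::istream& in) {
    std::string word;
    in >> word;  // extraction stops at the first whitespace after the token
    return word;
}

// Hypothetical helper: reads the rest of the line as one phrase,
// which is what `getline( cin >> ws, search );` does.
std::string readPhrase(std::istream& in) {
    std::string phrase;
    std::getline(in >> std::ws, phrase);  // std::ws skips leading whitespace first
    return phrase;
}

// Usage, with a string stream standing in for the console:
//   std::istringstream a("John Doe Doll\n");
//   readFirstWord(a);   // yields "John"
//   std::istringstream b("John Doe Doll\n");
//   readPhrase(b);      // yields "John Doe Doll"
```

With operator>> the rest of the line ("Doe Doll") stays unread in the stream, which is why every search behaved as if only "John" had been typed.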
Nice! Thanks JLBorges! This solved the problem, and it works with both:
if (strstr(result.c_str(), searchWord.c_str())) {
	paragraphMATCH++;
	//Store only the paragraphs that match the search words
	resultMATCH += result; 
}

place = result.find(searchWord); 
if (place != string::npos) {
	paragraphMATCH++;
	//Store only the paragraphs that match the search words 
	resultMATCH += result; 
}


Which would you say is safer?
> Which would you say is safer?

Either is safe. Since you are programming in C++, with std::string, stick to std::string::find. Though you do not need either its flexibility or its robustness in this particular case.
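As a rough sketch of the two checks side by side (the function names containsPhrase and containsPhraseC are invented for this example, not part of the original program):

```cpp
#include <cstring>
#include <string>

// Case-sensitive, character-by-character phrase search using std::string::find.
bool containsPhrase(const std::string& paragraph, const std::string& phrase) {
    return paragraph.find(phrase) != std::string::npos;
}

// The same check through the C library function strstr.
bool containsPhraseC(const std::string& paragraph, const std::string& phrase) {
    return std::strstr(paragraph.c_str(), phrase.c_str()) != nullptr;
}

// Usage:
//   containsPhrase("Then John Doe Doll left.", "John Doe Doll");  // true
//   containsPhrase("Then john doe doll left.", "John Doe Doll");  // false, case differs
//   containsPhrase("John Doe, Doll", "John Doe Doll");            // false, comma breaks the phrase
```

For ordinary text the two give the same answer. One difference worth knowing: strstr works on C strings, so it stops at the first '\0', while std::string::find looks at the whole std::string.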
Thanks JLBorges! A really good and fast solution.

Cheers!

Mac