I need guidance for text extraction (new to C++)

Dear All,

I really need your expert advice on the following assignment.

I intend to write a C++ program that reads a text file containing a lot of text.

I need to find a keyword in this text file and output the word that is next to it.

For example, the whole text string is (...the game is pokemon etc.).

Which function in C++ can I use to search for the words "the game", and which function can I use to extract the word next to it, "pokemon", and cout it to my console?

So far I have looked at the strtok function, but I don't think it works for this.

As I am really new to C++, I will need your advice to help me focus on the relevant functions for this extraction.

Thank you very much in advance to anyone who is able to advise me on this. =)
You can read each word in the file and compare it to what you are looking for. If it matches, read the next words and display them.
Hi Bazzy,

Can I do this with iostream alone? Do I need any other functions?
You need <fstream> to read files, <string> to use strings and <iostream> to display the output.
But which function can I use for the extraction of keywords?
The >> operator
eg:
string word_to_find = "foo";
ifstream file ( "textfile.txt" );
string temp;

while ( file >> temp ) // read a word from the file; the loop ends when a read fails
{
    if ( temp == word_to_find ) // check the word
    {
        if ( file >> temp ) // read the next word and check that it was read fine
            cout << "the word following " << word_to_find << " is " << temp << '\n'; // display stuff
        // if you want to continue the loop do nothing, if you want to stop it add break;
    }
}
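For experimenting, the same idea can be wrapped in a small function. This is only a sketch: the name next_word_after is made up for illustration, and it takes any input stream so it can be tried on an in-memory string as well as on a file.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper (not a standard function): finds the first occurrence
// of `keyword` in `in` and stores the word that follows it in `next`.
// Returns false if the keyword is not found or is the last word.
bool next_word_after(std::istream& in, const std::string& keyword, std::string& next)
{
    std::string temp;
    while (in >> temp)                          // stops as soon as a read fails
    {
        if (temp == keyword)
            return static_cast<bool>(in >> next);   // true only if a next word was read
    }
    return false;
}
```

With a file, open it as before (ifstream file("textfile.txt");) and call next_word_after(file, "foo", word). Note that >> splits on whitespace only, so a word followed by punctuation such as "foo," will not compare equal to "foo".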
Hmmm... I will try this out! I have been trying to figure this out for weeks. Thanks so much, Bazzy! With people like you around, my nightmares with C++ will be lessened! If only my lecturer were as supportive as you.

Thanks man! God bless you.
IT WORKS BAZZY! Thanks, thanks! I will try more examples using this format! THANKS! I deeply appreciate it!

I will try more of this myself. But I will really need advice and guidance from you if I get stuck on something. Thanks!
Bazzy, could you not feed the file input stream to a string or something, and then use, for example:

if (yourString.find(wordToFind) != string::npos) {
// code to execute if word is found (read next word)
}


Your way is probably better, but you could use find() I think.
That would be hard for huge files, which would need a lot of memory to store, and reading the next word would be a bit trickier.
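To show what the "trickier" part looks like, here is a rough sketch of the find() approach. It assumes the whole text is already in one string (for a file that means reading it all into memory first, which is where the memory cost comes from). The function name word_after_phrase is made up for this example.

```cpp
#include <string>

// Hypothetical helper: returns the word that follows the first occurrence
// of `phrase` in `text`, or an empty string if there is none.
std::string word_after_phrase(const std::string& text, const std::string& phrase)
{
    const std::string spaces = " \t\r\n";

    std::string::size_type pos = text.find(phrase);
    if (pos == std::string::npos)
        return "";                               // phrase not found

    // skip the phrase itself, then any whitespace after it
    std::string::size_type start = text.find_first_not_of(spaces, pos + phrase.size());
    if (start == std::string::npos)
        return "";                               // nothing follows the phrase

    // the word ends at the next whitespace (npos means "up to the end of the text")
    std::string::size_type end = text.find_first_of(spaces, start);
    return text.substr(start, end - start);
}
```

One thing to keep in mind: find() matches characters, not whole words, so searching for "the game" would also match inside "bathe games". The phrase must also be spaced in the text exactly as it is written in the search string.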
Hi Bazzy,

I have played around with and tested your example and also read up on string functions, but I am still figuring out how to find more than one keyword together and display more than one word using the example that you gave me. I tried "they" + "are", but it gives me an error.

Can you advise more on the string functions for this? I have tried reading C++ for Dummies but am still having problems understanding strings.



string word_to_find = "they" + "are";


Thank you.
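A note on the error itself: "they" and "are" are string literals (arrays of const char), not std::string objects, and + is not defined for two of them. At least one side has to be a std::string. The sketch below shows a version that compiles; make_phrase is just an illustrative name.

```cpp
#include <string>

// Hypothetical example function: builds "they are" by concatenation.
std::string make_phrase()
{
    std::string first = "they";
    std::string phrase = first + " " + "are";   // OK: std::string + literal + literal
    // std::string bad = "they" + "are";        // error: cannot add two string literals
    return phrase;
}
```

Even when it compiles, the result is one string containing a space, and a loop that reads with >> gets one whitespace-separated word at a time, so comparing each word against "they are" would never match. The words have to be checked one after another.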
I suggest using a list ( http://www.cplusplus.com/reference/stl/list/ )

eg:

// #include <list>

// ... same declarations as before ...

list<string> words_to_find;
words_to_find.push_back("they"); // add a word to find
words_to_find.push_back("are");

list<string>::iterator i = words_to_find.begin();

while ( file >> temp )
{
    if ( i == words_to_find.end() ) // all the words were right, so temp is the next word
    {
        cout << temp << '\n'; // show next word
        i = words_to_find.begin(); // restart
    }
    if ( *i == temp )
        ++i; // this word matches, move on to the next word in the list
    else
    {
        i = words_to_find.begin(); // restart checking if a word doesn't match the sequence stored in the list
        if ( *i == temp ) ++i; // the word that broke the sequence may itself start a new one
    }
}
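The same loop can be packed into a function that can be tried on a string instead of a file. This is a sketch under a few assumptions: the name words_after_sequence is invented here, and it returns the found words in a vector instead of printing them.

```cpp
#include <istream>
#include <list>
#include <sstream>   // std::istringstream, for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: collects every word that directly follows the
// word sequence `seq` in the stream `in`.
std::vector<std::string> words_after_sequence(std::istream& in,
                                              const std::list<std::string>& seq)
{
    std::vector<std::string> found;
    if (seq.empty())
        return found;                       // nothing to look for

    std::list<std::string>::const_iterator i = seq.begin();
    std::string temp;
    while (in >> temp)
    {
        if (i == seq.end())                 // whole sequence matched: temp is the next word
        {
            found.push_back(temp);
            i = seq.begin();
        }
        if (temp == *i)
            ++i;                            // matched so far, expect the next word
        else
        {
            i = seq.begin();                // mismatch: start over
            if (temp == *i)
                ++i;                        // ...but temp may itself be the first word
        }
    }
    return found;
}
```

This simple restart is fine for a sequence like "they are", but it is not a general pattern matcher: for a sequence that begins with a repeated word (for example "a a b" searched in "a a a b") it can miss a match.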
Wow Bazzy, this is something very new to me. I am not that familiar with using a list. Nevertheless, I will read up on the list of strings and test it once again in my code.

Thank you once more, Bazzy. I will try to digest this and get back to you if I can't understand the code that you have shown.


I'm not so sure; if he hasn't learnt to use pointers yet in class, his teacher might not want him to use them. Then again, if he shows he has done some learning in his spare time... I don't know...

Flubber, the best way to learn is to do. Rewrite all of what Bazzy wrote. Then, if you can rewrite all the comments correctly without copying his, you know you've learnt how the program works.

Just make sure it doesn't look exactly like Bazzy's code, but does the same thing. And that you know why. Otherwise it might just be a coincidence.
Hi Chrisname,

Actually, in my class, the requirement is to know C++. But I don't know why I am already starting this module with C++. Therefore I have no choice but to digest all the C++ basics and try to implement the solution in less than 1 month.

Thanks to Bazzy, who helped me focus on the important functions for my assignment, I am able to read, understand and implement it with code that I understand myself.

I have 1 week left until I have to submit this. I am glad that someone like Bazzy is able to guide and advise me, unlike my lecturer, who seems to have a "don't care" attitude.

In the end I want to understand what I am implementing so that I will not be lost in my future C++ modules.

Thanks for all your advice.
Hi Bazzy,

I have also read about the getline function. Please advise whether this function fits my assignment so that I can focus on using it.

I am trying to output more than one word (a sentence) with cout after finding the keywords.

Thanks.

getline reads all the characters until a delimiting character is found (by default '\n'). If you want to show all the words up to the end of the line/sentence, you can use it. If you want to display the next 2/3 words, just read them using >> and then output them with cout.
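As a rough illustration of combining the two, the sketch below finds the keyword with >> and then uses getline to take whatever is left on that line. The function name rest_of_line_after is made up for this example.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, for trying this without a file
#include <string>

// Hypothetical helper: finds the first occurrence of `keyword` and returns
// the remainder of the line it appears on (without leading blanks).
// Returns an empty string if the keyword is not found or ends its line.
std::string rest_of_line_after(std::istream& in, const std::string& keyword)
{
    std::string temp;
    while (in >> temp)
    {
        if (temp == keyword)
        {
            std::string rest;
            std::getline(in, rest);             // everything up to the next '\n'
            std::string::size_type p = rest.find_first_not_of(" \t");
            return p == std::string::npos ? "" : rest.substr(p);
        }
    }
    return "";
}
```

After >> has read the keyword, the stream is positioned just behind it, so getline picks up from there and not from the start of the line.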
Thanks Bazzy. I will try the getline function and see if I can make it work.
Hi Bazzy,

I have managed to get all the text in the file using the getline function, as shown below. But I am still figuring out how to use either the >> operator from your first example or the list to display more than one word with cout.

I am now concentrating on using 1 keyword for my extraction.

Below is my getline code.

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

// Skip lines that start with the character mark
// (at most MAX_CHARS characters are skipped per call to ignore).
int skip_comments(ifstream& file, char mark)
{
    const int MAX_CHARS = 100;

    while (file.peek() == mark) { file.ignore(MAX_CHARS, '\n'); }

    return 0;
}

int main()
{
    ifstream fin;
    fin.open("test.txt", ios::in);

    if (!fin) // See if the file was opened successfully.
    {
        cout << "Can't open input file. Aborting!\n";
        return 1;
    }

    string sometext;

    // Read the file one line at a time and print each line.
    getline(fin, sometext);
    while (!fin.fail())
    {
        cout << sometext << '\n';
        getline(fin, sometext);
    }
    if (fin.fail() && !fin.eof()) // the loop stopped for a reason other than end of file
    {
        cout << "Error while reading file. Aborting!\n";
        fin.close();
        return 2;
    }

    return 0;
}
Please use [code][/code] tags when posting code

There are many ways of reading more words with >>
// 1
string word1, word2, word3;
file >> word1>> word2 >> word3;
// 2
string word[3];
file >> word[0] >> word[1] >> word[2];
// 3
int number_of_words = 3;
vector<string> words ( number_of_words );
for ( int i = 0; i < number_of_words; i++ )
    file >> words[i];
//... 