find, replace, and count words and characters

So my project includes 4 parts: find the lines a word is on, replace that word with another, count the characters in the file, and count the words in the file.

1. I finished the first part (finding words); it works perfectly.
2. The second part, I feel like it is replacing the words in the string, but not actually in the file itself.
3. The character count always shows up as 0; it's as if the loop doesn't even run.
4. I'm not even sure where to start with counting the words.

#include <cstdlib> 
#include <iostream>
#include <string>
#include <fstream>
#include <cstring>
#include <conio.h>



using namespace std;

int main()
{   
    
    ifstream in_stream;           //declaring the file input
    string filein, search, str, replace; //declaring strings
    int lines = 0, characters = 0, words = 0; //declaring integers
    char ch;
      
    cout << "Enter the name of the file\n";   //Tells user to input a file name
    cin >> filein;                            //User inputs incoming file name
    in_stream.open (filein.c_str(), ios::in | ios::binary); //Opens the file
    
    
    //FIND WORDS
    cout << "Enter word to search (and replace): " <<endl; //Tells user to input word
    cin >> search; //User inputs word they want to search
    
    while (!in_stream.eof())  //Loop to find the lines where words are
    {getline(in_stream, str); //inputs information of file into string
     lines++;                 //Line counter
     if ((str.find(search, 0)) != string::npos) //If the word is found
        {
        cout << "found at line " << lines << endl; //Print the line 
        }
    }
    
    //REPLACE WORDS
    cout << "\nEnter the word you want to replace it with: " <<endl;
    cin >> replace;
    
   while (!in_stream.eof())  //Loop to find the lines where words are
    {getline(in_stream, str); //inputs information of file into string
     if ((str.find(search, 0)) != string::npos) //If the word is found
        {
        str.replace(lines, search.size(), replace);
        lines++; 
        }
    }
    
    //COUNT CHARACTERS

     while (!in_stream.eof())      
    {in_stream.get(ch);	        // get each character from the file
     cout << ch;
     characters ++;			// count the characters in the file
    
    }
     cout << "The number of characters is: " << characters << endl;
       
    //COUNT WORDS
    
     in_stream.close ();                //close the incoming file


    system("PAUSE");                     
    return EXIT_SUCCESS;    
}

closed account (zwA4jE8b)
As a human, how would you count how many words there are?

What do you look for to tell that one word has ended and another is beginning?

It is not counting the characters because you are already at the end of the file. Your first while loop
while (!in_stream.eof())  //Loop to find the lines where words are
    {getline(in_stream, str);
reads to the end, so your second one exits immediately.

Use in_stream.seekg (0, ios::beg); to get back to the beginning.

As the program reads the file, the position pointer moves. Once the pointer is at the end of the file it stays there until you either close and reopen the file or tell the pointer where to go, i.e. with seekg.

So every time you read to the end of the file, you need to reset the pointer.


Also I don't know if you are supposed to count spaces and newline characters but you might want to account for those.
Makes sense, but I'm still having issues. Where exactly do I put the seekg?
I've put the .seekg code right before the //REPLACE WORDS and //COUNT CHARACTERS sections (outside the loops).

closed account (zwA4jE8b)
#include <cstdlib> 
#include <iostream>
#include <string>
#include <fstream>
#include <cstring>
#include <conio.h>



using namespace std;

int main()
{   
    
    ifstream in_stream;           //declaring the file input
    string filein, search, str, replace; //declaring strings
    int lines = 0, characters = 0, words = 0; //declaring integers
    char ch;
      
    cout << "Enter the name of the file\n";   //Tells user to input a file name
    cin >> filein;                            //User inputs incoming file name
    in_stream.open (filein.c_str(), ios::in | ios::binary); //Opens the file
    
    
    //FIND WORDS
    cout << "Enter word to search (and replace): " <<endl;
    cin >> search; //User inputs word they want to search
    
    while (!in_stream.eof())  
    {getline(in_stream, str); 
     lines++;                
     if ((str.find(search, 0)) != string::npos) 
        {
        cout << "found at line " << lines << endl;
        }
    }
    
    in_stream.seekg (0, ios::beg);  // the seek goes here to reset the pointer.....

    //REPLACE WORDS
    cout << "\nEnter the word you want to replace it with: " <<endl;
    cin >> replace;
    
   while (!in_stream.eof())  
    {getline(in_stream, str);
     if ((str.find(search, 0)) != string::npos) 
        {
        str.replace(lines, search.size(), replace);
        lines++; 
        }
    }
    
     in_stream.seekg (0, ios::beg);  // the seek goes here to reset the pointer.....
    //COUNT CHARACTERS

     while (!in_stream.eof())      
    {in_stream.get(ch);	  
     cout << ch;
     characters ++;		
    
    }
     cout << "The number of characters is: " << characters << endl;
       
    //COUNT WORDS
    
     in_stream.close ();               


    system("PAUSE");                     
    return EXIT_SUCCESS;    
}
closed account (zwA4jE8b)
Also, you are correct about not replacing the words in the file.
You never reset 'lines' before you begin your second loop.

To replace a word in a file you can use the seekg, tellg and seekp, tellp functions to manipulate the file.
Right, that's where I have them, but it still gives me the same result.
closed account (zwA4jE8b)
What is the issue? Is it counting the characters?
No, I've placed the seekg's but it still gives a 0 count.
If I move the character portion to the top, it works, but then nothing else works. So you're right about the problem, but the seekg doesn't seem to solve it.
closed account (zwA4jE8b)
I guess getline messes with it for some reason I do not yet know.

Clear the stream right before setting the pointer, then it will work.

Something about getline must set a flag.

Can anyone explain that?

Thanks,
Mike

EDIT: It's the eofbit, isn't it?
I tried closing and reopening the in_stream after the while loops, but it still gives me the same issues.
closed account (zwA4jE8b)
Like I said, clear the stream/flags right before you use seek.
I opened and closed the file before each loop, putting the seek right before closing and right after opening the file. It still only performs the first task...
You are going to have problems with buffering using your current design. Consider a file composed of the following characters:
 H  e  l  l  o     w  o  r  l  d  ! \n  S  a  l  u  t  a  t  i  o  n  s  ! \n

And you wish to replace "world" with "Elizabeth". Writing the replaced line back to file you get:
 H  e  l  l  o     E  l  i  z  a  b  e  t  h \n  u  t  a  t  i  o  n  s  ! \n

and the next line you read would be "th".

It would be better to read the entire file into memory, make your changes there, then overwrite the original file with the new contents. Use a std::deque.
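
As a rough sketch of the "make your changes there" step, assuming the file's lines are already in a deque of strings: the helper name replace_all is made up for illustration. It matches the search text anywhere in a line, so it also replaces it inside longer words; matching whole words only would need an extra check on the characters either side.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper: replace every occurrence of `from` with `to` in each
// line held in memory. Returns the number of replacements made.
std::size_t replace_all(std::deque<std::string>& text,
                        const std::string& from, const std::string& to)
{
    std::size_t count = 0;
    if (from.empty())
        return count;               // an empty search string would match everywhere
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        std::string& line = text[i];
        std::string::size_type pos = line.find(from);
        while (pos != std::string::npos)
        {
            line.replace(pos, from.size(), to);
            ++count;
            // continue after the inserted text so it is never rescanned
            pos = line.find(from, pos + to.size());
        }
    }
    return count;
}
```

Because each line is its own string, a replacement that is longer or shorter than the original just resizes that line; nothing spills into the next one the way it does when writing in place.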

Also, there are a few issues you need to be aware of:

It is good that you are using ios::binary so that your seeks work properly, but that has the unfortunate effect of removing cross-platform text file newline handling. Reading the file entirely using the default text-mode fixes this problem also.

On line 21 (cin >> filein;) you are reading a std::string using the extraction operator. That's a no-no. Use the std::getline() function. (Remember, filenames may have spaces in them!)

Watch that your commentary is not misleading and that it does not repeat what the code already says. Comments should explain, not issue a play-by-play.

Make sure to check that things like opening the file worked properly. Also, never loop on EOF when using C++ iostreams. It will cause you problems.

Please don't use system("PAUSE");

I personally like the EXIT_SUCCESS and EXIT_FAILURE macros, but they are not defined for C++ unless you #include the C standard library... Every C++ program returns zero on success and nonzero on failure, and people will expect seeing that, so it is OK to just use 0 and 1.

#include <deque>
#include <fstream>
#include <iostream>
#include <limits>
#include <string>

using namespace std;

int main()
{
    deque <string> text;      // this is the text of the file we will modify
    fstream        f;         // this is the file stream we'll use
    string         filename;
    string         s;         // various uses
    string         s_to_find;
    string         s_replacement;

    cout << "Enter the name of the file: ";
    getline( cin, filename );

    // Read the file into memory
    f.open (filename.c_str(), ios::in);
    if (!f.is_open())
    {
        cout << "Hey! I could not open that file!\n";
        return 1;
    }
    while (getline (f, s))
        text.push_back (s);
    f.close();

    // 'Word find' stuff
    cout << "Enter the word to search for (to be replaced): ";
    getline (cin, s_to_find);

    ...

    // 'Word find and replace' stuff
    cout << "Enter the word you want to replace it with: ";
    getline (cin, s_replacement);

    ...

    // Other stuff

    ...

    // Update the file on disk
    f.clear();
    f.open (filename.c_str(), ios::out | ios::trunc);
    for (deque <string> ::const_iterator line = text.begin(); line != text.end(); ++line)
      f << *line << "\n";
    f.close();

    // Keep the console window open long enough to see the output...
    cout << "Press ENTER to quit.";
    cin.ignore (numeric_limits <streamsize> ::max(), '\n');
    return 0;
}

Hope this helps.

[edit] Fixed error.
For your second issue, I suggest that you read every line of the file and save it into an array. You can then change whichever characters you want in that array, and afterwards write the array's contents back to the file itself. It costs more memory and time, but it gets the job done to start with.
Er, isn't that what I suggested? (Except I use a deque<string> instead of an array.)
Duoas, you give a lot of information, and some of it is helpful, but my knowledge of C++ is pre-basic at best. I like the idea of dumping the info into memory, but I have no knowledge of vectors. How would I do this with arrays?
Also, since eof is a problem with looping, what do I put in the condition of my while loops?
Things to take into consideration for your project...

The logic of the original problem: find and replace a word. Is every occurrence replaced with the same word, or is each occurrence replaced with a different word? The answer changes how your program is designed.

#include <cstdlib>
#include <deque>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    deque <string> text;      // this is the text of the file we will modify
    ifstream in_stream;
    string filein, search, str, replace;
    int lines = 0, characters = 0, words = 0;

    cout << "Enter the name of the file\n";
    getline(cin, filein);                 // filenames may contain spaces
    in_stream.open (filein.c_str());
    if (!in_stream.is_open())
    {
        cout << "Could not open " << filein << "\n";
        return EXIT_FAILURE;
    }

    //FIND WORDS
    cout << "Enter word to search (and replace): " << endl;
    cin >> search;

    //REPLACE WORDS
    cout << "\nEnter the word you want to replace it with: " << endl;
    cin >> replace;

    while (getline(in_stream, str))       // stops cleanly at the end of the file
    {
          lines++;
          // if I need to print the original line, it goes here...
          string::size_type nIndex = str.find(search);
          while (!search.empty() && nIndex != string::npos)
          {
                 cout << "found in Line " << lines << " at Index " << nIndex << endl;
                 // nIndex is the start position of the word we want to replace:
                 // cut the old piece out of the string and insert the new piece
                 str.replace(nIndex, search.size(), replace);
                 // resume the search after the new piece so it is never rescanned
                 nIndex = str.find(search, nIndex + replace.size());
          }
          // we don't care if we changed the line or not at this point....
          // if I just need to print the modified line, display it here....
          text.push_back (str);
          // count the characters and words (after the changes were made)
          bool in_word = false;
          for (string::iterator pos = str.begin(); pos != str.end(); ++pos)
          {
                characters++;
                if (*pos == ' ')          // a space ends the current word
                     in_word = false;
                else if (!in_word)        // first character of a new word
                {
                     in_word = true;
                     words++;
                }
          }
    }
    cout << "The number of characters is: " << characters << endl;
    cout << "The number of words is: " << words << endl;
    // if I need to write a new file .. I have the stuff buffered, dump the buffer.
    system("PAUSE");
    return EXIT_SUCCESS;
}
To replace every word in the file with the same word, the logic looks different again. I wish you would clear up what you're thinking. It would look similar to what I have up there: I just need to track the start position and end position of every word, watching for a space as in the loop above, and update the indexes.

The code above finds every occurrence of a given word and replaces it with another given word. It also counts the words and characters in the file.

If I were doing every word, I would need to find where the words break, i.e. at any whitespace. That is a search loop like the one I have above, but instead of finding a keyword I am looking for whitespace. I can maintain two integers for doing the math of the start and end positions.

The only question is about counting things: am I counting the original text or the finished text? In my example I count after I make the changes.
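
A rough sketch of that whitespace search, using two positions (start and end) to bracket each word. The function name count_words is made up for illustration, and a "word" is assumed to be any run of non-whitespace characters, so punctuation stuck to a word counts as part of it.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: count the words in one line. `start` and `end` are the
// two integers that track where the current word begins and ends.
int count_words(const std::string& line)
{
    int words = 0;
    std::string::size_type start = 0;
    while (start < line.size())
    {
        // skip whitespace to find the start of the next word
        while (start < line.size() &&
               std::isspace(static_cast<unsigned char>(line[start])))
            ++start;
        if (start == line.size())
            break;                  // only whitespace was left
        // advance end to one past the last character of the word
        std::string::size_type end = start;
        while (end < line.size() &&
               !std::isspace(static_cast<unsigned char>(line[end])))
            ++end;
        ++words;                    // line.substr(start, end - start) is the word
        start = end;
    }
    return words;
}
```

Summing count_words(str) over every line gives the word count for the file, and replacing every word would use the same start and end positions with str.replace(start, end - start, replacement).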