Program that detects repeated words

I don't understand in detail how this little program detects adjacent repeated words in a sequence of words.
So my question is: how does the program do it?
For example, with the sequence of words "She she laughed He He He because what he did did not look very very good good"?

Why are "She" or "laughed" not detected as repeated words, since they are assigned to the variable previous?

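#include <iostream>
#include <string>

int main()
{
	std::string previous = " ";	// can never equal a word read by >>
	std::string current;
	while (std::cin >> current) {	// read one whitespace-separated word at a time
		if (previous == current)
			std::cout << "repeated word:" << current << '\n';
		previous = current;	// runs on every iteration, not only on a match
	}
}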
previous is always the previous word. When a new word (current) is obtained, it is compared to previous. If it is the same then it displays the message. If it isn't the same then no message is displayed. In both cases previous is then set to current and the while loop repeats until EOF (Ctrl+Z on Windows, Ctrl+D on Linux and macOS).
"She" and "she" are two different words because capitalization matters. To us hoo-muns 'S' and 's' are the same letter. As encoded for the computer via ASCII codes (or whatever encoding is used), they are not the same: they are two different byte values.

Does 83 == 115?

http://www.cplusplus.com/doc/ascii/
@ seeplus

I still don't understand how the program checks for repeated words in a sequence of words. I mean at each step.

I see it like this :

previous = " ";
user input in the variable current = "the cat cat jumped like like this";

previous = current

previous = "the cat cat jumped like like this"
current = "the cat cat jumped like like this"

Then I find what the code is doing rather confusing.

@ George P

I understood that.
do you know how >> handles white space (spaces, end of line, etc)?
for the cat cat jumped like like this
it reads
prev " "
current the
...
prev the
current cat
...
prev cat
current cat <--- match
...
@ jonnin

It reads whitespace-separated words, right? Meaning it only reads a single word at a time?
That would explain why it only checks one word at a time? I'm not quite sure here.
right, one word at a time.
add a print for current, previous each loop iteration. see what is going on with prints...
@ jonnin

I added a print and it showed the same as you :

(user input) the cat cat jumped
the
thecat
catcat
repeated word:cat
catjumped

Is there a more meaningful reason why it reads one word at a time into a string type?

I do not understand the question. You can read a line at a time, but for this task you would then just have to go back over what you read and split it up on the spaces. The problem being solved involves chunks of letters (words) split by whitespace. The >> operator helps you get there by splitting up the input exactly how you need it (and that is why it does it that way; it's a common way to read data and is provided to you for ease of use!). You can read a letter at a time too, with similar annoyances in getting it back into a form you can use.

>> is smart enough to read numbers (int, double) directly as well (often not useful due to needing to validate the input, but fine for machine-to-machine talk like reading the text spew off a GPS unit).

A string can hold reams of text with any sort of whitespace; you can most likely put the entire Oxford English Dictionary into a single string, if you have the memory for it. The limitation is not in strings; the one-word-at-a-time behavior is tied to the >> operator.
#include <iostream>
#include <string>

int main()
{
   std::string previous { "N/A" };
   std::string current  { "N/A" };

   std::cout << "Enter a string of words to compare (\"#\" to end):\n";

   while (std::cin >> current)
   {
      if (current == "#") return 0;

      std::cout << "\n\t" << "previous: " << previous << '\n';

      std::cout << '\t' << "current: " << current << '\n';

      if (previous == current)
         std::cout << "\nrepeated word:" << current << '\n';
      previous = current;
   }
}
Enter a string of words to compare ("#" to end):
 She she laughed He He He because what he did did not look very very good good #

        previous: N/A
        current: She

        previous: She
        current: she

        previous: she
        current: laughed

        previous: laughed
        current: He

        previous: He
        current: He

repeated word:He

        previous: He
        current: He

repeated word:He

        previous: He
        current: because

        previous: because
        current: what

        previous: what
        current: he

        previous: he
        current: did

        previous: did
        current: did

repeated word:did

        previous: did
        current: not

        previous: not
        current: look

        previous: look
        current: very

        previous: very
        current: very

repeated word:very

        previous: very
        current: good

        previous: good
        current: good

repeated word:good
@ jonnin

This is what I was asking for and needed. I was just confused at how the program was checking one word at a time. I learned new things.

Thank you for your help and time.

@ George P

Your code was helpful, I learned new things by trying it as well.

Thank you for your help and time.
It never hurts to display a prompt so the user has an idea what input is required, even if the person who wrote the code is the user. Seeing a blank console window with just a blinking prompt is ANNOYING!

I despise never ending loops that require some keyboard kung-fu gyrations like CTRL^Z to terminate. Looking for a character (or combination of characters) to signal the end of input is a lot less messy.
I despise never ending loops that require some keyboard kung-fu gyrations like CTRL^Z to terminate.
Programs like that are usually meant for data to be piped in from another program. e.g. grep
Programs like that are usually meant for data to be piped in from another program. e.g. grep

That I can understand and agree with.

Using that in an interactive user interface is bollocks, was and is my point.

/rant :Þ
@ George P

I'm just starting to learn C++; I will make use of prompts more often.
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::string previous;
    std::string current;

    std::string line;

    std::cout << "Enter a line of text: ";
    std::getline(std::cin, line);
    std::istringstream iss(line);   // split the line into words with >>

    while (iss >> current)
    {
        if (current.length() == previous.length())
        {
            // IGNORE CASE: compare character by character in lower case
            bool match{true};
            for (std::size_t i = 0; match && i < current.length(); i++)
            {
                // cast to unsigned char: std::tolower is undefined for negative char values
                if (std::tolower(static_cast<unsigned char>(current[i])) !=
                    std::tolower(static_cast<unsigned char>(previous[i])))
                    match = false;
            }

            // OUTPUT RESULT
            if (match)
            {
                std::cout
                << "(previous) " << previous << " <- repeated word -> "
                << current << " (current)\n";
            }
        }
        previous = current;
    }
}


Enter a line of text: She she laughed He He He because what he did did not look very very good good
(previous) She <- repeated word -> she (current)
(previous) He <- repeated word -> He (current)
(previous) He <- repeated word -> He (current)
(previous) did <- repeated word -> did (current)
(previous) very <- repeated word -> very (current)
(previous) good <- repeated word -> good (current)
Program ended with exit code: 0
#include <iostream>
#include <string>
#include <cctype>

int main() {
    size_t cnt {};

    std::cout << "Enter a string of words to compare (\"#\" to end):\n";

    for (std::string current, previous; (std::cin >> current) && current != "#"; previous = current) {
        if (previous.size() == current.size()) {
            bool same {true};

            for (size_t i {}; same && i < previous.size(); ++i)
                same = std::tolower(static_cast<unsigned char>(previous[i])) == std::tolower(static_cast<unsigned char>(current[i]));

            if (same) {
                if (cnt++ == 0)
                    std::cout << "Repeated word:" << current << '\n';

                continue;
            }
        }
        cnt = 0;
    }
}



Enter a string of words to compare ("#" to end):
She she laughed He He He because what he did did not look very very good good #
Repeated word:she
Repeated word:He
Repeated word:did
Repeated word:very
Repeated word:good


This displays each repeated word only once, no matter how many times in a row it is repeated.