Comparing all characters within two strings of different length

Pages: 12
let's say I have two strings:

str1: aaabcabaaaba
str2: bca

I would like to slide str2 over str1 to figure out if str2 matches str1.
So, here for example str2 would be the same at index 3,4,5 of str1.
But how would I do this? When I try to do a for loop, I only manage to compare the three first characters of string 1 and the three last, not the middle part.

Write a function that can check if str2 is found in str1 at a particular index that is passed as argument. When that is done you can just loop through all the indices in str2 and call the function with each index to find all the positions where str2 is found in str1.
Last edited on
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <iostream>
#include <string>
using namespace std;

void slider( const string &str1, const string &str2 )
{
   int n = str2.size();
   for ( int i = 0; i + n <= str1.size(); i++ )
   {
      cout << i << ": " << boolalpha << ( str1.substr( i, n ) == str2 ) << '\n';
   }
}

int main()
{
   slider( "aaabcabaaaba", "bca" );
}
Last edited on
Just use a simple .find(). Consider:

1
2
3
4
5
6
7
8
9
10
11
12
13
#include <string>
#include <iostream>
#include <iomanip>

int main() {
	const std::string str1 {"aaabcabaaaba"};
	const std::string str2 {"bca"};

	if (auto fnd {str1.find(str2)}; fnd != std::string::npos)
		std::cout << "Found at position " << fnd << '\n';
	else
		std::cout << "Not found\n";
}



Found at position 3

There is already std::string::find(): http://www.cplusplus.com/reference/string/string/find/
Then repeat that if you need all matches:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <iostream>
#include <string>
using std::string;

void slider( const string &str1, const string &str2 )
{
    auto pos = str1.find( str2 );
    while ( pos != string::npos ) {
        std::cout << pos << '\n';
        pos = str1.find( str2, pos+1 );
    }
}

int main()
{
   slider( "aaabcabaaabca", "bca" );
}
Last edited on
If the location of all the occurrences is needed, then:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
#include <iostream>
#include <vector>
#include <string_view>

auto slider(std::string_view str1, std::string_view str2) {
	std::vector<size_t> fnd;

	for (size_t pos {}; (pos = str1.find(str2, pos)) != std::string_view::npos; fnd.push_back(pos++));

	return fnd;
}

int main() {
	const std::string_view str1 {"aaabcabaaabcazzzbcadd"};
	const std::string_view str2 {"bca"};

	for (const auto& p : slider(str1, str2))
		std::cout << p << ' ';
}



3 10 16

This kind of text smacks of homework assignment. Hence, library functions like .find are out — OP has to write his own function to do it.

This kind of homework requires OP to think through using indices into arrays and how the indices relate when comparing different pieces of the arrays.
Why should .find() be out if this is homework? If they're learning C++ then .find() is the way to code this. Using indices et al is the C way. For C++ teach C++ (including containers and standard library). For C, teach C.
Last edited on
C++ Beginners should learn the C++ way first. The "bass-ackwards" way of doing it the C way should still be taught but in an intermediate level course. After all there is lots of "legacy code" to deal with if one becomes a professional programmer.

Most programming curricula are abysmally ancient and self-defeating.
This rather depends whether you are learning "programming" or learning "c++" (especially if they are non-CS students).

Too great a fixation on a c++-specific idiom may not help a student if they go off to work and have to use a different programming language.
If it's needed to be coded without using standard library, then using iterators possibly:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#include <string>
#include <iostream>

auto sfind(std::string::const_iterator sb, const std::string::const_iterator se, char c) {
	for (; sb != se && *sb != c; ++sb);
	return sb;
}

auto sfind(const std::string& s1, const std::string& s2) {
	if (s2.empty())
		return s1.cbegin();

	for (auto itr {s1.cbegin()}; (itr = sfind(itr, s1.cend(), *s2.begin())) != s1.cend(); ++itr) {
		for (auto sc1 {itr}, sc2 {s2.cbegin()}; ; ) {
			if (++sc2 == s2.cend())
				return itr;

			if (*++sc1 != *sc2)
				break;
		}
	}

	return s1.cend();
}

int main() {
	const std::string str1 {"aaabcabaaaba"};
	const std::string str2 {"bca"};

	if (auto fnd {sfind(str1, str2)}; fnd != str1.cend())
		std::cout << "Found at position " << fnd - str1.cbegin() << '\n';
	else
		std::cout << "Not found\n";
}



Found at position 3

Too great a fixation on a c++-specific idiom may not help a student if they go off to work and have to use a different programming language.


Then they go and learn the other programming language.
@seeplus

The purpose of these assignments is to understand array indexing, not C/C++/Standard Llibrary zen. C or C++ is just the tool at hand. Any other language with arrays and array indexing could be used, but familiarizing the student with C and C++ syntax is beneficial as well, particularly when they need to do array indexing in some other language with C-like syntax, like Python, Java, Rust, Ruby, Squirrel, R, etc or similarly structured imperative languages, like Delphi, Swift, Go, etc.

Telling someone to buzz off because they are looking under the hood of your favorite tool is, frankly, bad mojo.

@George P

I don’t believe blanket generalizations. Most introductory programming courses are mislabeled as teaching, e.g. “C++”. The purpose of introductory courses is to teach people to understand what is going on, not to jump to becoming a magical <insert language here> wizard. Mastering a new language is easy beans. Getting a grip on computer-think is hard, and tons of people can’t do it.

And it doesn’t help them when they encounter <insert language here> purists with bad attitudes about their professor’s/school’s learning objectives.

Getting uppity about how to do it best in C++ serves nothing but misplaced ego.
Learning how things work translates to everything you’ll ever touch.
I agree with Duthomhas. When I started to learn programming in school we used old-school C++ with .h headers, etc. We used C strings and plain arrays a lot, and I became pretty good at using loops. We used a lot of non-standard functions for colour and positioning console graphics. It was fun, and I learned a lot of basic programming skills. The course was named "Programming A" and the focus was never really to learn C++. It was just an introduction before moving on to Java. This was around 2005, but still...
Last edited on
I recall algorithms being explained in pseudo-language. That did emphasize that focus was on the logic, the "computer-think". No confusion/mislabel implying "this is how we use <insert language here>". (Lectures were also handwritten on overheads -- no direct evidence that teacher had ever used a computer.)

Do we all assume that a student will grab-copy-paste-sans-comprehension the first thing they see and run?
Or, do we have faith that students will pose additional questions: "is that all?", "how does this do it?"
Well perhaps if the OP had stated this was homework (as there's been no follow-up from the OP we don't know - but their previous post didn't seem homework related...) and was expected to be done using xyz, or provided their code attempt first then replies could be tailored. A simple 'how do I do this' with no context will receive the replies it receives.... and no, I've never told anyone to 'buzz off'!

and as I have said previously, program design is important. A program should first be designed and then coded from the design. IMO teaching design seems to absent/minimal in many programming courses - which to me seems a lost opportunity. Design is very important - some might argue more so than coding. For my degree (many, many, many years ago) we were taught design along with programming in various languages. In the programming assignments, coding only carried 35% of the total marks available (50% design, 35% code, 10% testing, 5% documentation).
Last edited on
C++ has lapsed into gibberish for the most simple tasks.
Yes, the stringview code up above is amazing: its short, it uses the tools well, its really good code in every sense of 'modern' and elegant.
It is also completely incomprehensible to someone who does not know advanced C++ specifically. That does not make it 'bad', as long as the audience is a bunch of advanced c++ coders, but you sure can't hop to the post from google to grasp the algorithm being used if you were just looking for a how-to that could be recoded in say python by a guy 2 months into learning programming.

It is dangerously close to writing convoluted code to save space / type less for no other benefit. I am not sure where the line blurred on that, and maybe its just expectations ... do you expect everyone to know the whole c++ language 100% ... clearly, most just dabble, or most that come for help at least.

this isn't so bad because its a 5 line program no matter how you tackle it. But the gibberishy simple stuff then starts appearing in the middle of complex code as bits and pieces of much larger efforts... try to unravel things like that can be a headache.
I got nothing on that. Just a ramble I suppose :)
Last edited on
jonnin wrote:
C++ has lapsed into gibberish for the most simple tasks.


The Python code might look like
1
2
3
4
5
6
def slider( str1, str2 ):
    n = len( str2 )
    for i in range( len(str1) - n + 1 ):
        if ( str1[i:i+n] == str2 ) : print( i )

slider( "aaabcabaaaba", "aa" )



Even the more explicitly typed Fortran code is straightforward:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
program lastchance
   implicit none
   call slider( "aaabcabaaaba", "aa" )

contains

   subroutine slider( str1, str2 )
      character(len=*), intent(in) :: str1, str2
      integer n, i
   
      n = len( str2 )
      do i = 1, len(str1) - n + 1
         if ( str1(i:i+n-1) == str2 ) print *, i
      end do
   
   end subroutine slider
   
end program lastchance



I expect things like perl are even slicker (but I can't write perl).


Why is there such a need to make c++ look so incomprehensible in the name of "modern practice"?
Last edited on
Well to display the starting index's of matches, then possibly:

1
2
3
4
5
6
7
8
9
#include <iostream>
#include <string_view>

int main() {
	const std::string_view str1 {"aaabcabaaabcazzzbcadd"};
	const std::string_view str2 {"bca"};

	for (size_t pos {}; (pos = str1.find(str2, pos)) != std::string_view::npos; std::cout << pos++ << '\n');
}



3
10
16


or using std::search:

1
2
3
4
5
6
7
8
9
10
11
#include <iostream>
#include <algorithm>
#include <string_view>
#include <functional>

int main() {
	const std::string_view str1 {"aaabcabaaabcazzzbcadd"};
	const std::string_view str2 {"bca"};

	for (auto pos {str1.begin()}; (pos = std::search(pos, str1.end(), std::boyer_moore_horspool_searcher(str2.begin(), str2.end()))) != str1.end(); std::cout << pos++ - str1.begin() << '\n');
}


which, of course, are perfectly clear! Ahemm...
Last edited on
Fortran code


Well I won't have easily recognised that code as being Fortran! But there again I haven't used Fortran since about 1980 using Fortran 77. I guess languages sure do evolve...

There's always APL...
1
2
substr ← ⍸⍷
'bca' substr 'aaabcabaaabcazzzbcadd'

Last edited on
Pages: 12