Comparing all characters within two strings of different length

Pages: 12
Apr 12, 2022 at 10:41am
let's say I have two strings:

str1: aaabcabaaaba
str2: bca

I would like to slide str2 over str1 to figure out if str2 matches str1.
So, here for example str2 would be the same at index 3,4,5 of str1.
But how would I do this? When I try to do a for loop, I only manage to compare the three first characters of string 1 and the three last, not the middle part.

Apr 12, 2022 at 10:47am
Write a function that can check if str2 is found in str1 at a particular index that is passed as argument. When that is done you can just loop through all the indices in str2 and call the function with each index to find all the positions where str2 is found in str1.
Last edited on Apr 12, 2022 at 10:47am
Apr 12, 2022 at 11:21am
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <iostream>
#include <string>
using namespace std;

void slider( const string &str1, const string &str2 )
{
   int n = str2.size();
   for ( int i = 0; i + n <= str1.size(); i++ )
   {
      cout << i << ": " << boolalpha << ( str1.substr( i, n ) == str2 ) << '\n';
   }
}

int main()
{
   slider( "aaabcabaaaba", "bca" );
}
Last edited on Apr 12, 2022 at 11:28am
Apr 12, 2022 at 11:31am
Just use a simple .find(). Consider:

1
2
3
4
5
6
7
8
9
10
11
12
13
#include <string>
#include <iostream>
#include <iomanip>

int main() {
	const std::string str1 {"aaabcabaaaba"};
	const std::string str2 {"bca"};

	if (auto fnd {str1.find(str2)}; fnd != std::string::npos)
		std::cout << "Found at position " << fnd << '\n';
	else
		std::cout << "Not found\n";
}



Found at position 3

Apr 12, 2022 at 11:33am
There is already std::string::find(): http://www.cplusplus.com/reference/string/string/find/
Then repeat that if you need all matches:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <iostream>
#include <string>
using std::string;

void slider( const string &str1, const string &str2 )
{
    auto pos = str1.find( str2 );
    while ( pos != string::npos ) {
        std::cout << pos << '\n';
        pos = str1.find( str2, pos+1 );
    }
}

int main()
{
   slider( "aaabcabaaabca", "bca" );
}
Last edited on Apr 12, 2022 at 11:33am
Apr 12, 2022 at 12:06pm
If the location of all the occurrences is needed, then:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
#include <iostream>
#include <vector>
#include <string_view>

auto slider(std::string_view str1, std::string_view str2) {
	std::vector<size_t> fnd;

	for (size_t pos {}; (pos = str1.find(str2, pos)) != std::string_view::npos; fnd.push_back(pos++));

	return fnd;
}

int main() {
	const std::string_view str1 {"aaabcabaaabcazzzbcadd"};
	const std::string_view str2 {"bca"};

	for (const auto& p : slider(str1, str2))
		std::cout << p << ' ';
}



3 10 16

Apr 12, 2022 at 3:27pm
This kind of text smacks of homework assignment. Hence, library functions like .find are out — OP has to write his own function to do it.

This kind of homework requires OP to think through using indices into arrays and how the indices relate when comparing different pieces of the arrays.
Apr 12, 2022 at 4:20pm
Why should .find() be out if this is homework? If they're learning C++ then .find() is the way to code this. Using indices et al is the C way. For C++ teach C++ (including containers and standard library). For C, teach C.
Last edited on Apr 12, 2022 at 4:21pm
Apr 12, 2022 at 4:33pm
C++ Beginners should learn the C++ way first. The "bass-ackwards" way of doing it the C way should still be taught but in an intermediate level course. After all there is lots of "legacy code" to deal with if one becomes a professional programmer.

Most programming curricula are abysmally ancient and self-defeating.
Apr 12, 2022 at 4:37pm
This rather depends whether you are learning "programming" or learning "c++" (especially if they are non-CS students).

Too great a fixation on a c++-specific idiom may not help a student if they go off to work and have to use a different programming language.
Apr 12, 2022 at 5:25pm
If it's needed to be coded without using standard library, then using iterators possibly:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#include <string>
#include <iostream>

auto sfind(std::string::const_iterator sb, const std::string::const_iterator se, char c) {
	for (; sb != se && *sb != c; ++sb);
	return sb;
}

auto sfind(const std::string& s1, const std::string& s2) {
	if (s2.empty())
		return s1.cbegin();

	for (auto itr {s1.cbegin()}; (itr = sfind(itr, s1.cend(), *s2.begin())) != s1.cend(); ++itr) {
		for (auto sc1 {itr}, sc2 {s2.cbegin()}; ; ) {
			if (++sc2 == s2.cend())
				return itr;

			if (*++sc1 != *sc2)
				break;
		}
	}

	return s1.cend();
}

int main() {
	const std::string str1 {"aaabcabaaaba"};
	const std::string str2 {"bca"};

	if (auto fnd {sfind(str1, str2)}; fnd != str1.cend())
		std::cout << "Found at position " << fnd - str1.cbegin() << '\n';
	else
		std::cout << "Not found\n";
}



Found at position 3

Apr 12, 2022 at 5:44pm
Too great a fixation on a c++-specific idiom may not help a student if they go off to work and have to use a different programming language.


Then they go and learn the other programming language.
Apr 13, 2022 at 12:55am
@seeplus

The purpose of these assignments is to understand array indexing, not C/C++/Standard Llibrary zen. C or C++ is just the tool at hand. Any other language with arrays and array indexing could be used, but familiarizing the student with C and C++ syntax is beneficial as well, particularly when they need to do array indexing in some other language with C-like syntax, like Python, Java, Rust, Ruby, Squirrel, R, etc or similarly structured imperative languages, like Delphi, Swift, Go, etc.

Telling someone to buzz off because they are looking under the hood of your favorite tool is, frankly, bad mojo.

@George P

I don’t believe blanket generalizations. Most introductory programming courses are mislabeled as teaching, e.g. “C++”. The purpose of introductory courses is to teach people to understand what is going on, not to jump to becoming a magical <insert language here> wizard. Mastering a new language is easy beans. Getting a grip on computer-think is hard, and tons of people can’t do it.

And it doesn’t help them when they encounter <insert language here> purists with bad attitudes about their professor’s/school’s learning objectives.

Getting uppity about how to do it best in C++ serves nothing but misplaced ego.
Learning how things work translates to everything you’ll ever touch.
Apr 13, 2022 at 7:30am
I agree with Duthomhas. When I started to learn programming in school we used old-school C++ with .h headers, etc. We used C strings and plain arrays a lot, and I became pretty good at using loops. We used a lot of non-standard functions for colour and positioning console graphics. It was fun, and I learned a lot of basic programming skills. The course was named "Programming A" and the focus was never really to learn C++. It was just an introduction before moving on to Java. This was around 2005, but still...
Last edited on Apr 13, 2022 at 7:42am
Apr 13, 2022 at 8:05am
I recall algorithms being explained in pseudo-language. That did emphasize that focus was on the logic, the "computer-think". No confusion/mislabel implying "this is how we use <insert language here>". (Lectures were also handwritten on overheads -- no direct evidence that teacher had ever used a computer.)

Do we all assume that a student will grab-copy-paste-sans-comprehension the first thing they see and run?
Or, do we have faith that students will pose additional questions: "is that all?", "how does this do it?"
Apr 13, 2022 at 10:00am
Well perhaps if the OP had stated this was homework (as there's been no follow-up from the OP we don't know - but their previous post didn't seem homework related...) and was expected to be done using xyz, or provided their code attempt first then replies could be tailored. A simple 'how do I do this' with no context will receive the replies it receives.... and no, I've never told anyone to 'buzz off'!

and as I have said previously, program design is important. A program should first be designed and then coded from the design. IMO teaching design seems to absent/minimal in many programming courses - which to me seems a lost opportunity. Design is very important - some might argue more so than coding. For my degree (many, many, many years ago) we were taught design along with programming in various languages. In the programming assignments, coding only carried 35% of the total marks available (50% design, 35% code, 10% testing, 5% documentation).
Last edited on Apr 13, 2022 at 10:23am
Apr 13, 2022 at 12:34pm
C++ has lapsed into gibberish for the most simple tasks.
Yes, the stringview code up above is amazing: its short, it uses the tools well, its really good code in every sense of 'modern' and elegant.
It is also completely incomprehensible to someone who does not know advanced C++ specifically. That does not make it 'bad', as long as the audience is a bunch of advanced c++ coders, but you sure can't hop to the post from google to grasp the algorithm being used if you were just looking for a how-to that could be recoded in say python by a guy 2 months into learning programming.

It is dangerously close to writing convoluted code to save space / type less for no other benefit. I am not sure where the line blurred on that, and maybe its just expectations ... do you expect everyone to know the whole c++ language 100% ... clearly, most just dabble, or most that come for help at least.

this isn't so bad because its a 5 line program no matter how you tackle it. But the gibberishy simple stuff then starts appearing in the middle of complex code as bits and pieces of much larger efforts... try to unravel things like that can be a headache.
I got nothing on that. Just a ramble I suppose :)
Last edited on Apr 13, 2022 at 12:36pm
Apr 13, 2022 at 4:26pm
jonnin wrote:
C++ has lapsed into gibberish for the most simple tasks.


The Python code might look like
1
2
3
4
5
6
def slider( str1, str2 ):
    n = len( str2 )
    for i in range( len(str1) - n + 1 ):
        if ( str1[i:i+n] == str2 ) : print( i )

slider( "aaabcabaaaba", "aa" )



Even the more explicitly typed Fortran code is straightforward:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
program lastchance
   implicit none
   call slider( "aaabcabaaaba", "aa" )

contains

   subroutine slider( str1, str2 )
      character(len=*), intent(in) :: str1, str2
      integer n, i
   
      n = len( str2 )
      do i = 1, len(str1) - n + 1
         if ( str1(i:i+n-1) == str2 ) print *, i
      end do
   
   end subroutine slider
   
end program lastchance



I expect things like perl are even slicker (but I can't write perl).


Why is there such a need to make c++ look so incomprehensible in the name of "modern practice"?
Last edited on Apr 13, 2022 at 4:30pm
Apr 15, 2022 at 12:00pm
Well to display the starting index's of matches, then possibly:

1
2
3
4
5
6
7
8
9
#include <iostream>
#include <string_view>

int main() {
	const std::string_view str1 {"aaabcabaaabcazzzbcadd"};
	const std::string_view str2 {"bca"};

	for (size_t pos {}; (pos = str1.find(str2, pos)) != std::string_view::npos; std::cout << pos++ << '\n');
}



3
10
16


or using std::search:

1
2
3
4
5
6
7
8
9
10
11
#include <iostream>
#include <algorithm>
#include <string_view>
#include <functional>

int main() {
	const std::string_view str1 {"aaabcabaaabcazzzbcadd"};
	const std::string_view str2 {"bca"};

	for (auto pos {str1.begin()}; (pos = std::search(pos, str1.end(), std::boyer_moore_horspool_searcher(str2.begin(), str2.end()))) != str1.end(); std::cout << pos++ - str1.begin() << '\n');
}


which, of course, are perfectly clear! Ahemm...
Last edited on Apr 15, 2022 at 3:23pm
Apr 15, 2022 at 12:44pm
Fortran code


Well I won't have easily recognised that code as being Fortran! But there again I haven't used Fortran since about 1980 using Fortran 77. I guess languages sure do evolve...

There's always APL...
1
2
substr ← ⍸⍷
'bca' substr 'aaabcabaaabcazzzbcadd'

Last edited on Apr 15, 2022 at 1:06pm
Pages: 12