KMP algorithm's time complexity

Forum

Forum
General C++ Programming
KMP algorithm's time complexity

KMP algorithm's time complexity

I'm attempting to implement strstr using the KMP method. This is the algorithm described on scaler topics; I also watched this video https://www.scaler.com/topics/data-structures/kmp-algorithm/ to learn more about it. The temporal complexity of the KMP algorithm is O(n), where n is the length of the bigger string.

vector<int> KMP(string S, string K)
{
    vector<int> T(K.size() + 1, -1);
    vector<int> matches;

    if(K.size() == 0)
    {
        matches.push_back(0);
        return matches;
    }
    for(int i = 1; i <= K.size(); i++)
    {
        int pos = T[i - 1];
        while(pos != -1 && K[pos] != K[i - 1]) pos = T[pos];
        T[i] = pos + 1;
    }

    int sp = 0;
    int kp = 0;
    while(sp < S.size())
    {
        while(kp != -1 && (kp == K.size() || K[kp] != S[sp])) kp = T[kp];
        kp++;
        sp++;
        if(kp == K.size()) matches.push_back(sp - K.size());
    }

    return matches;
}

I'm not sure how the complexity of this algorithm is O. (n). Can somebody explain how this code's complexity is O(n)?

lastchance (6980)

Don't bother; this guy is a spammer.

seeplus (6653)

As for once I'm feeling in a goof mood..

The dominant step for constructing the partial match table is the character comparison between the pattern and the displaced pattern. This is executed m - 1 times where m is the length of K. Hence the algorithm is linearly dependent upon the size of K.

For the search part, the character comparison step again dominates. The number of times it is executed is determined by the rate at which sp and kp increase towards the string end. Since for a string S of length n there can be at most n shifts forward made either by the pattern or in the string, then at most 2n comparisons could be made. The search algorithm therefore has a linear performance of O(2n).

Hence the whole algorithm has a linear performance of O(m + 2n - 1). As 2n would usually be much greater than m, then the complexity is O(2n) which is O(n).

Also note that if K size > S size then K cannot possibly be found in S. Also, for there to be more than one instance of K in S then S size has to > K size. If K size = S size then there is only a possible one instance of K in S.

PS. Note that the posted code above has issues (signed/unsigned mismatch etc) and that S and K are passed by value rather than const ref (or std::string_view).

Last edited on

salem c (3723)

Spammer repost bot.
Original here - https://www.biostars.org/p/9550686/#9550688

Just copies random posts and pastes in the spammy URLs.

deleted account xyzzy (5768)

seeplus wrote:
As for once I'm feeling in a goof mood..

Typo??

Either way it is a rather good description of a mood....

Topic archived. No new replies allowed.

C++

Forum

KMP algorithm's time complexity