I was trying to understand the Rabin-Karp algorithm from this website: https://www.geeksforgeeks.org/rabin-karp-algorithm-for-pattern-searching/

They had the following code in C++ for the algorithm:

#include <bits/stdc++.h> 
using namespace std; 
  
// d is the number of characters in the input alphabet  
#define d 256  
  
/* pat -> pattern  
    txt -> text  
    q -> A prime number  
*/
void search(char pat[], char txt[], int q)  
{  
    int M = strlen(pat);  
    int N = strlen(txt);  
    int i, j;  
    int p = 0; // hash value for pattern  
    int t = 0; // hash value for txt  
    int h = 1;  
  
    // The value of h would be "pow(d, M-1)%q"  
    for (i = 0; i < M - 1; i++)  
        h = (h * d) % q;  
  
    // Calculate the hash value of pattern and first  
    // window of text  
    for (i = 0; i < M; i++)  
    {  
        p = (d * p + pat[i]) % q;  
        t = (d * t + txt[i]) % q;  
    }  
  
    // Slide the pattern over text one by one  
    for (i = 0; i <= N - M; i++)  
    {  
  
        // Check the hash values of current window of text  
        // and pattern. If the hash values match then only  
        // check for characters on by one  
        if ( p == t )  
        {  
            /* Check for characters one by one */
            for (j = 0; j < M; j++)  
            {  
                if (txt[i+j] != pat[j])  
                    break;  
            }  
  
            // if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1]  
            if (j == M)  
                cout<<"Pattern found at index "<< i<<endl;  
        }  
  
        // Calculate hash value for next window of text: Remove  
        // leading digit, add trailing digit  
        if ( i < N-M )  
        {  
            t = (d*(t - txt[i]*h) + txt[i+M])%q;  
  
            // We might get negative value of t, converting it  
            // to positive  
            if (t < 0)  
            t = (t + q);  
        }  
    }  
}  
  
/* Driver code */
int main()  
{  
    char txt[] = "GEEKS FOR GEEKS";  
    char pat[] = "GEEK"; 
        
      // A prime number  
    int q = 101;  
      
      // Function Call 
      search(pat, txt, q);  
    return 0;  
}  

What I didn't understand was this block of code:

            // We might get negative value of t, converting it  
            // to positive  
            if (t < 0)  
            t = (t + q);  

How can t ever be negative? What we subtract from t is always less than t, and then we add something to it, so where does the possibility of t being negative come from?

I tested the code without this if statement and it didn't work properly. The expected output was:

Pattern found at index 0
Pattern found at index 10

But I got:

Pattern found at index 0
Ulrich Eckhardt
Arsh Sharma
  • Forget that website. It demonstrates how *not* to write C++ code and has nothing to do with professional programming. [Why should I not `#include <bits/stdc++.h>`?](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h) [Why is `using namespace std;` considered bad practice?](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice) – Evg Feb 24 '21 at 07:10
  • Maybe if you indent the second line of the code it becomes clearer? Why shouldn't `t` be negative? You can also set a breakpoint in that line to see when it gets triggered. – Ulrich Eckhardt Feb 24 '21 at 07:15
  • https://stackoverflow.com/questions/7594508/modulo-operator-with-negative-values – Aki Suihkonen Feb 24 '21 at 09:14

1 Answer

Aki Suihkonen has it: in C++, with a positive modulus, the result of `%` is either zero or has the same sign as the dividend, whereas Rabin-Karp assumes that the result will always be nonnegative.

For example, if we do

t = 3
t = (t + 5) % 7
t = (t - 5) % 7

then the values are

(3 + 5) % 7 == 1
(1 - 5) % 7 == -4

and adding 7 to -4 gives 3, as desired.

David Eisenstat