
I want to write a C function that finds the longest repeated, non-overlapping substring in a given string. For example, the input banana should give the output an.

I was thinking of comparing the characters of the string array and checking for repeats. Is that a viable approach? How can I compare a substring against the rest of the string? I want to avoid suffix trees if possible.

#include <stdio.h>
#include <string.h>

void stringcheck(char a[],int len, int s1, int s2)
{

    int i=s1+1;
    int j=s2+1;
    if(j<=len&&a[i]==a[j])
    {
        printf("%c",a[i]);
        stringcheck(a,len,i,j);
    }

}
void dupcheck(char a[], int len, int start)
{
    for(int i=start;i<len-1;i++)
    {
       for(int j=i+1;j<=len;j++)
       {
           if(a[i]==a[j])
           {
               printf("%c",a[i]);
               stringcheck(a,len,i,j);
               i=len;
           }

       }
    }
}


int main()
{
    char input[99];
    scanf("%s",input);
    int start=0;
    int len =strlen(input);
    dupcheck(input,len,start);
    return 0;

}
tyc72
  • There's nothing in here about a linear recursive sequence. Probably best to edit the title for clarity. I can't be the only one for whom that is the natural expansion of "LRS"! – William Pursell Jan 27 '20 at 17:40

1 Answer


Yes, this is a valid approach.
You can compare the string character by character, so there is no need to actually store a substring.

You can see a dynamic programming solution in C++ that takes this approach here: https://www.geeksforgeeks.org/longest-repeating-and-non-overlapping-substring/
This solution can be converted to C without many changes.

Another variant is to store the candidate substring by its indexes.
You can then compare it against the rest of the string and keep the longest match; however, this takes O(n^3), whereas the solution above takes O(n^2).
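
Here is a minimal sketch of that O(n^3) variant, assuming the candidate is tracked only by a start index and a length. The function name `longest_repeat_bruteforce` and the `out`/`out_cap` parameters are my own choices for illustration and do not come from the question or the linked article:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: brute-force search for the longest repeated,
 * non-overlapping substring of s. Copies the result into out (at most
 * out_cap - 1 characters plus a terminator) and returns its length. */
size_t longest_repeat_bruteforce(const char *s, char *out, size_t out_cap)
{
    size_t n = strlen(s);
    size_t best_len = 0, best_start = 0;

    for (size_t i = 0; i < n; i++) {          /* start of the first copy  */
        for (size_t j = i + 1; j < n; j++) {  /* start of the second copy */
            size_t k = 0;
            /* Extend the match one character at a time. The k < j - i
             * check stops the first copy before it reaches position j,
             * so the two copies never overlap. */
            while (j + k < n && k < j - i && s[i + k] == s[j + k])
                k++;
            /* Do not break here: a later j may give a longer match. */
            if (k > best_len) {
                best_len = k;
                best_start = i;
            }
        }
    }

    if (out_cap > 0) {
        size_t copy = best_len < out_cap - 1 ? best_len : out_cap - 1;
        memcpy(out, s + best_start, copy);
        out[copy] = '\0';
    }
    return best_len;
}
```

Called on "banana" with a 16-byte buffer, this returns 2 and leaves "an" in the buffer.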

Edit: I converted the linked solution to C:

#include <stdio.h>
#include <string.h>

void longestRepeatedSubstring(char * str, char * res) 
{ 
    int n = strlen(str);
    int LCSRe[n+1][n+1];
    int res_length  = 0; // To store length of result
    int i, j, index = 0;

    // Setting all to 0 
    memset(LCSRe, 0, sizeof(LCSRe)); 

    // building table in bottom-up manner 
    for (i=1; i<=n; i++) 
    { 
        for (j=i+1; j<=n; j++) 
        { 
            // (j-i) > LCSRe[i-1][j-1] to remove 
            // overlapping 
            if (str[i-1] == str[j-1] && 
                LCSRe[i-1][j-1] < (j - i)) 
            { 
                LCSRe[i][j] = LCSRe[i-1][j-1] + 1; 

                // updating maximum length of the 
                // substring and updating the finishing 
                // index of the suffix 
                if (LCSRe[i][j] > res_length) 
                { 
                    res_length = LCSRe[i][j]; 
                    index = (i>index) ? i : index; 
                } 
            } 
            else
                LCSRe[i][j] = 0; 
        } 
    } 

    // If the result is non-empty, copy its characters,
    // str[index - res_length] through str[index - 1],
    // into res (res must hold res_length + 1 chars)
    j=0;
    if (res_length > 0) {
        for (i = index - res_length + 1; i <= index; i++) {
            res[j] = str[i-1];
            j++;
        }
    }
    res[j]=0;
} 

// Driver program to test the above function 
int main() 
{ 
    char str[] = "banana";
    char res[20];
    longestRepeatedSubstring(str, res);
    printf("%s",res); 
    return 0; 
} 
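
The table above is a variable-length array of (n+1)^2 ints on the stack, which may be too large for long inputs. Since each row only reads the row before it, a version with two heap-allocated rows should work as well. This is a sketch of my own adaptation, not part of the linked solution; the name `longest_repeat_two_rows` and its `out`/`out_cap` parameters are hypothetical:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the DP above that keeps only two rows.
 * Copies the result into out (at most out_cap - 1 characters plus a
 * terminator) and returns its length, or 0 if allocation fails. */
size_t longest_repeat_two_rows(const char *s, char *out, size_t out_cap)
{
    size_t n = strlen(s);
    size_t best_len = 0, best_end = 0;   /* best_end: 1-based end of first copy */
    size_t *prev = calloc(n + 1, sizeof *prev);  /* row i - 1, starts as zeros */
    size_t *curr = calloc(n + 1, sizeof *curr);  /* row i */

    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        if (out_cap > 0)
            out[0] = '\0';
        return 0;
    }

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = i + 1; j <= n; j++) {
            /* Same recurrence as the full table: extend the match only
             * while it stays shorter than the gap j - i (no overlap). */
            if (s[i - 1] == s[j - 1] && prev[j - 1] < j - i) {
                curr[j] = prev[j - 1] + 1;
                if (curr[j] > best_len) {
                    best_len = curr[j];
                    best_end = i;
                }
            } else {
                curr[j] = 0;
            }
        }
        /* Row i becomes the previous row. The next row only reads
         * entries that this row has just written. */
        size_t *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    if (out_cap > 0) {
        size_t copy = best_len < out_cap - 1 ? best_len : out_cap - 1;
        memcpy(out, s + (best_end - best_len), copy);
        out[copy] = '\0';
    }
    free(prev);
    free(curr);
    return best_len;
}
```

This keeps the O(n^2) running time but needs only O(n) extra memory.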
dani39
  • Suppose I compare it character by character. Let’s say I compare the character at position i, in an outer loop, to find a match at position j. I would then compare the i+1 character with the j+1 character. If that doesn’t match, would I continue the loop comparison from the i+1 character? – tyc72 Jan 27 '20 at 04:51
  • No. If you go with the O(n^3) solution, then you should go back to i and compare it with j+1 in order to search for another match. If you go with the O(n^2) solution, you won't compare i+1 at that point; since you fill a sort of map, you'll only compare i+1 after you are done comparing i with every other position. – dani39 Jan 27 '20 at 05:43
  • But if you go back to i and compare it to j+1, wouldn’t you just be comparing the same character at i and not a new character in the substring? – tyc72 Jan 27 '20 at 06:15
  • I am assuming i is index of substring. So j+1 is a new character, first time compared to i. – dani39 Jan 27 '20 at 06:37
  • Right, let's say a[6] = "banana". Starting at the index of the first 'a', which is a[1], I run the comparison and find a match at the second 'a', which is a[3]. If I went back to the i position, which is a[1], and compared it with j+1, which is a[4], it wouldn't match. Wouldn't I try to match the i+1 position, a[2], to see if it matches j+1, which is a[4]? – tyc72 Jan 27 '20 at 06:48
  • Of course, first you try to match i+1 to j+1. Only if that fails do you go back; if it succeeds you continue, keeping your attempt until it fails, and then go back. This is in the O(n^3) solution. – dani39 Jan 27 '20 at 07:04
  • The basic idea is good. I don't think you need to break; there might be a longer match ahead. Obviously you need a counter, a max, and a string to save the max. – dani39 Jan 27 '20 at 07:29
  • I am beginning to think the dynamic method you mentioned may be better, but I am not sure I can convert that code from C++ to C. – tyc72 Jan 27 '20 at 07:38
  • I think it's only changing the string related stuff. It isn't a lot. – dani39 Jan 27 '20 at 13:20
  • @tyc72 See the edit; I have converted it to C. If you believe this has solved your issue, please mark it as accepted. – dani39 Jan 27 '20 at 17:39