I am very new in programming and I am trying to solve one of longest common sequence/substring problems in Java. So the algorithm question I am working on is to find longest common substring without cutting words.
For instance: given string1 = He had 3 pennies and 5 quarters
and string2 = q3nniesp
should return pennies
.
Other example: string1 = They named the new place face cafe
and string2 = e face
and output will be e face cafe
.
I am trying to figure this algorithm out but I can't decide if I need to convert these to char array or evaluate it as string. The way both strings can have spaces are confusing me.
I followed some of the existing stackoverflow questions and trying to modify this code from https://www.geeksforgeeks.org/
:
static String findLongestSubsequence(String str1, String str2) {
char[] A = str1.toCharArray();
char[] B = str2.toCharArray();
if (A == null || B == null) return null;
final int n = A.length;
final int m = B.length;
if (n == 0 || m == 0) return null;
int[][] dp = new int[n+1][m+1];
// Suppose A = a1a2..an-1an and B = b1b2..bn-1bn
for (int i = 1; i <= n; i++ ) {
for (int j = 1; j <= m; j++) {
// If ends match the LCS(a1a2..an-1an, b1b2..bn-1bn) = LCS(a1a2..an-1, b1b2..bn-1) + 1
if (A[i-1] == B[j-1]) dp[i][j] = dp[i-1][j-1] + 1;
// If the ends do not match the LCS of a1a2..an-1an and b1b2..bn-1bn is
// max( LCS(a1a2..an-1, b1b2..bn-1bn), LCS(a1a2..an-1an, b1b2..bn-1) )
else dp[i][j] = Math.max(dp[i-1][j], dp[i][j-1]);
}
}
int lcsLen = dp[n][m];
char[] lcs = new char[lcsLen];
int index = 0;
// Backtrack to find a LCS. We search for the cells
// where we included an element which are those with
// dp[i][j] != dp[i-1][j] and dp[i][j] != dp[i][j-1])
int i = n, j = m;
while (i >= 1 && j >= 1) {
int v = dp[i][j];
// The order of these may output different LCSs
while(i > 1 && dp[i-1][j] == v) i--;
while(j > 1 && dp[i][j-1] == v) j--;
// Make sure there is a match before adding
if (v > 0) lcs[lcsLen - index++ - 1] = A[i-1]; // or B[j-1];
i--; j--;
}
return new String(lcs, 0, lcsLen);
}
But I keep getting wrong outputs. For instance first output gives output = 3nnies
, I am really stuck at this point, can anybody gives a hand or a little scratch of algorithm? Thank y'all.