Questions tagged [lcs]

LCS (Longest Common Subsequence) is a classic string-comparison problem: given two strings, find a longest sequence of characters that appears in both in the same order, though not necessarily contiguously. For two strings of lengths m and n, it can be solved in O(m·n) time using dynamic programming.

Source: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem

About

The dynamic programming solution follows from a recurrence. Let x and y be the two strings, with xi the i-th character of x and yj the j-th character of y, and let C[i][j] be the length of the longest common subsequence of the first i characters of x and the first j characters of y:

   If i == 0 or j == 0              then   C[i][j] = 0
   If i,j > 0 and xi == yj          then   C[i][j] = C[i-1][j-1] + 1
   If i,j > 0 and xi != yj          then   C[i][j] = max(C[i][j-1], C[i-1][j])
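
The recurrence translates almost directly into a memoized recursive function. The following is a minimal Python sketch, not a reference implementation; the name lcs_length and the use of functools.lru_cache for memoization are assumptions made for illustration. Python strings are 0-indexed, so x[i - 1] plays the role of the i-th character of x.

```python
from functools import lru_cache


def lcs_length(x: str, y: str) -> int:
    """Length of a longest common subsequence of x and y (top-down)."""

    @lru_cache(maxsize=None)
    def c(i: int, j: int) -> int:
        # c(i, j): LCS length of the first i chars of x and the first j chars of y.
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))

    return c(len(x), len(y))


print(lcs_length("ATGCGTCGAT", "ATGTGACTAG"))  # 7
```

Without memoization the same recursion takes exponential time in the worst case. With it, each (i, j) pair is computed once, but the recursion depth grows with len(x) + len(y), so very long inputs can hit Python's recursion limit.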


Pseudocode

 p = A.length
 q = B.length
 // c has (p+1) x (q+1) entries; A and B are indexed from 1
 for i = 0 to p
     c[i][0] = 0        // empty prefix of B
 for j = 0 to q
     c[0][j] = 0        // empty prefix of A
 for i = 1 to p
     for j = 1 to q
          if A[i] == B[j]
              c[i][j] = c[i-1][j-1] + 1
          else if c[i][j-1] > c[i-1][j]
              c[i][j] = c[i][j-1]
          else
              c[i][j] = c[i-1][j]
 return c               // c[p][q] is the length of the LCS
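
A runnable version of the table-filling pseudocode might look like the following Python sketch. The function name lcs_table is hypothetical, and because Python strings are 0-indexed, a[i - 1] stands in for the pseudocode's A[i].

```python
def lcs_table(a: str, b: str) -> list:
    """Build the (len(a)+1) x (len(b)+1) table of LCS lengths (bottom-up)."""
    p, q = len(a), len(b)
    # Row 0 and column 0 stay 0: an empty prefix has no common subsequence.
    c = [[0] * (q + 1) for _ in range(p + 1)]
    for i in range(1, p + 1):
        for j in range(1, q + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i][j - 1], c[i - 1][j])
    return c


table = lcs_table("ABCBDAB", "BDCABA")
print(table[-1][-1])  # 4
```

The bottom-right entry holds the LCS length. Both time and memory are proportional to len(a) * len(b).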

Application

In bioinformatics, two DNA strands are compared by computing their longest common subsequence, whose length serves as a measure of their similarity.


Example

  • A = A T G C G T C G A T
  • B = A T G T G A C T A G

LCS

The longest common subsequence has 7 characters; one such subsequence is A T G T G A T.
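
The example can be checked by filling the table and then walking it backwards to recover one subsequence. This is a sketch under stated assumptions: the names lcs and is_subsequence are hypothetical helpers, and the tie-breaking rule in the walk-back is arbitrary, so the string returned may be a different longest common subsequence than the one shown above, though always of the same length.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    p, q = len(a), len(b)
    c = [[0] * (q + 1) for _ in range(p + 1)]
    for i in range(1, p + 1):
        for j in range(1, q + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i][j - 1], c[i - 1][j])

    # Walk back from c[p][q], collecting matched characters in reverse order.
    out = []
    i, j = p, q
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1  # tie broken toward the row above (arbitrary choice)
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(s: str, t: str) -> bool:
    """True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


A = "ATGCGTCGAT"
B = "ATGTGACTAG"
result = lcs(A, B)
print(len(result))  # 7
print(is_subsequence(result, A) and is_subsequence(result, B))  # True
```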

186 questions
38
votes
7 answers

Find longest increasing sequence

You are given a sequence of numbers and you need to find a longest increasing subsequence from the given input(not necessary continuous). I found the link to this(Longest increasing subsequence on Wikipedia) but need more explanation. If anyone…
pappu
  • 423
  • 1
  • 6
  • 7
20
votes
5 answers

Longest common subsequence of 3+ strings

I am trying to find the longest common subsequence of 3 or more strings. The Wikipedia article has a great description of how to do this for 2 strings, but I'm a little unsure of how to extend this to 3 or more strings. There are plenty of libraries…
del
  • 5,560
  • 8
  • 38
  • 44
20
votes
4 answers

longest common substring in R finding non-contiguous matches between the two strings

I have a question regarding finding the longest common substring in R. While searching through a few posts on StackOverflow, I got to know about the qualV package. However, I see that the LCS function in this package actually finds all characters…
IAMTubby
  • 1,457
  • 3
  • 22
  • 39
20
votes
7 answers

How to find Longest Common Substring using C++

I searched online for a C++ Longest Common Substring implementation but failed to find a decent one. I need a LCS algorithm that returns the substring itself, so it's not just LCS. I was wondering, though, about how I can do this between multiple…
David Gomes
  • 5,310
  • 16
  • 52
  • 96
15
votes
3 answers

How do diff/patch work and how safe are they?

Regarding how they work, I was wondering low-level working stuff: What will trigger a merge conflict? Is the context also used by the tools in order to apply the patch? How do they deal with changes that do not actually modify source code behavior?…
cenouro
  • 695
  • 3
  • 15
13
votes
1 answer

Myers diff algorithm vs Hunt–McIlroy algorithm

The longest common subsequence problem is a classic computer science problem, algorithms to solve it are the root of version control systems and wiki engines. Two basic algorithms are the Hunt–McIlroy algorithm which was used to create the original…
0x4a6f4672
  • 24,450
  • 15
  • 96
  • 130
13
votes
3 answers

Find common substrings between two character variables

I have two character variables (names of objects) and I want to extract the largest common substring. a <- c('blahABCfoo', 'blahDEFfoo') b <- c('XXABC-123', 'XXDEF-123') I want the following as a result: [1] "ABC" "DEF" These vectors as input…
Matthew Lundberg
  • 39,899
  • 6
  • 81
  • 105
12
votes
3 answers

Fast(er) algorithm for the Length of the Longest Common Subsequence (LCS)

Problem: Need the Length of the LCS between two strings. The size of the strings is at most 100 characters. The alphabet is the usual DNA one, 4 characters "ACGT". The dynamic approach is not quick enough. My problem is that I am dealing with lot's…
Yiannis
  • 121
  • 1
  • 6
12
votes
3 answers

efficient longest common subsequence algorithm library?

I'm looking for a (space) efficient implementation of an LCS algorithm for use in a C++ program. Inputs are two random access sequences of integers. I'm currently using the dynamic programming approach from the wikipedia page about LCS. However,…
BuschnicK
  • 4,928
  • 6
  • 34
  • 47
10
votes
5 answers

Convert string to palindrome string with minimum insertions

In order to find the minimal number of insertions required to convert a given string(s) to palindrome I find the longest common subsequence of the string(lcs_string) and its reverse. Therefore the number of insertions to be made is length(s) -…
whitepearl
  • 624
  • 2
  • 7
  • 16
9
votes
1 answer

Longest Palindromic Subsequence (dp solution)

Among several dp solutions for this question, an easier solution is to reverse the given string and calculate LCS of the original and reversed string. My question is will this approach yield correct result every time ? For instance , a longest…
Julkar9
  • 839
  • 1
  • 6
  • 15
8
votes
2 answers

Understanding the time complexity of the Longest Common Subsequence Algorithm

I do not understand the O(2^n) complexity that the recursive function for the Longest Common Subsequence algorithm has. Usually, I can tie this notation with the number of basic operations (in this case comparisons) of the algorithm, but this time…
Daniel Catita
  • 187
  • 1
  • 5
8
votes
3 answers

Longest Common Subsequence

Consider 2 sequences X[1..m] and Y[1..n]. The memoization algorithm would compute the LCS in time O(m*n). Is there any better algorithm to find out LCS wrt time? I guess memoization done diagonally can give us O(min(m,n)) time complexity.
tsudot
  • 535
  • 1
  • 6
  • 16
7
votes
3 answers

How can I find the best fit subsequences of a large string?

Say I have one large string and an array of substrings that when joined equal the large string (with small differences). For example (note the subtle differences between the strings): large_str = "hello, this is a long string, that may be made up of…
Josh Voigts
  • 3,830
  • 1
  • 16
  • 39
7
votes
4 answers

Can I use a plaintext diff algorithm for tracking XML changes?

I'm working in Flex/AS3 on (for simplicity) an XML editor. I need to provide undo/redo functionality. Of course, one solution is to store the entire source text with each edit. However, to conserve memory, I'd like to store the diffs instead…
rinogo
  • 7,068
  • 8
  • 54
  • 87