38

You are given a sequence of numbers and need to find a longest increasing subsequence of it (not necessarily contiguous).

I found the Wikipedia article on this (Longest increasing subsequence), but I need more explanation.

If anyone could help me understand the O(n log n) implementation, that would be really helpful. If you could explain the algorithm with an example, that would be much appreciated.

I saw the other posts as well, and what I did not understand is this part:

L = 0
for i = 1, 2, ... n:
    binary search for the largest positive j ≤ L such that X[M[j]] < X[i] (or set j = 0 if no such value exists)

In the statement above, where does the binary search start? How should M[] and X[] be initialized?

pappu
  • 5
    Please explain what exactly you don't understand. Patiently go through the explanation on Wikipedia, and ask about the first thing you don't understand. The explanation there is actually quite readable, I think. – sleske Feb 08 '11 at 21:47
  • 1
    Note that you can edit your question, using the "edit" button below it. Use this to ask a more precise question. Good luck! – sleske Feb 08 '11 at 21:49
  • Here's a javascript implementation of this I've been working on https://gist.github.com/4497653 – wheresrhys Jan 09 '13 at 22:41

7 Answers

99

A simpler problem is to find the length of the longest increasing subsequence. You can focus on understanding that problem first. The only difference in the algorithm is that it doesn't use the P array.

x is the input sequence, so it can be initialized as: x = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]

m keeps track of the best subsequence of each length found so far. The best is the one with the smallest ending value (allowing a wider range of values to be added after it). The length and the ending value are the only data that need to be stored for each subsequence.

Each element of m represents a subsequence. For m[j],

  • j is the length of the subsequence.
  • m[j] is the index (in x) of the last element of the subsequence.
  • so, x[m[j]] is the value of the last element of the subsequence.

L is the length of the longest subsequence found so far. Entries m[1] to m[L] are valid; the rest are uninitialized. m[0] does not represent a real subsequence, so m can start with its first element set to 0 and everything else uninitialized. L increases as the algorithm runs, and so does the number of initialized values of m.

Here's an example run. Each line shows x[i] and the state of m at the end of that iteration (for readability, m is shown with the values of the sequence instead of the indexes).

The search in each iteration is looking for where to place x[i]. It should be as far to the right as possible (to get the longest sequence), and be greater than the value to its left (so it's an increasing sequence).

 0:  m = [0, 0]        - ([0] is a subsequence of length 1.)
 8:  m = [0, 0, 8]     - (8 can be added after [0] to get a sequence of length 2.)
 4:  m = [0, 0, 4]     - (4 is better than 8. This can be added after [0] instead.)
 12: m = [0, 0, 4, 12] - (12 can be added after [...4])
 2:  m = [0, 0, 2, 12] - (2 can be added after [0] instead of 4.)
 10: m = [0, 0, 2, 10]
 6:  m = [0, 0, 2, 6]
 14: m = [0, 0, 2, 6, 14]
 1:  m = [0, 0, 1, 6, 14]
 9:  m = [0, 0, 1, 6, 9]
 5:  m = [0, 0, 1, 5, 9]
 13: m = [0, 0, 1, 5, 9, 13]
 3:  m = [0, 0, 1, 3, 9, 13]
 11: m = [0, 0, 1, 3, 9, 11]
 7:  m = [0, 0, 1, 3, 7, 11]
 15: m = [0, 0, 1, 3, 7, 11, 15]

Now we know there is a subsequence of length 6, ending in 15. The actual values in the subsequence can be recovered by recording, in the P array during the loop, which element comes before each one.
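A minimal Java sketch of the length-only version described above, assuming a strictly increasing subsequence is wanted. The class and method names (LisLength, lisLength) are made up for illustration; m and L follow the definitions given earlier, and the binary search looks for the longest existing subsequence that x[i] can extend.

```java
// Hypothetical class and method names, used only for illustration.
class LisLength {

    // Returns the length of a longest strictly increasing subsequence of x.
    static int lisLength(int[] x) {
        // m[j] = index in x of the smallest last element among all increasing
        // subsequences of length j found so far; m[0] is unused.
        int[] m = new int[x.length + 1];
        int L = 0; // length of the longest subsequence found so far

        for (int i = 0; i < x.length; i++) {
            // Binary search over lengths 0..L for the largest j such that
            // x[m[j]] < x[i]; j = 0 means no existing subsequence qualifies.
            int lo = 0, hi = L;
            while (lo < hi) {
                int mid = lo + (hi - lo + 1) / 2; // round up so mid >= 1
                if (x[m[mid]] < x[i]) {
                    lo = mid;
                } else {
                    hi = mid - 1;
                }
            }
            // x[i] ends a subsequence of length lo + 1. It is either the
            // first of that length or no larger than the tail it replaces.
            m[lo + 1] = i;
            if (lo + 1 > L) {
                L = lo + 1;
            }
        }
        return L;
    }

    public static void main(String[] args) {
        int[] x = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
        System.out.println(lisLength(x)); // prints 6
    }
}
```

The values x[m[1]], ..., x[m[L]] are always increasing, which is what makes the binary search valid and gives O(log n) per element, O(n log n) overall. To allow equal neighbours (a non-decreasing subsequence), the comparison in the search would be <= instead of <.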

Retrieving the best sub-sequence:

P stores the previous element in the longest subsequence (as an index of x), for each number, and is updated as the algorithm advances. For example, when we process 8, we know it comes after 0, so store the fact that 8 is after 0 in P. You can work backwards from the last number like a linked-list to get the whole sequence.

So for each number we know the number that came before it. To find the subsequence ending in 7, we look at P and see that:

7 is after 3
3 is after 1
1 is after 0

So we have the subsequence [0, 1, 3, 7].

The subsequences ending in 11 and 15 share most of their numbers:

15 is after 11
11 is after 9
9 is after 6
6 is after 2
2 is after 0

So we have the subsequences [0, 2, 6, 9, 11], and [0, 2, 6, 9, 11, 15] (the longest increasing subsequence)
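The backward walk through P can be sketched in Java as follows. This is an illustration under assumptions rather than the exact code behind the answer: the names LisWithPredecessors, lis and p are hypothetical, the subsequence is taken to be strictly increasing, and p[i] holds -1 when x[i] has nothing before it.

```java
import java.util.Arrays;

// Hypothetical class and method names, used only for illustration.
class LisWithPredecessors {

    // Returns one longest strictly increasing subsequence of x.
    static int[] lis(int[] x) {
        int n = x.length;
        int[] m = new int[n + 1]; // m[j]: index of the best tail for length j (m[0] unused)
        int[] p = new int[n];     // p[i]: index of the element before x[i], or -1
        int L = 0;

        for (int i = 0; i < n; i++) {
            // Largest j in 0..L with x[m[j]] < x[i] (j = 0 means "none").
            int lo = 0, hi = L;
            while (lo < hi) {
                int mid = lo + (hi - lo + 1) / 2;
                if (x[m[mid]] < x[i]) {
                    lo = mid;
                } else {
                    hi = mid - 1;
                }
            }
            // x[i] comes after the current tail of length lo, if there is one.
            p[i] = (lo > 0) ? m[lo] : -1;
            m[lo + 1] = i;
            L = Math.max(L, lo + 1);
        }

        // Follow the predecessor links back from the tail of the longest one.
        int[] result = new int[L];
        int k = (L > 0) ? m[L] : -1;
        for (int j = L - 1; j >= 0; j--) {
            result[j] = x[k];
            k = p[k];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] x = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
        System.out.println(Arrays.toString(lis(x))); // prints [0, 2, 6, 9, 11, 15]
        System.out.println(Arrays.toString(lis(new int[] {5, 6, 2}))); // prints [5, 6]
    }
}
```

The second call shows why P is needed: after processing [5, 6, 2], m holds the tails 2 and 6, so reading the values off m alone would not give a valid subsequence, while the links in p still lead from 6 back to 5.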

fgb
  • 2
    fgb !!! wow !!! you are great !!!! I understood how it works.. I really want to thank to you and of course Jeffrey Greenham and sleske for their quick response !!! – pappu Feb 12 '11 at 01:46
  • 18
    This is far better than the wikipedia explanation – GBa Mar 15 '12 at 18:32
  • 1
    I totally agree with @WillDen. This is much better than the current explanation on Wikipedia. – user183037 Apr 04 '12 at 17:43
  • 7
    @pappu: If the answer worked for you, you should consider 'accepting the answer'. That's done by clicking on the arrow below the number of votes on this answer. – user183037 Apr 04 '12 at 17:45
  • @fgb: I love this answer. It'll be perfect if it would add the explanation of where and why a binary search is used. OP asked about it in the last paragraph. – Jungle Hunter Apr 22 '12 at 07:58
  • 4
    And if the OP is reading the binary search is used to get the location where the element being considered can be inserted in M. You can iterate over M in O(N) but best is O(logN) using binary search since M is an increasing sequence. – Jungle Hunter Apr 22 '12 at 08:30
  • I'm confused as to why we need P at all -- as the example suggests, you can get the actual sequence very simply by discarding the first element from M and using the remaining elements to look up characters in X: [X[M[1]], X[M[2]], ..., X[M[L]]]. – j_random_hacker Nov 21 '12 at 16:04
  • 4
    @j_random_hacker There's not enough information in M. For the input [5, 6, 2], the longest sequence is [5, 6], but the best sequence for length 1 is [2], so M = [0, 2, 6] - it doesn't contain the 5 because it was overwritten by a smaller value. – fgb Nov 21 '12 at 18:16
  • @fgb: great post, although I'm slightly confused as to why this is an O(n log n) solution. you run through the elements in the array once, so I see an O(n) solution. – user1246462 Nov 29 '12 at 02:41
  • 1
    @user1246462 There's also the O(log n) binary search for each element to find where to insert it into `m`. – fgb Nov 29 '12 at 03:00
  • @fgb Any ideas how to adapt to find the longest strictly increasing sequence. I can't simply filter the longest increasing subsequence to remove duplicates as e.g. `[3,3,3,1,2]` would give a result of `[3]` where the correct answer is `[1,2]` – wheresrhys Jan 06 '13 at 12:10
  • 1
    @fgb, http://en.wikipedia.org/wiki/Subsequence tells that in a subsequence elements preserve order, but in the above ans (m = [0, 0, 1, 3, 7, 11, 15]) 7 & 11 don't preserve the order. Correct me if I'm wrong. – Jaydeep Solanki Jul 01 '13 at 05:04
  • 1
    @Jaydeep m isn't a single subsequence. Each element represents a different subsequence by storing its ending value. So m is ordered by length. – fgb Jul 01 '13 at 09:36
  • @fgb +1. nice explanation. Can somebody tell me how to print the numbers which are included in the LIS? Thanks. – Trying Aug 02 '13 at 10:08
  • @fgb How is 7 before 11 in the final example? isn't that incorrect comparing it to the given example as 7 is after 11. I guess the answer should end with 7 15 than 7, 11, 15 as that is changing the order in the example. – nmd Dec 09 '13 at 08:41
  • 1
    @NitishMD The numbers in X are only the final values. 7 represents the subsequence [0, 1, 3, 7], and 11 represents [0, 2, 6, 9, 11], so the numbers that come before them are in the correct order. I've added some more information on getting the whole subsequence back at the end. – fgb Dec 09 '13 at 21:22
4

One of the best explanations of this problem is given on an MIT site: http://people.csail.mit.edu/bdean/6.046/dp/

I hope it clears up all your doubts.

mridul
1

Based on fgb's answer, here is a Java implementation:

public class Lis {

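// Returns one longest strictly increasing subsequence of arr.
private static int[] findLis(int[] arr) {
    if (arr.length == 0) {
        return new int[0];
    }
    // is[j] is the index in arr of the smallest known last element of an
    // increasing subsequence of length j + 1.
    int[] is = new int[arr.length];
    // prev[i] is the index of the element before arr[i] in the best
    // subsequence ending at arr[i], or -1 if arr[i] starts it.
    int[] prev = new int[arr.length];
    int index = 0;
    is[0] = 0;
    prev[0] = -1;

    for (int i = 1; i < arr.length; i++) {
        if (arr[i] > arr[is[index]]) {
            // arr[i] extends the longest subsequence found so far.
            prev[i] = is[index];
            is[++index] = i;
        } else {
            // Linear scan for the first last element that is >= arr[i].
            // Using <= keeps the result strictly increasing; a binary
            // search here would make the whole method O(n log n).
            for (int j = 0; j <= index; j++) {
                if (arr[i] <= arr[is[j]]) {
                    prev[i] = j > 0 ? is[j - 1] : -1;
                    is[j] = i;
                    break;
                }
            }
        }
    }

    // Walk the prev links back from the end of the longest subsequence.
    int[] lis = new int[index + 1];
    for (int i = index, k = is[index]; i >= 0; i--, k = prev[k]) {
        lis[i] = arr[k];
    }

    return lis;
}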

public static void main(String[] args) {
    int[] arr = new int[] { 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11,
            7, 15 };
    for (int i : findLis(arr)) {
        System.out.print(i + "-");
    }
    System.out.println();

    arr = new int[] { 1, 9, 3, 8, 11, 4, 5, 6, 4, 19, 7, 1, 7 };
    for (int i : findLis(arr)) {
        System.out.print(i + "-");
    }
    System.out.println();
}

}

James Yu
  • 1
    Answer is not useful without explanation. – Pureferret Sep 04 '12 at 08:30
  • There's a bug in this - it won't work for `{ 0, 8, 4, 12, 2, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15 }`, and it looks like it's closer to finding a strictly increasing sequence than an increasing sequence (but cheers for posting though - a great help in working out a javascript implementation) – wheresrhys Jan 06 '13 at 17:33
1

Below is an O(N log N) implementation that returns the length of the longest increasing subsequence:

// Binary search for the index in result[p..r] that x should replace: the
// first index whose value is >= x. findLIS() handles x < result[0] and
// x > result[index] itself, so such an index always exists in that range.
public static int search(int[] result, int p, int r, int x)
{
    while(p < r)
    {
        int q = p + (r - p) / 2;
        if(result[q] < x)
        {
            p = q + 1; // everything up to q is too small
        }
        else
        {
            r = q; // result[q] is a candidate; keep looking to its left
        }
    }
    return p;
}

// Returns the length of the longest strictly increasing subsequence of a.
public static int findLIS(int[] a)
{
    if(a.length == 0) return 0;
    int[] result = new int[a.length];
    result[0] = a[0];
    int index = 0;
    for(int i=1; i<a.length; i++)
    {
        int no = a[i];
        if(no < result[0]) // replacing the min number
        {
            result[0] = no;
        }
        else if(no > result[index]) // if the number is bigger than the current biggest, append it
        {
            result[++index] = no;
        }
        else
        {
            int c = search(result, 0, index, no);
            result[c] = no;
        }
    }
    return index+1;
}
Trying
0

Based on @fgb's answer, I implemented the algorithm in C++ to find the longest strictly increasing subsequence. Hope this will be somewhat helpful.

M[i] is the index of the last element of the sequence whose length is i + 1, and P[i] is the index of the element that comes before element i in its sequence (-1 if there is none), which is used to print the whole sequence.

main() is used to run the simple test case: {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15}.

#include <cstdio>
#include <vector>
using std::vector;
int LIS(const vector<int> &v) {
  int size = v.size();
  if (size == 0) return 0;  // Nothing to search in an empty sequence
  int max_len = 1;
  // M[i] is the index of the last element of the sequence whose length is i + 1
  vector<int> M(size);
  // P[i] is the index of the previous element of i in the sequence, which is used to print the whole sequence
  vector<int> P(size);
  M[0] = 0; P[0] = -1;
  for (int i = 1; i < size; ++i) {
    if (v[i] > v[M[max_len - 1]]) {
      M[max_len] = i;
      P[i] = M[max_len - 1];
      ++max_len;
      continue;
    }
    // Find the position to insert i using binary search
    int lo = 0, hi = max_len - 1;
    while (lo <= hi) {
      int mid = lo + ((hi - lo) >> 1);
      if (v[i] < v[M[mid]]) {
        hi = mid - 1;
      } else if (v[i] > v[M[mid]]) {
        lo = mid + 1;
      } else {
        lo = mid;
        break;
      }
    }
    // The previous element is the last element of the sequence that is one shorter
    P[i] = lo > 0 ? M[lo - 1] : -1;
    M[lo] = i;
  }
  // Collect the subsequence by following P backwards, then print it in order
  vector<int> seq(max_len);
  for (int i = M[max_len - 1], k = max_len - 1; i >= 0; i = P[i], --k) {
    seq[k] = v[i];
  }
  for (int k = 0; k < max_len; ++k) {
    printf("%d ", seq[k]);
  }
  printf("\n");
  return max_len;
}
int main(int argc, char* argv[]) {
  int data[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
  vector<int> v(data, data + sizeof(data) / sizeof(int));
  LIS(v);
  return 0;
}
Yaguang
0

Late to the party, but here's a JavaScript implementation to go along with the others.. :)

// Note: this is the simpler O(n^2) dynamic-programming approach,
// not the O(n log n) one discussed in the other answers.
var findLongestSubsequence = function(array) {
  // longestPartialSubsequences[i] is the longest increasing subsequence ending at array[i]
  var longestPartialSubsequences = [];
  var longestSubsequenceOverAll = [];

  for (var i = 0; i < array.length; i++) {
    var valueAtI = array[i];
    var subsequenceEndingAtI = [];

    // Find the longest earlier subsequence that array[i] can extend
    for (var j = 0; j < i; j++) {
      var subsequenceEndingAtJ = longestPartialSubsequences[j];
      var valueAtJ = array[j];

      if (valueAtJ < valueAtI && subsequenceEndingAtJ.length > subsequenceEndingAtI.length) {
        subsequenceEndingAtI = subsequenceEndingAtJ;
      }
    }

    // Copy it (concat() with no arguments returns a shallow copy) and append array[i]
    longestPartialSubsequences[i] = subsequenceEndingAtI.concat();
    longestPartialSubsequences[i].push(valueAtI);

    if (longestPartialSubsequences[i].length > longestSubsequenceOverAll.length) {
      longestSubsequenceOverAll = longestPartialSubsequences[i];
    }
  }

  return longestSubsequenceOverAll;
};
bvaughn
-7

Can we implement this for a 2D array? For example, if the input is

 8    2     4
 0    7     1
 3    7     9

and the LIS is 0 -> 2 -> 4 -> 7 -> 8, what is the algorithm for this?

Mtech