Time complexity of obtaining the k smallest values overall from n sorted arrays?

Question

I have n arrays. Each of these arrays can be arbitrarily long (the lengths can vary). All n arrays are sorted.

Now I want to fetch the k smallest elements from these n sorted arrays.

For example, n=5 and k=10:

2 4 6 7 9 

23 45 67 78 99

1 2 6 9 1000 4567 6567 67876

45 56 67 78 89 102 103 104

91 991 9991 99991

Now the answer should be 1 2 4 6 7 9 23 45 56 67

Would it be O(n*k), i.e. O(n^2), in the worst case, and O(k) in the best case?

One can write many algorithms to solve this problem, and they'd have many different time complexities. Do you have a specific one in mind, or are you asking us to give you (presumably the fastest) one? If you're asking us to give you one, you should show your own attempt at coming up with one. – Bernhard Barker Jan 10 '14 at 17:28
@Dukeling I am asking for the best case, if you could give it to me. – user609306 Jan 10 '14 at 17:31
You may safely assume the length of each array is at most k. – einpoklum Aug 12 '16 at 20:41

score 8 · Answer 1 · answered Jan 10 '14 at 17:25

It's O(n + k log n), I think.

First build a heap of the smallest element in each array (storing the index of the array too). Building a heap of size n is O(n). Then, repeat k times: take the smallest element from the heap (which is O(log n)), and insert the next element from the array that element came from, if that array has any left (also O(log n)). Overall, this is O(n + k log n).
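The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not code from the answer: the function name `k_smallest` and the use of the standard `heapq` module are my own choices, and the arrays are assumed to be finite Python lists sorted in ascending order.

```python
import heapq

def k_smallest(arrays, k):
    """Return the k smallest values across the sorted `arrays`, in ascending order."""
    # One heap entry per non-empty array: (value, array index, position in that array).
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)  # O(n) for n arrays

    result = []
    while heap and len(result) < k:
        value, i, pos = heapq.heappop(heap)  # O(log n)
        result.append(value)
        # Refill the heap from the array the popped value came from, if it has more items.
        if pos + 1 < len(arrays[i]):
            heapq.heappush(heap, (arrays[i][pos + 1], i, pos + 1))  # O(log n)
    return result
```

Note that this sketch keeps duplicates. On the arrays from the question with k=10 it returns 1 2 2 4 6 6 7 9 9 23, whereas the answer listed in the question appears to count each distinct value only once.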

answered Jan 10 '14 at 17:25

Paul Hankin

Actually you should start from the largest, but yeah, that's about right. – Bernhard Barker Jan 10 '14 at 17:26
Depends if you believe the question or the example given :) – Paul Hankin Jan 10 '14 at 17:27
@VikramBhat nope, please see: http://stackoverflow.com/questions/9755721/build-heap-complexity – kyticka Jan 10 '14 at 17:51
+1. This is the answer. I originally misunderstood your use of `n` here. Good job. – Jim Mischel Jan 10 '14 at 20:07

Jim Mischel · Answer 2 · 2014-01-10T20:13:27.887

The answer provided by Anonymous is the better solution in this case because we know that the individual arrays are sorted.

You can do it with a heap in O(n log k) time, worst case, where n here is the total number of items across all the arrays. It will require O(k) extra space.

initialize a MAX heap
for each array
    for each item in the array
        if (heap.count < k)
            heap.insert(item)
        else if (item < heap.peek())
        {
            // item is smaller than the largest item on the heap
            // remove the largest item (the root) and replace it with this one
            heap.remove_root()
            heap.insert(item)
        }
        else
        {
            break;  // go to next array
            // see remarks below
        }

Because you know that the arrays are initially sorted, you can include that final optimization I showed. If the item you're looking at is not smaller than the largest item already on the heap, then you know that no other item in the current array will be smaller. So you can skip the rest of the current array.
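For readers who want something runnable, here is a rough Python rendering of the pseudocode above. It is only a sketch under a few assumptions of mine: the function name `k_smallest_bounded` is made up, the items are assumed to be numbers, and because Python's `heapq` is a min-heap, the max heap is simulated by storing negated values.

```python
import heapq

def k_smallest_bounded(arrays, k):
    """Return the k smallest values across the sorted `arrays`, in ascending order."""
    if k <= 0:
        return []
    heap = []  # holds negated values, so -heap[0] is the largest value currently kept
    for arr in arrays:
        for item in arr:
            if len(heap) < k:
                heapq.heappush(heap, -item)
            elif item < -heap[0]:
                # item is smaller than the largest kept value: swap it in.
                heapq.heapreplace(heap, -item)
            else:
                # arr is sorted, so nothing further in it can be smaller either.
                break
    return sorted(-x for x in heap)
```

The final `sorted` call is there only to return the result in ascending order; it costs an extra O(k log k) and can be dropped if the order does not matter.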

That's the algorithm to give you the smallest k items. If you want the largest k items, build a MIN heap and change if (item < heap.peek()) to if (item > heap.peek()). In that case, you would get better performance by walking the arrays backwards. That would reduce the number of heap insertions and removals. If you don't walk the arrays backwards, you won't be able to use the optimization I showed.

Another way to do it would be to concatenate all of the items into a single array and use Quickselect. Quickselect runs in O(n) time on average (O(n^2) in the worst case). Empirical evidence suggests that using a heap is faster when k < .01*n. Otherwise, Quickselect is faster. Your mileage may vary, of course, and having to create a single array from the multiple arrays will add processing and memory overhead to Quickselect.
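To make the Quickselect alternative concrete, here is a small Python sketch. It is a simplified, out-of-place variant with a random pivot rather than the classic in-place partitioning, and the function name `k_smallest_quickselect` is hypothetical. It ignores the fact that the input arrays are sorted, and it returns the k smallest items in no particular order.

```python
import random
from itertools import chain

def k_smallest_quickselect(arrays, k):
    """Return the k smallest values across `arrays`, in no particular order."""
    items = list(chain.from_iterable(arrays))  # concatenate everything first
    k = max(0, min(k, len(items)))
    result = []
    while k > 0:
        pivot = random.choice(items)
        less = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k <= len(less):
            # All k wanted items are among those smaller than the pivot.
            items = less
        elif k <= len(less) + len(equal):
            # Everything smaller than the pivot, plus enough copies of the pivot.
            result += less + equal[:k - len(less)]
            k = 0
        else:
            # Keep the smaller and equal items, then continue in the larger ones.
            result += less + equal
            k -= len(less) + len(equal)
            items = [x for x in items if x > pivot]
    return result
```

Each pass discards part of the remaining items, which is where the linear average running time comes from; the list copies made here trade some memory for readability.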
