
I'm trying to understand the time complexity of a k-way merge using a heap. Although there is plenty of literature on it, I can't find a source that breaks the analysis down in a way I can follow.

This Wikipedia article claims that "In an O(k) preprocessing step the heap is created using the standard heapify procedure". However, heap insertion is O(log(n)) and find-min is O(1). We start by inserting the first element of each array into the heap. This takes ∑log(i) time for i = 1 to k, which is O(log(k!)) = O(klog(k)), and that seems to contradict the Wikipedia complexity analysis.
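To make the difference between the two ways of building the heap concrete, here is a minimal sketch that counts comparisons for k successive insertions versus one bottom-up heapify. It assumes Python's standard `heapq` module; the `Counted` wrapper, the two helper functions, and the choice of k = 1024 with descending input are illustrative and not taken from the Wikipedia article.

```python
import heapq


class Counted:
    """Wraps a value and counts every `<` comparison made on instances."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def count_build_by_push(values):
    """Build a heap with one heappush per value; return the comparison count."""
    Counted.comparisons = 0
    heap = []
    for v in values:
        heapq.heappush(heap, Counted(v))
    return Counted.comparisons


def count_build_by_heapify(values):
    """Build a heap bottom-up with heapify; return the comparison count."""
    Counted.comparisons = 0
    heap = [Counted(v) for v in values]
    heapq.heapify(heap)
    return Counted.comparisons


k = 1024
heads = list(range(k, 0, -1))  # descending input: every push bubbles up to the root
pushes = count_build_by_push(heads)
heapified = count_build_by_heapify(heads)
print(pushes, heapified)
```

On this input the push-based build should need several times more comparisons than heapify, which stays within a small constant multiple of k. Other inputs change the exact counts, but not the O(klog(k)) versus O(k) worst-case gap.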

We then remove the min element and insert the next element from the array that the min element came from. Finding the min is O(1) and restoring the heap after the replacement is O(log(k)), so each step takes O(1) + O(log(k)) time, which we repeat n - 1 times. Overall time:

O(klog(k)) + O(n - 1) + O((n - 1)log(k)) ≅ O(klog(k)) + O(n) + O(nlog(k))
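The procedure analysed above can be sketched as follows. This is one possible rendering rather than the Wikipedia pseudocode: it assumes each input list is already sorted in ascending order, uses Python's `heapq`, and the names `k_way_merge` and `arrays` are made up for illustration.

```python
import heapq


def k_way_merge(arrays):
    """Merge sorted lists into one sorted list using a heap of at most k entries."""
    # One (value, array index, position) entry per non-empty array.
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)  # bottom-up build: O(k)

    merged = []
    while heap:
        value, i, pos = heap[0]  # find-min: O(1)
        merged.append(value)
        if pos + 1 < len(arrays[i]):
            # Swap the root for the next element of the same array: O(log k).
            heapq.heapreplace(heap, (arrays[i][pos + 1], i, pos + 1))
        else:
            # That array is exhausted, so the heap shrinks: O(log k).
            heapq.heappop(heap)
    return merged


print(k_way_merge([[1, 4, 7], [2, 5, 8], [0, 3, 6, 9]]))
```

The heap never holds more than k entries, so every heap operation inside the loop costs O(log(k)) regardless of n.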

Wikipedia claims: "the total running time is O(n log k)". How is that?

Abhijit Sarkar
  • Hint 1: Heapify is more efficient than inserting n elements one by one. Hint 2: n ≥ k. (Otherwise it means you have a list with 0 elements.) – Raymond Chen Nov 04 '18 at 03:43
  • @RaymondChen So `2 * O(nlog(k)) + O(n)`, then ignore the linear term? Why? Please elaborate the statement "Heapify is more efficient than inserting n elements one by one", how so? – Abhijit Sarkar Nov 04 '18 at 03:47
  • https://stackoverflow.com/questions/9755721/how-can-building-a-heap-be-on-time-complexity – Raymond Chen Nov 04 '18 at 03:50
  • You can ignore the linear term because you know that `k ≥ 2`. (Otherwise there is nothing to merge.) – Raymond Chen Nov 04 '18 at 18:25
  • I asked a question on the other thread that you linked to, but it hasn't been answered yet. In order to implement the heap as an array, you'd start from the end of the k arrays, and insert n/2 elements in the heap array from the end. After that, where do you insert the next element, which is the parent of one of the n/2 elements at height h? – Abhijit Sarkar Nov 04 '18 at 19:01
  • That question is about building one heap in O(n), not building k heaps. Don't confuse that other question by talking about k heaps. But if one heap can be built in O(n) then k heaps can be built in O(n_1) + O(n_2) + … + O(n_k), which is O(sum n_i) = O(n). This is not really a programming question any more; it's a question about the theory of computation. Also, this site isn't suited to this type of discussion. You may be better off finding an instructor to help explain it. – Raymond Chen Nov 04 '18 at 19:59
  • "this site isn't suited to this type of discussion." Actually, this site used to be exactly for this type of discussions, before it became geared towards "do my job for me while I go have a beer" type of questions. – Abhijit Sarkar Nov 04 '18 at 20:21
  • I don't think it was ever "teach me how to do complexity analysis step by step." – Raymond Chen Nov 04 '18 at 21:32
  • @RaymondChen Does the most voted complexity analysis question [What is a plain English explanation of “Big O” notation?](https://stackoverflow.com/q/487258/839733) look like rocket science to you? We don't need more patronizing here than we have these days. All questions are valid and intelligent questions if you know how to answer them, except for the ones that show no apparent effort on the OP's part. – Abhijit Sarkar Nov 04 '18 at 21:38
  • I believe I already answered all your questions. – Raymond Chen Nov 05 '18 at 02:42
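
Putting the hints from the comments above together, the bound can be written out as below. This is a sketch under two assumptions: n is the total number of elements across all k arrays, every array is non-empty (so k ≤ n), and the logarithm is base 2.

```latex
\begin{aligned}
T(n, k) &= \underbrace{O(k)}_{\text{heapify}}
         + \underbrace{(n - 1)\,\bigl(O(1) + O(\log k)\bigr)}_{\text{find-min, then replace}} \\
        &= O(k) + O(n) + O(n \log k) \\
        &= O(n \log k)
        \qquad \text{since } k \le n \text{ and } \log k \ge 1 \text{ for } k \ge 2 .
\end{aligned}
```

The same total follows even if the heap is built by k successive insertions, because k log k ≤ n log k whenever k ≤ n; the O(k) heapify only tightens the preprocessing step, not the overall bound.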
