
I'm implementing heapsort using a heap. Each value to be sorted is inserted, and the insertion method calls heapifyUp() (aka siftUp), so heapifyUp is called every time another value is inserted. Is this the most efficient way?

Another idea would be to insert all the elements first and then call heapifyUp(). I guess heapifyUp() would have to be called on each one? Is doing it that way better?

Celeritas

1 Answer


Inserting the elements one at a time builds the heap in O(n log n) time. It's the same if you add all the elements to the array first and then call heapifyUp() on each one.
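The insert-and-sift-up approach from the question could be sketched like this in Python (a minimal sketch of a 0-based min-heap; the names `heapify_up` and `build_by_insertion` are mine, not from the question):

```python
def heapify_up(heap, i):
    # Bubble the item at index i up until its parent is no larger.
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent

def build_by_insertion(values):
    # Insert each value at the bottom, then sift it up: O(log n)
    # worst case per insert, O(n log n) for the whole build.
    heap = []
    for v in values:
        heap.append(v)
        heapify_up(heap, len(heap) - 1)
    return heap
```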

Floyd's Algorithm builds the heap bottom-up in O(n) time. The idea is that you take an array that's in any order and, starting in the middle, sift each item down to its proper place. The algorithm is:

for i = array.length/2 downto 0
{
    siftDown(i)
}

You start in the middle because roughly the last half of the items in the array are leaves, and leaves can't be sifted down. By working your way from the middle up, you reduce the number of items that have to be moved.
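The loop above, as a runnable Python sketch (0-based min-heap; `sift_down` and `floyd_build` are illustrative names, not the poster's code):

```python
def sift_down(heap, i, n):
    # Push the item at index i down until neither child is smaller.
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and heap[left] < heap[smallest]:
            smallest = left
        if right < n and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            return
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

def floyd_build(heap):
    # Indices n//2 .. n-1 are leaves, so start at n//2 and work
    # down to the root. Total work is O(n).
    n = len(heap)
    for i in range(n // 2, -1, -1):
        sift_down(heap, i, n)
    return heap
```

Running `floyd_build` on the example array `[7,5,6,1,2,3,4]` reproduces the trace shown below.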

Example of the difference

The example below, turning an array of 7 items into a heap, shows the difference in the amount of work done.

The heapifyUp() method

[7,5,6,1,2,3,4]  (starting state)

Start at the end and bubble items up.

Move 4 to the proper place

[7,5,4,1,2,3,6]
[4,5,7,1,2,3,6]

Move 3 to its place

[4,5,3,1,2,7,6]
[3,5,4,1,2,7,6]

Move 2 to its place

[3,2,4,1,5,7,6]
[2,3,4,1,5,7,6]

Move 1 to its place

[2,1,4,3,5,7,6]
[1,2,4,3,5,7,6]

The heap is now in order. It took 8 swaps, and you still have to check 4, 2, and 1 (they turn out to be in place, but you don't know that until you check).

Floyd's algorithm

[7,5,6,1,2,3,4]  (starting state)

Start at the halfway point and sift down. In a 0-based array of 7 items, the halfway point is 3.

Move 1 to its place

[7,5,6,1,2,3,4]  (no change: index 3 is a leaf, and we're sifting down)

Move 6 to its place

[7,5,3,1,2,6,4]

Move 5 to its place

[7,1,3,5,2,6,4]

(After that swap, 5 sits at index 3, which is a leaf in a 7-item array, so the sift stops there.)

Move 7 to its place

[1,7,3,5,2,6,4]
[1,2,3,5,7,6,4]

And we're done. It took 4 swaps and there's nothing else to check.

Jim Mischel
  • I also thought that building the heap once it has all the elements can be done in `O(n)` time (I've read about this method where we `siftDown()` only half the elements), but I couldn't reason about the first case. Why is it necessarily `O(n*log(n))`? Can you explain the complexity of the first case a little more? – Shubham Aug 18 '16 at 18:17
  • Thanks for the good explanation. This question asks why [Why siftDown is better than siftUp in heapify?](http://stackoverflow.com/questions/13025163/why-siftdown-is-better-than-siftup-in-heapify) but without having known what you describe here, it didn't make sense to me. – Celeritas Aug 19 '16 at 12:28
  • @JimMischel is it still `array.length/2` if the heap starts at index 1 of the array? I thought the answer is yes but when I drew it out size 5 seemed to not work. – Celeritas Aug 21 '16 at 07:32
  • @Celeritas: I made a mistake in my pseudo-code. I had written `for i = array.length/2 downto 1`. It should have been `downto 0`. With a 1-based heap, go to 1. You need to sift the root down, too. My example does that. I've corrected my code. – Jim Mischel Aug 22 '16 at 04:09
  • @JimMischel If `siftDown()` can lead to heap property, why is there a `siftUp()`? Is it more efficient when one node is being added to the bottom? – Celeritas Aug 23 '16 at 09:08
  • @Celeritas: when inserting nodes randomly, approximately half will remain at the leaf level. Half of the remaining will go up just one level, etc. If you inserted at the top, half would require the maximum number of swaps, etc. So, yes, inserting at the bottom is, *in general*, more efficient than inserting at the top. – Jim Mischel Aug 23 '16 at 13:13