2

Bucket sort is a linear-time sort.

Why do we use insertion sort inside it? We know that insertion sort takes O(n^2) time. Why can't we use a linear-time sort inside it instead? If each bucket is sorted with insertion sort, which is O(n^2), how can bucket sort's total complexity be O(n)? And why not use an O(n log n) sort such as merge sort or quicksort?

Yu Hao
Jawwad Rafiq

2 Answers

3

Bucket sort with insertion sort in the buckets is only reasonable when we can expect that there will be only a few items in each bucket. When there are only a few items, insertion sort is fine.
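As a rough illustration of that point, here is a minimal sketch of bucket sort with insertion sort inside each bucket. It assumes the inputs are floats in [0, 1) that are roughly uniformly distributed, and it uses one bucket per input element; the function names and the choice of bucket count are assumptions made for this example, not something stated in the question.

```python
def insertion_sort(items):
    """Sort a list in place; cheap when the list holds only a few items."""
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Shift larger items one slot to the right to make room for key.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key


def bucket_sort(values):
    """Sort floats in [0, 1), assuming they are spread fairly evenly."""
    n = len(values)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]
    for v in values:
        # min() guards against an index of n caused by rounding.
        buckets[min(int(v * n), n - 1)].append(v)
    result = []
    for bucket in buckets:
        insertion_sort(bucket)  # each bucket is expected to be tiny
        result.extend(bucket)
    return result


print(bucket_sort([0.42, 0.32, 0.33, 0.52, 0.37, 0.47, 0.51]))
```

With n items spread over n buckets, most buckets hold zero, one, or two items, so the quadratic cost of insertion sort applies only to those tiny lists. If the data clusters into a few buckets, the same code degrades toward O(n^2).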

In real life, this doesn't happen too often. Usually it's when we expect the data to be uniformly distributed, because we're sorting into hash order or something like that.

Bucket sort is most commonly used when it's the entire sort -- i.e., the buckets don't need to be sorted at all and you can just append each item into the bucket list.
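A minimal sketch of that "entire sort" case, assuming every item maps to a small integer key in a known range. The helper name `bucket_sort_by_key` and its parameters are hypothetical, chosen only for this example.

```python
def bucket_sort_by_key(items, key, num_keys):
    """Order items by an integer key in range(num_keys), with no per-bucket sort."""
    buckets = [[] for _ in range(num_keys)]
    for item in items:
        # Appending keeps items with equal keys in their original order.
        buckets[key(item)].append(item)
    # Concatenating the buckets in key order is the whole sort.
    return [item for bucket in buckets for item in bucket]


words = ["pear", "fig", "apple", "kiwi", "plum", "date"]
print(bucket_sort_by_key(words, key=len, num_keys=6))
```

The work is one pass to distribute and one pass to concatenate, so the cost is proportional to the number of items plus the number of buckets.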

Sometimes we do a top-down radix sort, which is like bucket sorting and then bucket sorting each bucket. In combination with keeping a bit-mask of non-empty buckets, this can be a very fast way to sort when the sort keys are 32-bit integers.
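One possible reading of that idea, sketched in Python. It assumes unsigned keys below 2**32, 8 bits per level, and a plain integer used as the bit-mask; these details are assumptions for illustration, and a tuned implementation would avoid allocating 256 lists per call.

```python
def msd_radix_sort(values, shift=24):
    """Sort non-negative integers below 2**32, most significant byte first."""
    if len(values) <= 1:
        return list(values)
    buckets = [[] for _ in range(256)]
    nonempty = 0  # bit d is set when bucket d holds at least one item
    for v in values:
        digit = (v >> shift) & 0xFF
        buckets[digit].append(v)
        nonempty |= 1 << digit
    result = []
    # Visit only the non-empty buckets, lowest digit first.
    while nonempty:
        low = nonempty & -nonempty       # isolate the lowest set bit
        digit = low.bit_length() - 1
        nonempty ^= low                  # clear that bit
        bucket = buckets[digit]
        if shift > 0:
            # Bucket-sort the bucket on the next lower byte.
            bucket = msd_radix_sort(bucket, shift - 8)
        result.extend(bucket)
    return result


print(msd_radix_sort([3000000000, 7, 65536, 255, 256, 7, 0, 4294967295]))
```

The mask lets each level skip empty buckets instead of scanning all 256, which matters once the recursion reaches small buckets.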

You can also do a bottom-up radix sort by repeatedly bucket sorting on a different range of bits, and just appending items to each bucket. This kind of bucket sort is stable, so when you're bucket sorting by high bits, ties are broken by the previous ordering, which you got by sorting on the lower bits.
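The bottom-up variant can be sketched as follows, again assuming unsigned keys that fit in 32 bits and 8 bits per pass. The function name and parameters are hypothetical.

```python
def lsd_radix_sort(values, bits_per_pass=8, key_bits=32):
    """Sort non-negative integers below 2**key_bits, lowest bits first."""
    mask = (1 << bits_per_pass) - 1
    items = list(values)
    for shift in range(0, key_bits, bits_per_pass):
        buckets = [[] for _ in range(mask + 1)]
        for v in items:
            # Appending preserves the order left by the previous pass,
            # which is what makes each pass stable.
            buckets[(v >> shift) & mask].append(v)
        items = [v for bucket in buckets for v in bucket]
    return items


print(lsd_radix_sort([3000000000, 7, 65536, 255, 256, 7, 0, 4294967295]))
```

Because each pass is stable, two keys that agree on the current byte stay in the order established by the lower bytes, so after the final pass the list is fully sorted.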

Matt Timmermans
0

Insertion sort takes O(n^2) time.

That's not the whole story. Insertion sort performs well on partially sorted arrays: its running time is O(n + d), where d is the number of inversions (out-of-order pairs), so it is linear whenever d is O(n).

A typical example of a partially sorted array is one where each entry is not far from its final position. That's the case here for bucket sort: once the items have been distributed into buckets, each one is already close to where it belongs.
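To make the difference concrete, here is a small sketch that counts how many element shifts insertion sort performs. The function name and the sample inputs are made up for this example.

```python
def insertion_sort_count_shifts(values):
    """Return a sorted copy of values and the number of element shifts performed."""
    items = list(values)
    shifts = 0
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]  # one shift removes one inversion
            j -= 1
            shifts += 1
        items[j + 1] = key
    return items, shifts


nearly_sorted = [1, 0, 3, 2, 5, 4, 7, 6]   # every item is at most one slot off
reversed_list = [7, 6, 5, 4, 3, 2, 1, 0]   # worst case

print(insertion_sort_count_shifts(nearly_sorted)[1])  # 4 shifts
print(insertion_sort_count_shifts(reversed_list)[1])  # 28 shifts
```

The shift count equals the number of inversions in the input, so an input whose items sit near their final positions costs roughly linear work, while a reversed input costs the full quadratic amount.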

Yu Hao