
Why is quicksort (or introsort), or any comparison-based sorting algorithm, more common than radix sort? Especially for sorting numbers.

Radix sort is not comparison based, hence it may be faster than O(n log n). In fact, it is O(kn), where k is the number of bits used to represent each item. And the memory overhead is not critical, since you can choose the number of buckets to use, so the required memory may be less than mergesort's requirements.
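
(To make the numbers concrete: for 32-bit keys processed 8 bits at a time, that is k/8 = 4 passes over the data with 256 buckets, i.e. a 256-entry count array plus, in typical implementations, one n-element buffer.)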

Does it have to do with caching? Or maybe with accessing random bytes of integers in the array?

Daniyar

6 Answers


Two arguments come to my mind:

  1. Quicksort/Introsort is more flexible:

    Quicksort and Introsort work well with all kinds of data. All you need for sorting is the possibility to compare items. This is trivial with numbers but you can sort other data as well.

    Radix sort on the other hand just sorts things by their binary representation. It never compares items against each other.

  2. Radix sort needs more memory.

    All radix sort implementations that I've seen use a secondary buffer to store partial sorting results. This increases the memory requirements of the sorting algorithm. That may not be a problem if you only sort a couple of kilobytes, but if you go into the gigabyte range it makes a huge difference (see the sketch after this list).

    If I remember right, an in-place radix sort algorithm exists on paper, though.
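
For illustration, a minimal buffer-based LSD radix sort for unsigned 32-bit keys might look something like this (a sketch, not production code): it never compares two keys, and it carries an n-element secondary buffer the whole time.

    #include <cstdint>
    #include <vector>

    // LSD radix sort sketch for unsigned 32-bit keys: four passes of
    // 8 bits each (256 buckets). Note the n-element secondary buffer,
    // and that no two keys are ever compared against each other.
    void radix_sort(std::vector<uint32_t>& a) {
        std::vector<uint32_t> buf(a.size());            // secondary buffer
        for (int shift = 0; shift < 32; shift += 8) {
            size_t count[256] = {};
            for (uint32_t x : a) ++count[(x >> shift) & 0xFF];  // histogram
            size_t pos[256];
            size_t sum = 0;
            for (int b = 0; b < 256; ++b) { pos[b] = sum; sum += count[b]; }
            for (uint32_t x : a) buf[pos[(x >> shift) & 0xFF]++] = x;  // stable scatter
            a.swap(buf);                                // even pass count: result ends in a
        }
    }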

Nils Pipenbrinck
  • The second argument is half-wrong. It is true that radix sort needs more memory, but the memory required depends on the number of bits you use on each pass (i.e. the number of buckets). Hence, the memory required may well be less than the requirements of mergesort, for example. – Daniyar Aug 22 '10 at 07:47
  • The first argument is true, but I'm more interested in the fact that the default sorting algorithms for numbers are implemented using quicksort, especially the implementations in libraries. And the fact that radix sort never compares items against each other is a good thing, since otherwise its complexity would be limited to O(n·log n). – Daniyar Aug 22 '10 at 07:52
  • It's possible to do a stable two-way in-place partitioning operation in N·lg N time with constant space. One could thus do an in-place radix sort in constant space in b·N·lg N time, where 'b' is the number of bits of radix. – supercat May 04 '12 at 15:38

One obvious answer is that you can sort arbitrary types using quicksort (i.e. anything that's comparable), while with radix sort you are restricted to numbers only. And IMO quicksort is a lot more intuitive.
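
For instance (a sketch, assuming C++ and std::sort, which major standard libraries implement as an introsort), any ad-hoc ordering can be plugged into a comparison sort, whereas radix sort would first need a way to encode that ordering into a binary key:

    #include <algorithm>
    #include <string>
    #include <vector>

    int main() {
        std::vector<std::string> words = {"pear", "fig", "apple", "kiwi"};
        // Sort by length, breaking ties lexicographically: all a
        // comparison sort needs is this one predicate.
        std::sort(words.begin(), words.end(),
                  [](const std::string& x, const std::string& y) {
                      if (x.size() != y.size()) return x.size() < y.size();
                      return x < y;
                  });
        // words is now: fig, kiwi, pear, apple
    }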

NullUserException
  • IMO Bubble Sort is more intuitive than Quicksort. – Justin Ardini Aug 21 '10 at 22:32
  • @Justin Indeed, but it's a heck of a lot slower. – NullUserException Aug 21 '10 at 22:33
  • True, but I'm more interested in the fact that the default sorting algorithms for numbers are implemented using quicksort, especially the implementations in libraries, since intuitiveness is not of high importance if the implementation of the sort() function is under the hood. – Daniyar Aug 22 '10 at 07:44

Radix sort is slower for (most) real world use cases.

One reason is the complexity of the algorithm:

If items are unique, k >= log(n). Even with duplicate items, the set of problems where k < log(n) is small.
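
To make that concrete: sorting n = 2^20 ≈ 10^6 distinct keys requires k ≥ log2(n) = 20 bits just to distinguish the keys; if they are stored as 64-bit integers, k = 64 > 20, and the O(kn) bound already exceeds O(n·log n).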

Another is the implementation:

The additional memory requirement (which is in itself a disadvantage) affects cache performance negatively.

I think it is safe to say that many libraries, like the standard library, use Quicksort because it performs better in the majority of cases. I don't think that "difficult implementation" or "less intuitive" are major factors.

Plow
  • Broadly speaking, I guess there are two reasons to worry about the speed of a sort: either because you're sorting many small lists, or because you're sorting one gigantic list. If you're sorting small lists of integers, then maybe it's reasonable to assume that there won't be too many duplicates (depending on how they were generated), but if you're sorting 100 billion 32-bit integers then there will necessarily be a lot of duplicates. So one's use case matters. But I agree that *most* programs are much more likely to have to sort small lists frequently than to have to sort a humongous list. – Tim Goodman Oct 18 '16 at 17:33

As mentioned on Wikipedia:

The topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky and subject to quite a lot of misunderstandings. Whether radix sort is equally efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made. Radix sort efficiency is O(d·n) for n keys which have d or fewer digits. Sometimes d is presented as a constant, which would make radix sort better (for sufficiently large n) than the best comparison-based sorting algorithms, which all need O(n·log(n)) comparisons. However, in general d cannot be considered a constant. In particular, under the common (but sometimes implicit) assumption that all keys are distinct, d must be at least of the order of log(n), which gives at best (with densely packed keys) a time complexity of O(n·log(n)). That would seem to make radix sort at most equally efficient as the best comparison-based sorts (and worse if keys are much longer than log(n)).

The counterargument is that comparison-based algorithms are measured in the number of comparisons, not in actual time complexity. Under some assumptions the comparisons will be constant time on average, under others they will not. Comparisons of randomly generated keys take constant time on average, as keys differ on the very first bit in half the cases, and differ on the second bit in half of the remaining half, and so on, resulting in an average of two bits that need to be compared. In a sorting algorithm the first comparisons made satisfy the randomness condition, but as the sort progresses the keys compared are clearly not randomly chosen anymore. For example, consider a bottom-up merge sort. The first pass will compare pairs of random keys, but the last pass will compare keys that are very close in the sorting order.

The deciding factor is how the keys are distributed. The best case for radix sort is that they are taken as consecutive bit patterns. This will make the keys as short as they can be, still assuming they are distinct. This makes radix sort O(n·log(n)), but the comparison-based sorts will not be as efficient, as the comparisons will not be constant time under this assumption. If we instead assume that the keys are bit patterns of length k·log(n) for a constant k > 1 and base 2 log, and that they are uniformly random, then radix sort will still be O(n·log(n)), but so will the comparison-based sorts, as the "extra" length makes even the keys that are consecutive in the sorted result differ enough that comparisons are constant time on average. If keys are longer than O(log(n)), but random, then radix sort will be inferior. There are many other assumptions that can be made as well, and most require careful study to make a correct comparison.
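
The "average of two bits" claim above is easy to check empirically. A quick sketch (assuming a GCC/Clang compiler for the __builtin_clzll intrinsic):

    #include <cstdint>
    #include <cstdio>
    #include <random>

    // Estimate how many bits a comparison of two independent random
    // 64-bit keys must examine before finding a difference (expect ~2).
    int main() {
        std::mt19937_64 rng(42);
        const int trials = 1000000;
        long long total = 0;
        for (int i = 0; i < trials; ++i) {
            uint64_t d = rng() ^ rng();          // XOR of two random keys
            if (d == 0) continue;                // identical keys: ~2^-64 chance
            total += __builtin_clzll(d) + 1;     // 1-based index of first differing bit
        }
        std::printf("average bits examined: %.3f\n", (double)total / trials);
    }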

Abhinav Chauhan
  • That section has since been deleted from Wikipedia; discussion on the talk page deemed parts of it incorrect. – T - Gott Feb 16 '21 at 22:01

Points made in other answers are valid, but as for the concern you mentioned in several comments:

...the fact that the default sorting algorithms for numbers are implemented using quicksort. Especially the implementations in libraries...

Quicksort is the 'safe' choice. The potential runtime of a radix sort based on a counting sort is very attractive, yes, but radix sort is susceptible to performing poorly on malicious or unfortunate datasets. If the number of digits in the keys being sorted approaches the number of keys being sorted, radix sort degrades toward O(n^2), along with a non-negligible space complexity, and it tends to carry fairly high built-in constant factors beyond the cost of the number of digits in the keys.

Mergesort is attractive because its behavior is, in some ways, analogous to a quicksort that picks an optimal pivot at each opportunity (the median). However, it comes with an appreciable space complexity. It is not as susceptible to malicious or unfortunate data as radix sort, but it also does not offer the attractive possible runtime. A basic quicksort performs very well on most datasets except nearly (or completely) sorted ones, and comes with a tiny space complexity.

Quicksort's vulnerability is easily dealt with by converting it to a randomized quicksort. Radix sort's vulnerability is resolved by placing restrictions on the keys being sorted, which would inherently limit a library's users. Quicksort is more performant than mergesort on small datasets, and performs reasonably when mergesort might be faster.
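
A sketch of that randomization (illustrative, not from any particular library): because the pivot is chosen uniformly at random, no fixed input can reliably force the quadratic worst case.

    #include <algorithm>
    #include <random>
    #include <vector>

    // Randomized quicksort sketch (Lomuto partition). The random pivot
    // makes the O(n^2) case a matter of bad luck rather than bad input.
    // Call as: std::mt19937 rng(seed); quicksort(v, 0, v.size(), rng);
    void quicksort(std::vector<int>& a, int lo, int hi, std::mt19937& rng) {
        if (hi - lo < 2) return;
        std::uniform_int_distribution<int> pick(lo, hi - 1);
        std::swap(a[pick(rng)], a[hi - 1]);      // move random pivot to the end
        int pivot = a[hi - 1], i = lo;
        for (int j = lo; j < hi - 1; ++j)
            if (a[j] < pivot) std::swap(a[i++], a[j]);
        std::swap(a[i], a[hi - 1]);              // pivot into its final position
        quicksort(a, lo, i, rng);                // sort [lo, i)
        quicksort(a, i + 1, hi, rng);            // sort [i+1, hi)
    }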

When implementing a library, you want to make it generically useful. Take two examples: a web application and a small device with an extremely restricted microcontroller. Web applications need to deal with malicious data on a regular basis, and they also have a wide variety of needs; a library with preconditions on its inputs is less likely to be useful. The microcontroller may be so limited on space that it cannot relinquish the slightest bit where one can be saved. Quicksort saves space, and when a situation arises in which it is slower, it is slower only by a constant multiplier.

In sum:

  1. Libraries are often coded for as much generic usability as possible.
  2. Good performance all around is acceptable, especially if it is, in many cases, the best performance.
  3. Space is not always a primary issue, but when it is, it is often explicitly and restrictively so.

Culex

Radix sort's efficiency is O(c·n), where c is the highest number of digits among the input key set and n is the number of keys in the input key set.

Quicksort's best case is O(n·log n), where n is the number of keys in the input key set.

Assume 16 numbers to be sorted with 6 digits each:

Radix sort: 16 * 6 = 96 time units. Quicksort: 16 * log2(16) = 16 * 4 = 64 time units.

Lesson: when 'c' is small, radix sort wins; when it's large, it loses. Quicksort is independent of the number of digits in a key, and that makes it somewhat better and more practically acceptable.
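
Put differently (a back-of-the-envelope comparison that ignores constant factors): radix wins exactly when c < log2(n). With n = 16 the crossover is log2(16) = 4 digits, so 6-digit keys lose; the same 6-digit keys would win for n = 1,000,000, where log2(n) ≈ 20.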

HebeleHododo
  • Quicksort requires O(n log n) **comparisons** (also important that this is average case, not worst case). This is important, because it means quicksort is *not* "independent of number of digits in a key". It means you are comparing apples to oranges. If you want to compare like to like it means you have to account for the cost of executing the comparison function. For word size integers it is constant, but that is *not* the general case. – Tim Seguine May 14 '17 at 14:04
  • Would suggest changing Radix's win condition to "When 'c' is less or when 'n' is large"; Radix should win in cases where c < log n. So for instance sorting pixel values on a megapixel camera image should be much faster with Radix sort – Michael Oct 24 '18 at 22:54