51

It seems Radix sort has a very good average case performance, i.e. O(kN): http://en.wikipedia.org/wiki/Radix_sort

Yet it seems like most people are still using Quick Sort - why is this?

JaredH
Howard
  • 32
    Most people use a sort routine provided by their preferred framework without even caring about the algorithm. – Doc Brown Nov 10 '10 at 17:05
  • 2
    Radix sort is not good with different kinds of data, but when you want to sort unsigned ints and you are doing the sort on a multi-core processor like a GPU, radix sort is faster. – tintin Oct 12 '14 at 02:51

12 Answers

30

Radix sort is harder to generalize than most other sorting algorithms. It requires fixed size keys, and some standard way of breaking the keys into pieces. Thus it never finds its way into libraries.
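As a concrete illustration of both requirements (the class name, method name, and the restriction to non-negative keys are my own assumptions, not part of the answer), here is a minimal LSD radix sort that hard-codes fixed-size 32-bit keys broken into byte-sized pieces:

```java
import java.util.Arrays;

public class LsdRadixSketch {
    // Sort non-negative 32-bit ints by processing one byte "digit" per pass,
    // least significant first. Each pass is a stable counting sort.
    public static void sort(int[] a) {
        int[] aux = new int[a.length];
        for (int shift = 0; shift < 32; shift += 8) {
            int[] count = new int[257];
            for (int v : a) count[((v >>> shift) & 0xFF) + 1]++;    // histogram
            for (int i = 0; i < 256; i++) count[i + 1] += count[i]; // prefix sums
            for (int v : a) aux[count[(v >>> shift) & 0xFF]++] = v; // stable scatter
            System.arraycopy(aux, 0, a, 0, a.length);
        }
    }

    public static void main(String[] args) {
        int[] data = {170, 45, 75, 90, 802, 24, 2, 66};
        sort(data);
        System.out.println(Arrays.toString(data)); // [2, 24, 45, 66, 75, 90, 170, 802]
    }
}
```

Note how the key size (32 bits) and the digit width (8 bits) are baked into the loop bounds; a comparison sort needs neither, which is what makes it easy to put in a library behind a single comparator parameter.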

Mark Ransom
23

The other answers here fail to give examples of when radix sort is actually used.

An example is when creating a "suffix array" using the skew DC3 algorithm (Kärkkäinen-Sanders-Burkhardt). The algorithm is only linear-time if the sorting algorithm is linear-time, and radix sort is necessary and useful here because the keys are short by construction (3-tuples of integers).
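To make the "keys are short by construction" point concrete, here is a sketch (in Java, with invented names; not code from the DC3 paper) of the tuple-sorting step: three stable counting-sort passes over the components, last component first, which is exactly where the linear-time radix sort comes in:

```java
import java.util.Arrays;

public class TripleRadixSketch {
    // Radix-sort 3-tuples lexicographically with three stable counting-sort
    // passes (last component first), as DC3 requires. Assumes every
    // component lies in [0, range).
    public static int[][] sort(int[][] tuples, int range) {
        for (int pos = 2; pos >= 0; pos--) {
            int[] count = new int[range + 1];
            for (int[] t : tuples) count[t[pos] + 1]++;             // histogram
            for (int r = 0; r < range; r++) count[r + 1] += count[r]; // prefix sums
            int[][] out = new int[tuples.length][];
            for (int[] t : tuples) out[count[t[pos]]++] = t;        // stable scatter
            tuples = out;
        }
        return tuples;
    }

    public static void main(String[] args) {
        int[][] ts = {{2,1,0}, {1,2,2}, {2,0,3}, {1,2,1}};
        System.out.println(Arrays.deepToString(sort(ts, 4)));
        // [[1, 2, 1], [1, 2, 2], [2, 0, 3], [2, 1, 0]]
    }
}
```

Since k is fixed at 3 and the component range is bounded by the alphabet/rank size, each pass is O(n) and the whole tuple sort stays linear.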

Joachim Sauer
user541686
  • 1
    Completely agree. No mentions of when it's actually used, and no real world benchmarks which compare the two algorithms. – Ivan Š Jan 23 '15 at 09:50
21

Edited according to your comments:

  • Radix sort only applies to integers, fixed-size strings, floating-point numbers and to "less than", "greater than" or "lexicographic order" comparison predicates, whereas comparison sorts can accommodate different orders.
  • k can be greater than log N.
  • Quicksort can be done in place, whereas radix sort becomes less efficient when done in place.
Alexandre C.
  • "Quick sort can be done in place" - so can binary radix sort, although that increases the likelihood that k is greater than log N. – Steve Jessop Nov 10 '10 at 17:00
  • 2
    Your first point is not quite correct - Radix sort can easily be applied to fixed length strings. And the comparison predicate is required no matter which sort algorithm you use. – Mark Ransom Nov 10 '10 at 17:10
  • 2
    "Radix sort only applies to integers": Why? I always thought if you sort by the exponent bits and the mantissa bits in the right order, you can use it to sort floating points number, too. And in theory, you *could* use it on strings, only k will almost always be greater than log N then. – Niki Nov 10 '10 at 17:12
  • @Steve, @Mark, @nikie: taken your comments into account. Thanks. – Alexandre C. Nov 10 '10 at 18:11
  • Quicksort cannot technically be done in-place -- O(log n) extra space is required to record the position of each pivot. (Usually masked because it's stored in a local variable and recursion is used.) – j_random_hacker Nov 10 '10 at 19:47
  • 2
    @j_random_hacker: *Technically* storing an index into an array of length N takes log(N) bits, so I don't think any sorting algorithm can be implemented without extra space ;-) – Niki Nov 10 '10 at 20:37
  • @nikie: If you count bits instead of log(n)-bit words then yes, log(n) bits will be needed to hold an array index, but the original input is now of size n*log(n) instead of n, and quicksort now requires O(log(n)^2) extra bits of space -- while e.g. heapsort requires only O(log n) extra bits. (But you would normally assume a "word RAM" model in which a machine word can hold n in O(1) space, and divide all quantities through by log(n).) – j_random_hacker Nov 11 '10 at 05:12
  • 2
    @j_random_hacker: this is where practicality runs up against theory and both lose. If you're assuming a fixed upper limit on the size of the input array (so that an index can be held in O(1) space), then you break the theoretical model of a limit at infinity, so it's just a question of what you salvage. If you're saying that log(n) is "really" constant, you could just as well say that log^2(n) is "really" constant. In practice, I've written a quicksort (for production) that used, instead of the call stack, a fixed-size array on the stack to store the "todo list". 240 bytes or whatever. – Steve Jessop Nov 11 '10 at 12:17
  • @Steve Jessop: I hear you. I went on a wild goose chase looking for a definitive answer to this, but the only (murky, unsatisfying) picture that emerged is that when making Big-O time/space statements people usually assume a word RAM model and that a word is at least log(n) bits. Meaning that yes, the machine capacity is implicitly assumed to scale with the input size, which is absurd, though possibly less absurd than other ways to formulate the problem. In any case there's still a log(n) factor difference in the extra space needed by quicksort and heapsort for large-enough n. – j_random_hacker Nov 11 '10 at 22:06
11

Unless you have a huge list or extremely small keys, log(N) is usually smaller than k, and rarely much larger. So choosing a general-purpose sorting algorithm with O(N log N) average-case performance isn't necessarily worse than using radix sort.

Correction: As @Mehrdad pointed out in the comments, the argument above isn't sound: Either the key size is constant, then radix sort is O(N), or the key size is k, then quicksort is O(k N log N). So in theory, radix sort really has better asymptotic runtime.

In practice, the runtimes will be dominated by terms like:

  • radix sort: c1 k N

  • quicksort: c2 k N log(N)

where c1 >> c2, because "extracting" bits out of a longer key is usually an expensive operation involving bit shifts and logical operations (or at least unaligned memory accesses), while modern CPUs can compare keys of 64, 128 or even 256 bits in one operation. So for many common cases, unless N is gigantic, c1 will be larger than c2 log(N).

Niki
  • 3
    This is not true for all cases. `k` needn't be a bit count, it could be a byte count for example - if you're sorting 4-byte integers, `N` would need to be smaller than 16 for `log N` to be less than 4. – Mark Ransom Nov 11 '10 at 03:40
  • 1
    O(N log N) is a **lie**. There is no such thing. It's O(k N log N) vs. O(k N) -- if you don't believe me, ask yourself how in the world sorting could be independent of element size. – user541686 Apr 09 '15 at 06:41
  • @Mehrdad: That seems like an argument about semantics. The way I've learned it, the N in O(N log N) is the size of the input, e.g. in bits. Then either the elements have constant size, or there are only N/k elements. – Niki Apr 09 '15 at 06:49
  • 1
    @nikie: sure, if you consider k to be constant then that's fine, but then radix sort is O(N), not O(k N). Either way you're not supposed to compare k against log N. – user541686 Apr 09 '15 at 06:50
  • @Mehrdad: I see your point. Thanks for the correction, I've updated my answer. – Niki Apr 09 '15 at 07:25
  • Great! Removed my -1 :) I've actually done the analysis before, it's a great exercise and getting nontrivial... if you have the time I suggest you go through it, because there is a crossover that you can indeed determine (at least if you neglect cache effects), but it's not as simple as k vs. log N. – user541686 Apr 09 '15 at 07:29
9

Radix sort takes O(k*n) time. But you have to ask: what is k? k is the "number of digits" (a bit simplistic, but basically something like that).

So, how many digits do you have? Quick answer: more than log(n) (log using the "digit size" as base), which makes the radix algorithm O(n log n).

Why is that? If you have fewer than log(n) digits, then you have fewer than n possible numbers. Hence you could simply use counting sort, which takes O(n) time (just count how many of each number you have). So I assume you have k > log(n) digits...
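The counting-sort fallback mentioned above can be sketched like this (the class and method names are hypothetical; it assumes the keys are known to lie in a small range):

```java
public class CountingSortSketch {
    // O(n + range) sort for keys known to lie in [0, range).
    // When the key range is small, this beats a digit-by-digit
    // radix sort outright: one pass to tally, one pass to rewrite.
    public static void sort(int[] a, int range) {
        int[] count = new int[range];
        for (int v : a) count[v]++;          // tally each key
        int idx = 0;
        for (int v = 0; v < range; v++)      // write keys back in sorted order
            while (count[v]-- > 0) a[idx++] = v;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6};
        sort(data, 10);
        System.out.println(java.util.Arrays.toString(data));
        // [1, 1, 2, 3, 4, 5, 6, 9]
    }
}
```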

That's why people don't use radix sort that much. Although there are cases where it's worthwhile, in most cases quicksort is better.

Guy
8

When n > 128, we should use radix sort.

When sorting int32s, I choose radix 256, so k = log(256, 2^32) = 4, which is significantly smaller than log(2, n).

And in my test, radix sort was 7 times faster than quicksort in the best case.

public class RadixSort {
    private static final int radix=256, shifts[]={8,16,24}, mask=radix-1;
    private final int bar[]=new int[radix];
    private int s[] = new int[65536];//reuse s instead of a second scratch array t, to improve CPU cache hit rate

    public void ensureSort(int len){
        if(s.length < len)
            s = new int[len];
    }

    public void sort(int[] a){
        int n=a.length;
        ensureSort(n);
        for(int i=0;i<radix;i++)bar[i]=0;
        for(int i=0;i<n;i++)bar[a[i]&mask]++;//bar holds the element count of each bucket
        for(int i=1;i<radix;i++)bar[i]+=bar[i-1];//bar now holds, per bucket, (largest index of its elements in the sorted result)+1
        for(int i=0;i<n;i++)s[--bar[a[i]&mask]]=a[i];//for each element, take index x=bar[slot]-1 and set s[x]=a[i] (the --bar[slot] moves the index down for the bucket's remaining elements)

        for(int i=0;i<radix;i++)bar[i]=0;
        for(int i=0;i<n;i++)bar[(s[i]>>8)&mask]++;
        for(int i=1;i<radix;i++)bar[i]+=bar[i-1];
        for(int i=n-1;i>=0;i--)a[--bar[(s[i]>>8)&mask]]=s[i];//within a bucket, elements are already sorted by the lower digits; since they are written from high indices downward, traverse s in reverse to keep that order

        for(int i=0;i<radix;i++)bar[i]=0;
        for(int i=0;i<n;i++)bar[(a[i]>>16)&mask]++;
        for(int i=1;i<radix;i++)bar[i]+=bar[i-1];
        for(int i=n-1;i>=0;i--)s[--bar[(a[i]>>16)&mask]]=a[i];//same reverse traversal as above, to preserve the order established by the lower digits

        for(int i=0;i<radix;i++)bar[i]=0;
        for(int i=0;i<n;i++)bar[(s[i]>>24)&mask]++;
        for(int i=129;i<radix;i++)bar[i]+=bar[i-1];//buckets 128..255 hold the negative numbers, which sort below the positives
        bar[0] += bar[255];
        for(int i=1;i<128;i++)bar[i]+=bar[i-1];
        for(int i=n-1;i>=0;i--)a[--bar[(s[i]>>24)&mask]]=s[i];//same reverse traversal as above, to preserve the order established by the lower digits
    }
}
zhuwenbin
7

Radix sort isn't a comparison-based sort and can only sort numeric types like integers (including pointer addresses) and floating-point, and it's a bit difficult to portably support floating-point.
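The usual trick for floating-point (valid for IEEE-754 layouts, which is precisely the portability caveat above) is to remap each float's bit pattern to an integer whose unsigned order matches the float order: flip all bits of negatives, flip just the sign bit of non-negatives. A sketch, with names of my own choosing:

```java
public class FloatKeySketch {
    // Map an IEEE-754 float to an int whose *unsigned* order matches
    // the float order: negatives get all bits inverted (so bigger
    // magnitude sorts lower), non-negatives get the sign bit flipped.
    static int floatToSortableBits(float f) {
        int bits = Float.floatToIntBits(f);
        // bits >> 31 is all-ones for negatives, zero otherwise.
        return bits ^ ((bits >> 31) | 0x80000000);
    }

    public static void main(String[] args) {
        float[] vals = {-2.5f, -1.0f, -0.0f, 0.0f, 1.0f};
        // After mapping, Integer.compareUnsigned (or byte-wise radix
        // passes) orders the original floats correctly.
        for (float f : vals)
            System.out.printf("%5.1f -> %08x%n", f, floatToSortableBits(f));
    }
}
```

The mapped keys can then be fed straight into an unsigned byte-wise radix sort; the mapping is invertible, so the floats can be recovered after sorting.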

It's probably because it has such a narrow range of applicability that many standard libraries choose to omit it. It can't even let you provide your own comparator, since some people might not want to sort integers directly so much as use the integers as indices into something else that serves as the sort key. Comparison-based sorts allow all that flexibility, so it's probably a case of preferring a generalized solution fitting 99% of people's daily needs over going out of the way to cater to that 1%.

That said, in spite of the narrow applicability, in my domain I find more use for radix sorts than introsorts or quicksorts. I'm in that 1% and barely ever work with, say, string keys, but often find use cases for numbers that benefit from being sorted. It's because my codebase revolves around indices to entities and components (entity-component system) as well as things like indexed meshes and there's a whole lot of numeric data.

As a result, radix sort becomes useful for all kinds of things in my case. One common example in my case is eliminating duplicate indices. In that case I don't really need the results to be sorted but often a radix sort can eliminate duplicates faster than the alternatives.

Another is finding, say, a median split for a kd-tree along a given dimension. There radix sorting the floating-point values of the point for a given dimension gives me a median position rapidly in linear time to split the tree node.

Another is depth-sorting higher-level primitives by z for semi-proper alpha transparency if we aren't going to be doing it in a frag shader. That also applies to GUIs and vector graphics software to z-order elements.

Another is cache-friendly sequential access using a list of indices. If the indices are traversed many times, it often improves performance if I radix sort them in advance so that the traversal is done in sequential order instead of random order. The latter could zig-zag back and forth in memory, evicting data from cache lines only to reload the same memory region repeatedly within the same loop. When I radix sort the indices first prior to accessing them repeatedly, that ceases to happen and I can reduce cache misses considerably. This is actually my most common use for radix sorts and it's the key to my ECS being cache-friendly when systems want to access entities with two or more components.

In my case I have a multithreaded radix sort which I use quite often. Some benchmarks:

--------------------------------------------
- test_mt_sort
--------------------------------------------
Sorting 1,000,000 elements 32 times...

mt_radix_sort: {0.234000 secs}
-- small result: [ 22 48 59 77 79 80 84 84 93 98 ]

std::sort: {1.778000 secs}
-- small result: [ 22 48 59 77 79 80 84 84 93 98 ]

qsort: {2.730000 secs}
-- small result: [ 22 48 59 77 79 80 84 84 93 98 ]

I can average something like 6-7 ms to sort a million numbers one time on my dinky hardware which isn't as fast as I would like since 6-7 milliseconds can still be noticed by users sometimes in interactive contexts, but still a whole lot better than 55-85 ms as with the case of C++'s std::sort or C's qsort which would definitely lead to very obvious hiccups in frame rates. I've even heard of people implementing radix sorts using SIMD, though I have no idea how they managed that. I'm not smart enough to come up with such a solution, though even my naive little radix sort does quite well compared to the standard libraries.

  • 2
    Note: Radix sort is a string sorting algorithm, not numeric. Well, okay, it's a *lexicographic* sorting algorithm. "Radix" means "base" (as in base 10 or base 8), and it can sort anything that has digits and places with a predefined order, and that includes strings as long as you choose an order for the characters (e.g. alphabetic, ASCII, unicode codepoint, whatever). You can even think of an English dictionary as a 26 bucket radix sort of English words if you want. I say it's a string sort because as far as computer representations go it's closer to treating the num like a string of digits. – LinearZoetrope Oct 15 '20 at 06:13
  • @LinearZoetrope You're right! My bad for my crudeness there. Actually now I'm curious if a radix sort of short strings in a lexicographic fashion might outperform, say, introsort. I really find radix sorts indispensable but can see why many standard libs might omit it given the requirements for more than just a comparator. –  Oct 22 '20 at 08:27
4

k = "length of the longest value in Array to be sorted"

n = "length of the array"

O(k*n) = "worst-case running time"

k * n = n^2 (if k = n)

so when using radix sort, make sure "the longest integer is shorter than the array size" or vice versa. Then you're going to beat quicksort!

The drawback is: most of the time you cannot be sure how big the integers will get, but if you have a fixed range of numbers, radix sort should be the way to go.

kiltek
2

Here's a link which compares quicksort and radixsort:

Is radix sort faster than quicksort for integer arrays? (yes it is, 2-3x)

Here's another link which analyzes running times of several algorithms:

A Question of Sorts:

Which is faster on the same data; an O(n) sort or an O(nLog(n)) sort?

Answer: It depends. It depends on the amount of data being sorted. It depends on the hardware it's being run on, and it depends on the implementation of the algorithms.

Ivan Š
0

One example would be when you are sorting a very large set or array of integers. A radix sort, like other distribution sorts, is extremely fast, since data elements are mainly being enqueued into an array of queues (at most 10 queues for an LSD radix sort on decimal digits) and remapped to different index locations of the same input data. There are no nested comparison loops, so the algorithm tends to behave more linearly as the number of input integers grows.

Unlike comparison-based methods, such as the extremely inefficient bubble sort, radix sort performs no comparison operations; it is just a process of remapping integers to different index positions until the input is finally sorted.

If you would like to test out an LSD radix sort yourself, I have written one and stored it on GitHub; it can easily be tested in an online JS IDE such as Eloquent JavaScript's coding sandbox. Feel free to play around with it and watch how it behaves with differing values of n. I've tested with up to 900,000 unsorted integers with a runtime < 300 ms. Here is the link if you wish to play around with it.

https://gist.github.com/StBean/4af58d09021899f14dfa585df6c86df6
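The queue-based LSD approach described above can be sketched in Java as follows (this is my own illustration of the idea, not the linked JavaScript gist; it assumes non-negative integers):

```java
import java.util.ArrayDeque;
import java.util.Arrays;

public class QueueRadixSketch {
    // LSD radix sort with ten queues, one per decimal digit: on each pass,
    // enqueue every value by its current digit, then concatenate the
    // queues back into the array in digit order.
    public static void sort(int[] a) {
        @SuppressWarnings("unchecked")
        ArrayDeque<Integer>[] buckets = new ArrayDeque[10];
        for (int d = 0; d < 10; d++) buckets[d] = new ArrayDeque<>();
        int max = Arrays.stream(a).max().orElse(0);
        for (long div = 1; div <= max; div *= 10) {   // one pass per digit
            for (int v : a) buckets[(int) (v / div % 10)].addLast(v);
            int idx = 0;
            for (ArrayDeque<Integer> q : buckets)     // drain queues in order
                while (!q.isEmpty()) a[idx++] = q.pollFirst();
        }
    }

    public static void main(String[] args) {
        int[] data = {329, 457, 657, 839, 436, 720, 355};
        sort(data);
        System.out.println(Arrays.toString(data));
        // [329, 355, 436, 457, 657, 720, 839]
    }
}
```

FIFO queues keep each pass stable, which is what lets later (more significant) digit passes preserve the order established by earlier ones.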

0

In 32-bit integer sorting it will beat quicksort 7-10 times over, but on 1B elements it will take noticeable memory, like a few GB. So use radix sort or counting sort first only if your data is large but the original values in it are small, or in any huge integer-list sort where you can trade memory for speed.

-12

Quicksort has an average of O(N log N), but it also has a worst case of O(N^2), so even though in most practical cases it won't get to N^2, there is always the risk that the input will be in "bad order" for you. This risk doesn't exist in radix sort. I think this gives a great advantage to radix sort.

Guy Nir
  • 5
    It's unlikely to be the main advantage. Other comparison-based sorts (like heapsort or mergesort) do not have such a bad worst-case behavior as quicksort's. – Eldritch Conundrum Dec 19 '13 at 19:13
  • 3
    the worst case scenario for quicksort isn't really an argument since that's why people commonly use randomized quicksort, i.e. shuffle the input data before actually sorting it. this practically eliminates the chance of having an N^2 running time. – nburk Apr 09 '15 at 06:28
  • Introsort, which uses quicksort, takes care of this. This isn't an argument. – user541686 Apr 09 '15 at 06:40