21

The book "Introduction to Algorithms" describes the LSD (Least Significant Digit) version of radix sort. However, as others have pointed out here on Stack Overflow, an MSD (Most Significant Digit) version also exists. I want to know the pros and cons of each. My guess is that the LSD version has some benefits over the MSD one, but I am not sure, hence the question.

unkulunkulu
Geek
  • 1
    This is an invalid question, because both variants exist, but they have slightly different properties. – unkulunkulu Aug 13 '12 at 17:59
  • 2
    Ok, but this doesn't change the question. You should put it along the lines of 'what would be the difference between MSD and LSD, pros and cons' etc – unkulunkulu Aug 13 '12 at 18:03
  • 1
    Well, now I believe the question is a good one. – unkulunkulu Aug 13 '12 at 18:21
  • Pros/cons would probably be highly dependent on your problem domain and expected use. For example, radix sorting a list of integers between 1000 and 3000 would probably be better done with the LSD version, since the LSD has a larger set of possible values, allowing the problem to be broken down into more sub-problems of smaller average size than an MSD approach. – twalberg Aug 13 '12 at 18:23
  • @twalberg as I understand Radix sorts, each pass is O(n) so the size of each sub-problem wouldn't matter. Can you expand on that? – Mark Ransom Jan 03 '16 at 03:04
  • @MarkRansom If each pass is truly O(n), then simply from a time-complexity standpoint, you are correct. However, IIRC, given the 3.5 years since I wrote that comment, I was envisioning large data sets that would incur paging/swapping performance overhead, in which case it would be better to divide a problem into 10 O(N/10) sub-problems, thus possibly eliminating the extra cost for not fitting in main memory, instead of 2 O(N/2) sub-problems that may still have capacity issues. Similar comments would apply to the L1/L2/L3 cache performance of smaller sub-problems... – twalberg Jan 03 '16 at 03:56

3 Answers

10

The following, taken from the very bottom of http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_sorting.aspx, might be useful:

The biggest problem with LSD radix sort is that it starts at the digits that make the least difference. If we could start with the most significant digits, the first pass would go a long way toward sorting the entire range, and each pass after that would simply handle the details. The idea of MSD radix sorting is to divide all of the digits with an equal value into their own bucket, then do the same thing with all of the buckets until the array is sorted. Naturally, this suggests a recursive algorithm, but it also means that we can now sort variable length items and we don't have to touch all of the digits to get a sorted array. This makes MSD radix sort considerably faster and more useful.
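The recursive bucketing described in the quote can be sketched roughly as follows. This is a minimal illustration, not the tutorial's code: the function name `msd_radix_sort`, the dictionary of buckets, and the use of `sorted` on the (few) bucket keys are all assumptions made for brevity. It handles variable-length strings and stops recursing as soon as a bucket holds a single item, so not every character gets examined.

```python
def msd_radix_sort(items, pos=0):
    """Sort strings by bucketing on the character at index `pos`, most significant first."""
    if len(items) <= 1:
        return list(items)      # nothing left to distinguish; remaining characters are never read
    finished = []               # strings that have no character at `pos`; they sort first
    buckets = {}                # character -> strings that share it at `pos`
    for s in items:
        if len(s) <= pos:
            finished.append(s)
        else:
            buckets.setdefault(s[pos], []).append(s)
    result = finished
    # Assumption: ordering the handful of distinct bucket keys with sorted() is acceptable here;
    # a fixed array of R buckets would avoid even that comparison step.
    for ch in sorted(buckets):
        result.extend(msd_radix_sort(buckets[ch], pos + 1))
    return result
```

For example, `msd_radix_sort(["she", "sells", "sea", "shells", "by", "the", "sea", "shore"])` returns `['by', 'sea', 'sea', 'sells', 'she', 'shells', 'shore', 'the']`; the strings "by" and "the" are placed after looking at only their first character.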

Roman Dzhabarov
5

According to the book Algorithms, LSD and MSD are both string-array sorting algorithms based on so-called key-indexed counting rather than on comparisons. Their running times therefore differ from those of traditional comparison sorts such as quicksort or merge sort.

As Dzhabarov mentioned, the biggest difference between LSD and MSD is that MSD considers the most significant digit or character first, so it can often sort strings without examining every digit or character in them. This is an advantage. However, a recursive implementation of MSD uses more space than LSD.

The table below summarizes some of the differences among quicksort, LSD and MSD.

algorithm    running time              extra space
quicksort    N(lgN)^2                  1
LSD          NW                        N
MSD          between N and NW          N + WR

where N is the length of the array, W is the length of the strings, and R is the size of the radix (the alphabet).
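To make the NW running time and N extra space of the LSD row concrete, here is a rough sketch of LSD sorting with key-indexed counting. It is an illustration under stated assumptions rather than the book's code: the name `lsd_radix_sort` is made up, all strings are assumed to have the same length `width`, and every character code is assumed to be below `radix` (256 by default).

```python
def lsd_radix_sort(strings, width, radix=256):
    """Sort equal-length strings with one stable counting pass per character, right to left."""
    a = list(strings)
    aux = [None] * len(a)                   # the N extra space from the table
    for d in range(width - 1, -1, -1):      # W passes in total, each touching all N strings
        count = [0] * (radix + 1)
        for s in a:                         # count how often each character occurs at position d
            count[ord(s[d]) + 1] += 1
        for r in range(radix):              # turn the counts into starting indices
            count[r + 1] += count[r]
        for s in a:                         # distribute in order, which keeps the pass stable
            aux[count[ord(s[d])]] = s
            count[ord(s[d])] += 1
        a, aux = aux, a                     # the output of this pass is the input of the next
    return a
```

For example, `lsd_radix_sort(["dab", "cab", "add", "bad"], 3)` returns `['add', 'bad', 'cab', 'dab']`. Every pass reads one character of every string, which is where the N times W total comes from, regardless of how quickly the strings could have been told apart.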

PS: as mentioned in the book, the Java system sort uses a general sorting algorithm with fast string comparison and not LSD or MSD.

Community
Spectral
  • Quicksort uses lg(n) extra space for the recursion stack (or equivalent). MSD Radix Sort can use just the recursion stack (or equivalent) extra space as it doesn't need to be stable for correctness, so it can use less space than LSD Radix Sort. – ReneSac Jul 22 '18 at 23:15
4

LSD is faster than MSD when the keys have a fixed length. MSD is too slow for small files, because it needs a huge number of recursive calls on them.
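One common way to reduce that overhead is to stop recursing once a bucket is small and finish it with insertion sort. The sketch below is only an illustration of that idea: the names `msd_sort_with_cutoff` and `insertion_sort` are hypothetical, and the `CUTOFF` value of 15 is an assumed threshold that would need tuning on real data.

```python
CUTOFF = 15  # assumed threshold; buckets at or below this size skip the radix machinery

def insertion_sort(items, pos):
    """Sort a small list in place, comparing strings from index `pos` onward."""
    for i in range(1, len(items)):
        j = i
        while j > 0 and items[j][pos:] < items[j - 1][pos:]:
            items[j], items[j - 1] = items[j - 1], items[j]
            j -= 1
    return items

def msd_sort_with_cutoff(items, pos=0):
    """MSD radix sort that hands small buckets to insertion sort instead of recursing."""
    if len(items) <= CUTOFF:
        return insertion_sort(list(items), pos)
    finished, buckets = [], {}
    for s in items:
        if len(s) <= pos:
            finished.append(s)              # string has no character at `pos`; it sorts first
        else:
            buckets.setdefault(s[pos], []).append(s)
    result = finished
    for ch in sorted(buckets):              # visit buckets in character order
        result.extend(msd_sort_with_cutoff(buckets[ch], pos + 1))
    return result
```

With the cutoff in place, a small input never reaches the bucketing code at all, and a large input only recurses while its buckets are still big enough to justify the bookkeeping.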

M.J.Watson