
When radix sort uses a stable sort for each digit (counting sort, specifically), its best-case and worst-case running times are usually both given as Θ(d(n+k)), where d is the number of digits in each number to be sorted and k is the number of values each digit can take (usually 10, for the digits 0 to 9).

Despite my research, I still haven't found a good explanation of what the "best" and "worst" cases for radix sort actually are. Can somebody please explain what constitutes a "best" case and a "worst" case in the context of radix sort? If so, can you then prove that both of them are in Θ(d(n+k))?

jippyjoe4
  • The number of read and write operations is typically the same for "best" case and "worst" case. The timing can differ due to data pattern. Random data will result in random writes, which isn't cache friendly, while radix sort on already sorted data will do sequential writes, which is cache friendly. – rcgldr Mar 17 '18 at 03:44
  • @rcgldr That's what I thought; so I'm not really sure what the difference between a best and a worst case is. Although you mention caches, I think I'm supposed to be analyzing the algorithm on a more theoretical level, specifically how the original input list can affect the running time of counting sort. But I'm not sure. – jippyjoe4 Mar 17 '18 at 04:12
  • There seems to be a conflict with the problem statement. If the best and worst case time have the same Theta, and issues like cache are to be ignored, then it would seem they aren't really the best and worst case. The problem statement doesn't clarify how best case and worst case are defined. – rcgldr Mar 17 '18 at 07:31
  • I think we're seeing the same problem here; I don't get it either. But that's what it's asking. – jippyjoe4 Mar 17 '18 at 07:38
  • All cases are both best cases and worst cases. Just like `min(5,5,5,5) == max(5,5,5,5) == 5`. – user202729 Mar 19 '18 at 14:45
  • I can see that if all the elements in the input have the same number of digits, for instance [99,12,14,15], it will take O(n) time. But I'm not sure whether inputs like [1, 1000,12,13,777,1000000] would produce the worst case. In the latter case the array gets sorted in 7 passes, one per digit of the largest value, so I'm not completely sure whether it's a worst case. Can someone please confirm? If it's not, please let me know what kind of input would result in a worst case. – srinivas Jul 07 '19 at 15:51

1 Answer


Radix sort sorts the numbers one digit at a time, starting from the last digit and moving to the front (ones digit, then tens, then hundreds, etc.), so it performs d sorting passes, where d is the number of digits. Each pass is a bucket sort on the current digit: every possible digit value (typically 0-9) has its own 'bucket', and each number is put into the bucket matching the value of its current digit (5 with 5, 8 with 8, etc.). Although one pass is commonly stated to be Θ(n), it is in fact Θ(n+k), where n is the number of elements and k is the number of buckets, which is the number of values a digit can take (0-9 gives 10 buckets).

The key requirement of bucket sort is that mapping an element to its bucket takes Θ(1) time, which makes mapping all n elements Θ(n). From there, the remaining cost comes from walking through every bucket in order (k buckets) and pulling out the elements within them (n elements in total). Together, this makes one bucket sort pass Θ(n+k).
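One such pass might be sketched in Python as follows. This is only an illustration, not code from the question: the function name `bucket_pass` and the parameters `exp` (1 for the ones digit, 10 for the tens digit, and so on) and `base` (the number of buckets, k) are assumptions made for this example, and the inputs are assumed to be non-negative integers.

```python
def bucket_pass(values, exp, base=10):
    """One stable bucket pass on the digit selected by exp (1, 10, 100, ...).

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    buckets = [[] for _ in range(base)]       # create k empty buckets
    for v in values:                          # n placements, each constant time
        buckets[(v // exp) % base].append(v)
    out = []
    for bucket in buckets:                    # visit all k buckets in order,
        out.extend(bucket)                    # moving n elements in total
    return out


print(bucket_pass([170, 45, 75, 90, 802, 24, 2, 66], exp=1))
# prints [170, 90, 802, 2, 24, 45, 75, 66]
```

The first loop runs n times and the second visits k buckets while copying n elements, regardless of what the values are. Appending to the end of each bucket keeps equal digits in their original relative order, which is the stability that radix sort relies on.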

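Overall, d bucket sorts are performed, each doing Θ(n+k) work, which makes the overall complexity Θ(d(n+k)). None of these steps depends on the values or the initial order of the input, so the same bound holds for every input of n numbers with d digits: the best case and the worst case coincide.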
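As a rough check of this count, the sketch below runs an LSD radix sort on the same numbers in sorted, reversed, and mixed order and counts the basic steps. The `steps` counter is a hypothetical instrument added for this example (it counts one step per element placed and one per bucket visited), and taking d from the largest value assumes non-negative integers; neither comes from the question.

```python
def radix_sort(values, base=10):
    """LSD radix sort for non-negative integers.

    Returns (sorted_list, steps), where steps is an illustrative count of
    element placements plus bucket visits. Finding the maximum is not counted.
    """
    values = list(values)
    steps = 0
    if not values:
        return values, steps
    largest = max(values)
    exp = 1
    while largest // exp > 0:                 # one pass per digit: d passes
        buckets = [[] for _ in range(base)]
        for v in values:
            buckets[(v // exp) % base].append(v)
            steps += 1                        # n placements per pass
        values = []
        for bucket in buckets:
            values.extend(bucket)
            steps += 1                        # k bucket visits per pass
        exp *= base
    return values, steps


data = [329, 457, 657, 839, 436, 720, 355]
for arrangement in (sorted(data), sorted(data, reverse=True), data):
    result, steps = radix_sort(arrangement)
    print(result, steps)
# each line prints [329, 355, 436, 457, 657, 720, 839] 51
```

With n = 7, k = 10, and d = 3, every arrangement takes 3 × (7 + 10) = 51 counted steps, which is consistent with the best and worst cases having the same cost in this model.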