6

I am reading Introduction to Algorithms, 2nd edition, and one exercise says that we can sort n integers in the range 0 to n^3 - 1 in linear time. I am thinking of IBM's radix sort approach: start with the least significant digit, distribute the numbers by that digit and sort, then do the same for the next least significant digit, and so on. Each pass takes O(n) time. But I have a doubt: if one of the numbers has n digits, then the algorithm takes O(1*n + 2*n + ... + n*n) = O(n^2) time, right? Can we guarantee that the numbers have fewer than n digits, or can anybody give another hint for the question? Thanks.

yrazlik

2 Answers

3

Radix Sort complexity is O(dn) with d as the number of digits in a number.

The algorithm runs in linear time only when d is constant! In your case d = 3log(n) and your algorithm will run in O(nlog(n)).

I'm honestly not sure how to solve this problem in linear time. I'm wondering if there is any other piece of information missing about the nature of the numbers...

Marsellus Wallace
  • thanks for the answer, i found the answer actually, we treat the numbers as 2-digit numbers in radix n, then we sort these 2-digit numbers. Total time is O(n) – yrazlik Mar 10 '13 at 19:19
  • If you break a number in 2-digit 'keys', don't you end up with 'd = 3log(n) / 2'? do you have a link to a detailed analysis of the algorithm? thanks – Marsellus Wallace Mar 10 '13 at 19:28
  • Actually, I found it in the instructor's manual of the book, and here is exactly what it says: Treat the numbers as 2-digit numbers in radix n. Each digit ranges from 0 to n −1. Sort these 2-digit numbers with radix sort. There are 2 calls to counting sort, each taking Θ(n + n) = Θ(n) time, so that the total time is Θ(n). – yrazlik Mar 10 '13 at 20:10
  • @bigO: 1. `n**3-1` requires 3 digits in base `n`. 2. You can't expect that manipulating a digit in base `n` is O(1) (constant time) for arbitrary large `n`. – jfs Mar 10 '13 at 21:04
  • I would downvote this (if I could!). You are (in)conveniently mixing two models of computation... If you think input size is n, then you are implicitly assuming word ram model, in which n fits in O(1) words, but then you switch to bit complexity... – Knoothe Mar 11 '13 at 03:08
  • @J.F.Sebastian: What is the model of computation you are using? In the common Word ram model, it is reasonable to assume n fits in O(1) words (otherwise your input size itself is not O(n)) – Knoothe Mar 11 '13 at 03:13
  • @Knoothe: You are right if `n` fits in O(1) words then manipulating a digit in base `n` is O(1) and for `n**3` limit the problem can be solved in O(n) time. – jfs Mar 11 '13 at 15:25
  • @Knoothe congrats on your brand new downvoting privilege! :) What do you mean with "switching to bit complexity"? – Marsellus Wallace Mar 14 '13 at 19:02
  • @Gevorg: I didn't downvote this answer, if that is what you mean :-) Let me ask you this. What is the size of the input? Is Theta(n) or Theta(dn)? What does linear time mean? Theta(n) or Theta(dn)? If you say input size is Theta(n), aren't you ignoring d? – Knoothe Mar 14 '13 at 20:24
  • One does not simply answer 1 question with 5 more! :) RadixSort running time is a function of `n` (elements) and `d` (keys). The parameters `d` and `n` are usually independent with `d` constant and hopefully small. Likewise, graph algorithms complexity for instance depends on `G` and `V`. In this case, `d` is a function of `n` and thus we have a running time of `O(nlog(n))`, what's wrong with this? – Marsellus Wallace Mar 16 '13 at 19:34
  • @Gevorg: Let me try to clarify: You want to take into consideration, the number of bits in the number when computing the radix sort runtime. That is fine, but why do you ignore that when counting the size of the input? The size of the input is Theta(n log n), isn't it (you have n integers of size log n each)? In which case, by the very definition of time complexity, a Theta(n log n) algorithm is _linear_!, because, if S = nlog n is the size of input, then a Theta(S) algorithm is linear. – Knoothe Mar 17 '13 at 06:28
  • @Knoothe nope, I never mentioned bits and I just consider the number of keys in an element. In our case the keys are digits! EG: if number is 834, digits/keys are 8,3,4, and d = 3. Considering a date object as the element for instance, keys would be month,day,year and d=3 as well. RadixSort running time depends on keys and number of elements `O(dn)` (it sorts n elements d times). With `d=3log(n)` (in base 10 given that the max number is `n^3 - 1`) then the running time is `O(nlog(n))` (Big-O and not Theta). – Marsellus Wallace Mar 17 '13 at 19:23
  • @Gevorg: You are still not getting what I am trying to say. If you consider the number of digits as _important_, then you should consistently consider it throughout your run time analysis: i.e. while counting the input size too. I am claiming that, to be consistent (and correct), you must take the input size to be Theta(dn) (or Theta(n log n)). And if that is so, by definition, radix sort _is_ linear. If you are taking input size to be Theta(n), then you are implicitly assuming the WORD RAM model (integers fit in a word), in which case, your d is essentially constant! – Knoothe Mar 21 '13 at 01:16
  • (Note: I am talking about the current problem, where every integer is in the range n^3 -1, by writing in base-n, the number of digits is constant) – Knoothe Mar 21 '13 at 01:41
  • I refer you to: http://stackoverflow.com/questions/4238460/an-array-of-length-n-can-contain-values-1-2-3-n2-is-it-possible-to-sort-in. btw, sorry for so many comments. I am not able to express myself properly (and this commenting system is kind of painful). :-) – Knoothe Mar 21 '13 at 01:44
  • @Knoothe I'm way too late with this my last comment, my bad. Let's just agree to disagree! :) As a reference to my point of view (considering the number of digits as important), please check the "Sorting in Linear Time" chapter of the "Introduction to Algorithms" book - CLR. Now let's go argue on some other questions! :) – Marsellus Wallace May 15 '13 at 04:42
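The digit-count point debated in the comments above can be checked with a small sketch. The helper `num_digits` and the sample value `n = 1000` are made up for illustration and do not come from the book or the thread; the sketch only counts digits and says nothing about which model of computation applies.

```python
def num_digits(value, base):
    """Number of digits of a non-negative integer in the given base (base >= 2)."""
    digits = 1
    while value >= base:
        value //= base
        digits += 1
    return digits


n = 1000                  # illustrative choice of n
largest = n**3 - 1        # largest key allowed by the problem: 999,999,999

print(num_digits(largest, 10))  # 9 digits in base 10, growing like 3*log10(n)
print(num_digits(largest, n))   # 3 digits in base n
```

In base 10 the digit count d grows with log(n), which is where the O(n log(n)) estimate comes from. In base n the largest key n^3 - 1 always has exactly 3 digits, so d is a constant as long as one digit operation is assumed to cost O(1).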
2

Assuming a word RAM model, and that n fits in O(1) words, there is a linear time algorithm.

Write every number in base n, and do a radix sort (with a stable version of counting sort as the underlying digit sort).
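A minimal sketch of this idea in Python, assuming the input is a list of n integers in the range 0 to n^3 - 1 and that arithmetic on such integers costs O(1). The function names `counting_sort_by_digit` and `radix_sort_cubic_range` are made up for illustration; three passes are used because n^3 - 1 has three digits in base n.

```python
def counting_sort_by_digit(values, base, exp):
    """Stable counting sort of values by the digit (v // exp) % base."""
    counts = [0] * base
    for v in values:
        counts[(v // exp) % base] += 1
    # Prefix sums: counts[d] becomes the first output index for digit d.
    total = 0
    for d in range(base):
        counts[d], total = total, total + counts[d]
    out = [0] * len(values)
    for v in values:  # left-to-right scan keeps equal digits in input order
        d = (v // exp) % base
        out[counts[d]] = v
        counts[d] += 1
    return out


def radix_sort_cubic_range(values):
    """Sort n integers in [0, n**3 - 1] with three base-n counting-sort passes."""
    n = len(values)
    if n < 2:
        return list(values)
    exp = 1
    for _ in range(3):  # least significant base-n digit first
        values = counting_sort_by_digit(values, n, exp)
        exp *= n
    return values


print(radix_sort_cubic_range([63, 0, 17, 42]))  # n = 4, keys below 4**3 = 64
```

Each pass does Θ(n + n) work (n keys, n buckets), so three passes stay at Θ(n) under the stated assumption.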

If you want to assume unbounded n, then the size of the input is actually n log n, in which case radix sort again works (in O(n log n) time), and technically speaking, it is still a linear time algorithm! (Of course, I suppose this still assumes arithmetic is O(1)...)

Knoothe
  • Is `linearithmic` actually `linear`? – Marsellus Wallace Mar 16 '13 at 19:37
  • Just to follow up since we've been talking for a while now! :) I'm not sure how you got to `O(nlog(n))` but if you accept this as a linear solution then the problem would be trivially solved with a good implementation of QuickSort, MergeSort, or others... – Marsellus Wallace Mar 17 '13 at 19:27
  • @Gevorg: No, comparing integers would be O(log n), and Quicksort etc will be Theta(n (log n)^2). The underlying computational model is important. Most algorithm textbooks use the WORD RAM model (and in practice most people use that without thinking, as it is the closest to a real computer). In which case (i.e in the WORD RAM model), the input size is Theta(n), radix sort is O(n) and quicksort etc are O(n log n). – Knoothe Mar 21 '13 at 01:30
  • (again: I am talking about the current problem, where the integers are < n^3). – Knoothe Mar 21 '13 at 01:41