120

Haskell's website introduces a very attractive 5-line quicksort function, as seen below.

quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser = filter (< p) xs
        greater = filter (>= p) xs

They also include a "True quicksort in C".

// To sort array a[] of size n: qsort(a,0,n-1)

void qsort(int a[], int lo, int hi) 
{
  int h, l, p, t;

  if (lo < hi) {
    l = lo;
    h = hi;
    p = a[hi];

    do {
      while ((l < h) && (a[l] <= p)) 
          l = l+1;
      while ((h > l) && (a[h] >= p))
          h = h-1;
      if (l < h) {
          t = a[l];
          a[l] = a[h];
          a[h] = t;
      }
    } while (l < h);

    a[hi] = a[l];
    a[l] = p;

    qsort( a, lo, l-1 );
    qsort( a, l+1, hi );
  }
}

A link below the C version directs to a page that states 'The quicksort quoted in Introduction isn't the "real" quicksort and doesn't scale for longer lists like the c code does.'

Why is the above Haskell function not a true quicksort? How does it fail to scale for longer lists?

Rakete1111
  • 42,521
  • 11
  • 108
  • 141
rybosome
  • 4,974
  • 6
  • 40
  • 64
  • You should add a link to the exact page you're talking about. – Staven Oct 10 '11 at 19:35
  • 14
    It's not in-place, thus quite slow? Good question actually! – fuz Oct 10 '11 at 19:36
  • 4
    @FUZxxl: Haskell lists are immutable so no operation will be in-place whilst using the default datatypes. As to its speed - it will not necessarily be slower; GHC is an impressive piece of compiler technology and very often Haskell solutions using immutable data structures are up to speed with other mutable ones in other languages. – Callum Rogers Oct 10 '11 at 19:48
  • 1
    Is it actually not qsort? Remember that qsort has `O(N^2)` runtime. – Thomas Eding Oct 10 '11 at 19:59
  • The main "problem" is the `(++)` operator, but I don't think that makes it not qsort. It just makes it a non-optimal qsort. – Thomas Eding Oct 10 '11 at 20:02
  • 2
    It should be noted that the above example is an introductory example of Haskell, and that quicksort is a very bad choice for sorting lists. The sort in Data.List was changed to mergesort back in 2002: http://hackage.haskell.org/packages/archive/base/3.0.3.1/doc/html/src/Data-List.html#sort, there you can also see the previous quick sort implementation. The current implementation is a mergesort that was made in 2009: http://hackage.haskell.org/packages/archive/base/4.4.0.0/doc/html/src/Data-List.html#sort . – HaskellElephant Oct 11 '11 at 08:32
  • @HaskellElephant the current sort in Data.List is actually a cross between mergesort and timsort. – Jeremy List Dec 26 '14 at 05:39
  • Apart from not being in-place, the ++ operator in Haskell is inefficient – lakshayg Apr 18 '15 at 03:11
  • So, mutability... But I think it's kind of silly to say this is not "true" Quicksort. Anyway, I'm still kind of puzzled why this would not "scale" for large lists. My wild guess would be there's just a linear loss in speed and an extra O(log(n)) memory usage. Is that completely incorrect? – dividebyzero Apr 29 '17 at 18:34

12 Answers

78

The true quicksort has two beautiful aspects:

  1. Divide and conquer: break the problem into two smaller problems.
  2. Partition the elements in-place.

The short Haskell example demonstrates (1), but not (2). How (2) is done may not be obvious if you don't already know the technique!
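For readers who don't already know the technique, here is a sketch of (2) in Haskell (mine, not part of the original answer), using a mutable unboxed array in the ST monad with Lomuto-style partitioning; the names `sortInPlace` and `partition` are made up for illustration:

```haskell
import Control.Monad (when)
import Data.Array.ST (newListArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (elems)

-- In-place quicksort over a temporary mutable array, frozen back
-- to an immutable result at the end. Lomuto scheme: pivot = a[hi].
sortInPlace :: [Int] -> [Int]
sortInPlace xs = elems $ runSTUArray $ do
    arr <- newListArray (0, length xs - 1) xs
    qsort arr 0 (length xs - 1)
    return arr
  where
    qsort arr lo hi = when (lo < hi) $ do
        p <- partition arr lo hi
        qsort arr lo (p - 1)
        qsort arr (p + 1) hi
    -- Swap elements <= pivot toward the front; the pivot then lands
    -- at its final position i, which is returned.
    partition arr lo hi = do
        pivot <- readArray arr hi
        let loop i j
              | j >= hi   = swap arr i hi >> return i
              | otherwise = do
                  aj <- readArray arr j
                  if aj <= pivot
                      then swap arr i j >> loop (i + 1) (j + 1)
                      else loop i (j + 1)
        loop lo lo
    swap arr i j = do
        vi <- readArray arr i
        vj <- readArray arr j
        writeArray arr i vj
        writeArray arr j vi
```

The partition step touches each element once and allocates nothing beyond the single array, which is the whole point of (2).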

Matthias Braun
  • 24,493
  • 16
  • 114
  • 144
pat
  • 12,213
  • 1
  • 21
  • 49
  • 19
    http://www.informit.com/articles/article.aspx?p=1407357&seqNum=3 -- Andrey Alexandrescu – The_Ghost Oct 11 '11 at 17:32
  • For a clear description of the partitioning-in-place process see http://interactivepython.org/courselib/static/pythonds/SortSearch/TheQuickSort.html. – pvillela Aug 06 '17 at 00:48
57

True in-place quicksort in Haskell:

import qualified Data.Vector.Generic as V 
import qualified Data.Vector.Generic.Mutable as M 

qsort :: (V.Vector v a, Ord a) => v a -> v a
qsort = V.modify go where
    go xs | M.length xs < 2 = return ()
          | otherwise = do
            p <- M.read xs (M.length xs `div` 2)
            j <- M.unstablePartition (< p) xs
            let (l, pr) = M.splitAt j xs 
            k <- M.unstablePartition (== p) pr
            go l; go $ M.drop k pr
Alex Lockwood
  • 81,274
  • 37
  • 197
  • 245
klapaucius
  • 1,006
  • 6
  • 4
  • The source for [unstablePartition](http://hackage.haskell.org/packages/archive/vector/latest/doc/html/src/Data-Vector-Generic-Mutable.html#unstablePartition) reveals that it is indeed the same in-place swapping technique (as far as I can tell). – Dan Burton Oct 20 '11 at 17:38
  • 3
    This solution is incorrect. `unstablePartition` is very similar to `partition` for `quicksort`, but it doesn't guarantee the element at `m`th position is just `p`. – nymk Mar 27 '13 at 12:34
31

Here is a transliteration of the "true" quicksort C code into Haskell. Brace yourself.

import Control.Monad
import Data.Array.IO
import Data.IORef

qsort :: IOUArray Int Int -> Int -> Int -> IO ()
qsort a lo hi = do
  (h,l,p,t) <- liftM4 (,,,) z z z z

  when (lo < hi) $ do
    l .= lo
    h .= hi
    p .=. (a!hi)

    doWhile (get l .< get h) $ do
      while ((get l .< get h) .&& ((a.!l) .<= get p)) $ do
        modifyIORef l succ
      while ((get h .> get l) .&& ((a.!h) .>= get p)) $ do
        modifyIORef h pred
      b <- get l .< get h
      when b $ do
        t .=. (a.!l)
        lVal <- get l
        hVal <- get h
        writeArray a lVal =<< a!hVal
        writeArray a hVal =<< get t

    lVal <- get l
    writeArray a hi =<< a!lVal
    writeArray a lVal =<< get p

    hi' <- fmap pred (get l)
    qsort a lo hi'
    lo' <- fmap succ (get l)
    qsort a lo' hi

That was fun, wasn't it? I actually cut a large let block from the beginning, as well as a where clause from the end of the function; they define all of the helpers that make the preceding code look somewhat pretty.

  let z :: IO (IORef Int)
      z = newIORef 0
      (.=) = writeIORef
      ref .=. action = do v <- action; ref .= v
      (!) = readArray
      (.!) a ref = readArray a =<< get ref
      get = readIORef
      (.<) = liftM2 (<)
      (.>) = liftM2 (>)
      (.<=) = liftM2 (<=)
      (.>=) = liftM2 (>=)
      (.&&) = liftM2 (&&)
  -- ...
  where doWhile cond foo = do
          foo
          b <- cond
          when b $ doWhile cond foo
        while cond foo = do
          b <- cond
          when b $ foo >> while cond foo

And here, a dumb test to see if it works.

main = do
    a <- (newListArray (0,9) [10,9..1]) :: IO (IOUArray Int Int)
    printArr a
    putStrLn "Sorting..."
    qsort a 0 9
    putStrLn "Sorted."
    printArr a
  where printArr a = mapM_ (\x -> print =<< readArray a x) [0..9]

I don't write imperative code very often in Haskell, so I'm sure there are plenty of ways to clean this code up.

So what?

You will notice that the above code is very, very long. The heart of it is about as long as the C code, though each line is often a bit more verbose. This is because C secretly does a lot of nasty things that you might take for granted. For example, a[l] = a[h];. This accesses the mutable variables l and h, and then accesses the mutable array a, and then mutates the mutable array a. Holy mutation, batman! In Haskell, mutation and accessing mutable variables is explicit. The "fake" qsort is attractive for various reasons, but chief among them is that it does not use mutation; this self-imposed restriction makes it much easier to understand at a glance.

Dan Burton
  • 51,332
  • 25
  • 109
  • 190
  • 3
    That's awesome, in a kind of queasy-making way. I wonder what sort of code GHC produces from something like that? – Ian Ross Oct 20 '11 at 14:02
  • @IanRoss: From the impure quicksort? GHC actually produces pretty decent code. – J D May 26 '16 at 22:11
  • "The "fake" qsort is attractive for various reasons..." I'm afraid its performance without in-place manipulation (as already noted) would be awful. And always taking the 1st element as pivot does not help either. – dbaltor Jul 08 '19 at 11:52
25

In my opinion, saying that it's "not a true quicksort" overstates the case. I think it's a valid implementation of the Quicksort algorithm, just not a particularly efficient one.

Keith Thompson
  • 230,326
  • 38
  • 368
  • 578
  • 9
    I had this argument with someone once: I looked up the actual paper which specified QuickSort, and is indeed in-place. – ivanm Oct 10 '11 at 23:30
  • 2
    @ivanm hyperlinks or it didn't happen :) – Dan Burton Oct 11 '11 at 00:07
  • 2
    I love how this paper is all imperative and even includes the trick to guarantee logarithmic space use (that many people don't know about) while the (now popular) recursive version in ALGOL is just a footnote. Guess I'll have to look for that other paper now... :) – hugomg Oct 11 '11 at 01:15
  • 7
    A "valid" implementation of any algorithm should have the same asymptotic bounds, don't you think? The bastardised Haskell quicksort doesn't preserve any of the memory complexity of the original algorithm. Not even close. That's why it is over 1,000x slower than Sedgewick's genuine Quicksort in C. – J D May 27 '16 at 17:34
  • The Haskell implementation does have the same asymptotic bounds, the asymptotic bound for quicksort is O(n^2), and this is true of both the C and the simple Haskell implementation. The problem is, quicksort is so named because it has a faster wall time than most algorithms in practice, in spite of its poor asymptotic bounds. But, the Haskell implementation doesn't, hence even if it's algorithmically correct, it doesn't achieve the purpose of the algorithm - it's not useful. – James Roper Feb 25 '21 at 01:13
17

Thanks to lazy evaluation, a Haskell program doesn't (almost can't) do what it looks like it does.

Consider this program:

main = putStrLn (show (quicksort [8, 6, 7, 5, 3, 0, 9]))

In an eager language, first quicksort would run, then show, then putStrLn. A function's arguments are computed before that function starts running.

In Haskell, it's the opposite. The function starts running first. The arguments are only computed when the function actually uses them. And a compound argument, like a list, is computed one piece at a time, as each piece of it is used.

So the first thing that happens in this program is that putStrLn starts running.

GHC's implementation of putStrLn works by copying the characters of the argument String into an output buffer. But when it enters this loop, show has not run yet. Therefore, when it goes to copy the first character from the string, Haskell evaluates just enough of the show and quicksort calls to produce that character. Then putStrLn moves on to the next character. So the execution of all three functions (putStrLn, show, and quicksort) is interleaved. quicksort executes incrementally, leaving a graph of unevaluated thunks as it goes to remember where it left off.

Now this is wildly different from what you might expect if you're familiar with, you know, any other programming language ever. It's not easy to visualize how quicksort actually behaves in Haskell in terms of memory accesses or even the order of comparisons. If you could only observe the behavior, and not the source code, you would not recognize what it's doing as a quicksort.

For example, the C version of quicksort partitions all the data before the first recursive call. In the Haskell version, the first element of the result will be computed (and could even appear on your screen) before the first partition is finished running—indeed before any work at all is done on greater.

P.S. The Haskell code would be more quicksort-like if it did the same number of comparisons as quicksort; the code as written does twice as many comparisons because lesser and greater are specified to be computed independently, doing two linear scans through the list. Of course it's possible in principle for the compiler to be smart enough to eliminate the extra comparisons; or the code could be changed to use Data.List.partition.
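A sketch of that Data.List.partition variant (my rewrite, reusing the question's names), which splits the list in one pass per level instead of two:

```haskell
import Data.List (partition)

quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = quicksort lesser ++ [p] ++ quicksort greater
  where
    -- a single traversal produces both halves at once
    (lesser, greater) = partition (< p) xs
```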

P.P.S. The classic example of Haskell algorithms turning out not to behave how you expected is the sieve of Eratosthenes for computing primes.

Jason Orendorff
  • 37,255
  • 3
  • 56
  • 91
  • 2
    http://lpaste.net/108190. -- it's doing the "deforested tree sort", there's an [old reddit thread](http://www.reddit.com/r/programming/comments/2h0j2/real_quicksort_in_haskell) about it. cf. http://stackoverflow.com/questions/14786904/haskells-quicksort-what-is-it-really and related. – Will Ness Jul 26 '14 at 16:46
  • 1
    *looks* Yes, that's a pretty good characterization of what the program actually does. – Jason Orendorff Jul 28 '14 at 15:56
  • re the sieve remark, were it written as an equivalent ``primes = unfoldr (\(p:xs)-> Just (p, filter ((> 0).(`rem` p)) xs)) [2..]``, [its most immediate problem](http://stackoverflow.com/a/8871918/849891) would be perhaps clearer. And that's *before* we consider switching to the true sieve algorithm. – Will Ness Jul 31 '14 at 11:36
  • I'm confused by your definition of what code "looks like it does". Your code "looks" to me like it calls `putStrLn` which a thunked application of `show` to a thunked application of `quicksort` to a list literal --- and that's exactly what it does! (before optimization --- but compare C code to the optimized assembler sometime!). Maybe you mean "thanks to lazy evaluation, a Haskell program doesn't do what similar-looking code does in other languages"? – Jonathan Cast Dec 15 '14 at 17:13
  • 4
    @jcast I do think there's a practical difference between C and Haskell in this regard. It's really hard to carry on a pleasant debate about this kind of subject in a comment thread, as much as I'd love to have it out over coffee in real life. Let me know if you're ever in Nashville with an hour to spare! – Jason Orendorff Dec 20 '14 at 14:50
  • Does it all mean that `take 1 (quicksort list)` will do exactly as `minimum list` would, but with greater complexity? – Peter Jun 03 '17 at 15:41
17

I think the case this argument tries to make is that the reason why quicksort is commonly used is that it's in-place and fairly cache-friendly as a result. Since you don't have those benefits with Haskell lists, its main raison d'être is gone, and you might as well use merge sort, which guarantees O(n log n), whereas with quicksort you either have to use randomization or complicated partitioning schemes to avoid O(n^2) run time in the worst case.
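For comparison, a minimal bottom-up merge sort on immutable lists (a sketch, not from the answer) gives the guaranteed O(n log n) with no pivot concerns:

```haskell
-- Bottom-up merge sort: split into singleton runs, then repeatedly
-- merge adjacent pairs of runs until one run remains.
msort :: Ord a => [a] -> [a]
msort = mergeAll . map (: [])
  where
    mergeAll []    = []
    mergeAll [run] = run
    mergeAll runs  = mergeAll (mergePairs runs)
    mergePairs (a:b:rest) = merge a b : mergePairs rest
    mergePairs runs       = runs
    merge [] ys = ys
    merge xs [] = xs
    merge (x:xs) (y:ys)
      | x <= y    = x : merge xs (y:ys)
      | otherwise = y : merge (x:xs) ys
```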

Sjoerd Visscher
  • 11,530
  • 2
  • 45
  • 58
hammar
  • 134,089
  • 17
  • 290
  • 377
  • 5
    And Mergesort is a much more natural sorting algorithm for (immutable) linked lists, where it is freed from the need to work with auxiliary arrays. – hugomg Oct 11 '11 at 01:18
13

I believe that the reason most people say that the pretty Haskell Quicksort isn't a "true" Quicksort is the fact that it isn't in-place - clearly, it can't be when using immutable datatypes. But there is also the objection that it isn't "quick": partly because of the expensive ++, and also because there is a space leak - you hang on to the input list while doing the recursive call on the lesser elements, and in some cases - e.g. when the list is decreasing - this results in quadratic space usage. (You might say that making it run in linear space is the closest you can get to "in-place" using immutable data.) There are neat solutions to both problems, using accumulating parameters, tupling, and fusion; see §7.6.1 of Richard Bird's Introduction to Functional Programming Using Haskell.

Jeremy Gibbons
  • 381
  • 1
  • 3
4

Mutating elements in place does not fit a purely functional setting. The alternative answers in this thread that use mutable arrays lose the spirit of purity.

There are at least two steps to optimize the basic version (which is the most expressive version) of quick-sort.

  1. Optimize the concatenation (++), which is a linear operation, by accumulators:

    qsort xs = qsort' xs []
    
    qsort' [] r = r
    qsort' [x] r = x:r
    qsort' (x:xs) r = qpart xs [] [] r where
        qpart [] as bs r = qsort' as (x:qsort' bs r)
        qpart (x':xs') as bs r | x' <= x = qpart xs' (x':as) bs r
                               | x' >  x = qpart xs' as (x':bs) r
    
  2. Optimize to ternary quick sort (3-way partition, mentioned by Bentley and Sedgewick), to handle duplicated elements:

    tsort :: (Ord a) => [a] -> [a]
    tsort [] = []
    tsort (x:xs) = tsort [a | a<-xs, a<x] ++ x:[b | b<-xs, b==x] ++ tsort [c | c<-xs, c>x]
    
  3. Combine 1 and 2; refer to Richard Bird's book:

    psort xs = concat $ pass xs []
    
    pass [] xss = xss
    pass (x:xs) xss = step xs [] [x] [] xss where
        step [] as bs cs xss = pass as (bs:pass cs xss)
        step (x':xs') as bs cs xss | x' <  x = step xs' (x':as) bs cs xss
                                   | x' == x = step xs' as (x':bs) cs xss
                                   | x' >  x = step xs' as bs (x':cs) xss
    

Or alternatively if the duplicated elements are not the majority:

    tqsort xs = tqsort' xs []

    tqsort' []     r = r
    tqsort' (x:xs) r = qpart xs [] [x] [] r where
        qpart [] as bs cs r = tqsort' as (bs ++ tqsort' cs r)
        qpart (x':xs') as bs cs r | x' <  x = qpart xs' (x':as) bs cs r
                                  | x' == x = qpart xs' as (x':bs) cs r
                                  | x' >  x = qpart xs' as bs (x':cs) r

Unfortunately, median-of-three can't be implemented with the same effect, for example:

    qsort [] = []
    qsort [x] = [x]
    qsort [x, y] = [min x y, max x y]
    qsort (x:y:z:rest) = qsort (filter (< m) (s:rest)) ++ [m] ++ qsort (filter (>= m) (l:rest)) where
        xs = [x, y, z]
        [s, m, l] = [minimum xs, median xs, maximum xs]  -- median: hypothetical helper returning the middle of the three

because it still performs poorly for the following 4 cases:

  1. [1, 2, 3, 4, ...., n]

  2. [n, n-1, n-2, ..., 1]

  3. [m-1, m-2, ...3, 2, 1, m+1, m+2, ..., n]

  4. [n, 1, n-1, 2, ... ]

All these 4 cases are well handled by imperative median-of-three approach.

Actually, the most suitable sort algorithm in a purely functional setting is still merge-sort, not quick-sort.

For detail, please visit my ongoing writing at: https://sites.google.com/site/algoxy/dcsort

Community
  • 1
  • 1
Larry LIU Xinyu
  • 1,853
  • 1
  • 12
  • 10
  • There's another optimization you missed: use partition instead of 2 filters to produce the sub-lists (or foldr on a similar inner function to produce 3 sub-lists). – Jeremy List Dec 26 '14 at 05:52
3

There is no clear definition of what is and what isn't a true quicksort.

They are calling it not a true quicksort, because it doesn't sort in-place:

True quicksort in C sorts in-place

Piotr Praszmo
  • 16,785
  • 1
  • 51
  • 60
0

It looks like the Haskell version would keep allocating more space for each sublist it creates, so it might run out of memory at scale. Having said that, it's much more elegant. I suppose that's the trade-off you make when you choose functional over imperative programming.

Dan
  • 31
  • 2
-1

Because taking the first element from the list results in very bad runtime. Use median of 3: first, middle, last.
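A list-based sketch of that idea (my code; `medianOfThree` and `qsortM3` are made-up names), with a three-way split so duplicates of the pivot are not recursed on:

```haskell
-- Middle value of three, without sorting them.
medianOfThree :: Ord a => a -> a -> a -> a
medianOfThree a b c = max (min a b) (min (max a b) c)

-- Pivot on the median of the first, middle, and last elements.
qsortM3 :: Ord a => [a] -> [a]
qsortM3 []  = []
qsortM3 [x] = [x]
qsortM3 xs  = qsortM3 smaller ++ equal ++ qsortM3 larger
  where
    p       = medianOfThree (head xs) (xs !! (length xs `div` 2)) (last xs)
    smaller = [x | x <- xs, x < p]
    equal   = [x | x <- xs, x == p]
    larger  = [x | x <- xs, x > p]
```

Note that on lists, reaching the middle element is an extra linear traversal anyway, so this only defuses the sorted-input worst case; it does not fix the constant factors.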

Joshua
  • 34,237
  • 6
  • 59
  • 120
  • 2
    Taking the first element is ok if the list is random. – Keith Thompson Oct 10 '11 at 19:54
  • 2
    But sorting a sorted or nearly sorted list is common. – Joshua Oct 10 '11 at 19:58
  • True, taking the first element is ok if the list is random, but to be robust you have to use the median of three. Otherwise you can end up with an n^2 algorithm. – dbeer Oct 10 '11 at 19:59
  • I don't really support that argument. – fuz Oct 10 '11 at 20:11
  • 8
    qsort is average n log n, worst n^2. – Joshua Oct 10 '11 at 20:13
  • 3
    Technically, it's not any worse than picking a random value unless the input is already sorted or nearly sorted. Bad pivots are the pivots that are away from the median; the first element is only a bad pivot if it is near the minimum or maximum. – Platinum Azure Oct 10 '11 at 20:19
  • Also note: Selecting the median of 3 for the pivot would not increase the big-Oh complexity of the Haskell version, which traverses the entire sublist at each level of recursion anyways. – Dan Burton Oct 11 '11 at 00:05
  • The choice of the pivot is a small detail, the underlying cause is the fact it's not in-place. – sdcvvc Jan 15 '13 at 13:20
-1

Ask anybody to write quicksort in Haskell, and you will get essentially the same program--it is obviously quicksort. Here are some advantages and disadvantages:

Pro: It improves on "true" quicksort by being stable, i.e. it preserves sequence order among equal elements.

Pro: It is trivial to generalize to a three-way split (< = >), which avoids quadratic behavior due to some value occurring O(n) times.

Pro: It's easier to read--even if one had to include the definition of filter.

Con: It uses more memory.

Con: It is costly to generalize the pivot choice by further sampling, which could avoid quadratic behavior on certain low-entropy orderings.
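The stability claim can be checked with a small sketch (mine; `qsortOn` is a hypothetical name): sort pairs on the first component only and watch equal keys keep their input order.

```haskell
-- Filter-based quicksort on a key. Filters preserve order and the
-- pivot precedes later equal keys, so equal keys stay in their
-- original relative order, i.e. the sort is stable.
qsortOn :: Ord k => (a -> k) -> [a] -> [a]
qsortOn _   []     = []
qsortOn key (p:xs) =
       qsortOn key [x | x <- xs, key x <  key p]
    ++ [p]
    ++ qsortOn key [x | x <- xs, key x >= key p]
```

For example, qsortOn fst [(2,'a'),(1,'x'),(2,'b')] yields [(1,'x'),(2,'a'),(2,'b')]: the two entries with key 2 keep their order.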

mercator
  • 15
  • 1