3

I am a newbie to Haskell programming and have trouble understanding how the below list comprehension expands.

primes = sieve [2..] 
sieve (p:xs) = p : sieve [x | x <-xs, x `mod` p /= 0]

Can someone correct me how the sieve expansion works:

  • As we are pattern matching in sieve, p would associate to 2 and xs from [3..].
  • Next, in the list comprehension x<-3, but then how can we call sieve with just 3 when there is no short-circuit.

Other thing I do not understand is how recursion works here.

I think it would be clear if one could expand the above one step at a time for first few numbers, say until 5.

Mihai Maruseac
  • 19,378
  • 6
  • 52
  • 107
Rohit Sharma
  • 5,024
  • 2
  • 24
  • 38
  • possible duplicate of [Help explain this chunk of haskell code that outputs a stream of primes](http://stackoverflow.com/questions/1764163/help-explain-this-chunk-of-haskell-code-that-outputs-a-stream-of-primes) – Mihai Maruseac Dec 05 '14 at 17:19
  • No it is not duplicate, please read my question again – Rohit Sharma Dec 06 '14 at 01:05
  • It is if you inline everything in one function and translate the guard in list comprehension to `filter`. – Mihai Maruseac Dec 06 '14 at 03:20

3 Answers3

8

Let's do some equational reasoning.

primes = sieve [2..]
sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

The [2..] is sintactic sugar for [2, 3, 4, 5, ...] so

primes = sieve [2, 3, 4, 5, 6, ...]

Inline sieve once:

primes = 2 : sieve [x | x <- [3, 4, 5, 6, 7, ...], x `mod` 2 /= 0]

First, x gets value 3 which passes the mod 2 filter

primes = 2 : sieve (3 : [x | x <- [4, 5, 6, 7, ...], x `mod` 2 /= 0])

Inline sieve again (I renamed x to y to prevent confusions)

primes = 2 : 3 : sieve [y | y <- [x | x <- [4, 5, 6, 7, ...], x `mod` 2 /= 0], 
                            y `mod` 3 /= 0]

Now x = 4 fails the mod 2 filter but x = 5 passes it. So

primes = 2 : 3 : sieve [y | y <- 5 : [x | x <- [6, 7, 8, ...], x `mod` 2 /= 0], 
                            y `mod` 3 /= 0]

This y = 5 also passes the mod 3 filter so now we have

primes = 2 : 3 : sieve (5 : [y | y <- [x | x <- [6, 7, 8, ...], x `mod` 2 /= 0], 
                                 y `mod` 3 /= 0])

Expanding sieve one more time (z instead of y) gets us to

primes = 2 : 3 : 5 : sieve [z | z <- [y | y <- [x | x <- [6, 7, 8, ...], 
                                                    x `mod` 2 /= 0], 
                                          y `mod` 3 /= 0], 
                                z `mod` 5 /= 0]

And the expansion continues on the same way.

Mihai Maruseac
  • 19,378
  • 6
  • 52
  • 107
3

Here is an operational description of what sieve does.

To compute sieve (x:xs):

  1. Emit the leading element x.
  2. From the tail xs, let ys be the list xs with all of the multiples of x removed.
  3. To generate the next element, recursively call sieve on ys defined in step 2.

Here is how the first couple of terms are computed:

sieve [2..]
  = sieve (2:[3..])              -- x = 2, xs = [3..]
  = 2 : sieve ys
      where ys = [3..] with all of the multiples of 2 removed
               = [3,5,7,9,...]
  = 2 : sieve [3,5,7,9,...]

and then:

sieve [3,5,7,9,...]              -- x = 3, xs = [5,7,9,11,...]
  = 3 : sieve ys
      where ys = [5,7,9,11,13,15,17,...] with all of the multiples of 3 removed
               = [5,7,  11,13,   17,...]
  = 3 : sieve [5,7,11,13,17,...]

and then:

sieve [5,7,11,13,17,...]         -- x = 5, xs = [7,11,13,17..]
  = 5 : sieve ys
      where ys = [7, 11,13, 17,19,...]  with all of the multiples of 5 removed
               = [7, 11,13, 17,19,...]  (the first one will be 25, then 35,...)
  = 5 : sieve [7,11,13,17,19,...]

etc.

Will Ness
  • 62,652
  • 8
  • 86
  • 167
ErikR
  • 50,049
  • 6
  • 66
  • 121
3

Using an auxiliary function

transform (p:xs) = [x | x <- xs, mod x p /= 0]
                 = filter (\x-> mod x p /= 0) xs  -- remove all multiples of p
                 = xs >>= noMult p                -- feed xs through a tester
-- where
noMult p x = [x | rem x p > 0]      -- keep x if not multiple of p

we can rewrite the sieve function as

       .=================================================+
       |                                                 |
       |  sieve input =                                  |
       |                  .===========================+  |
       |                  |                           |  |
 <~~~~~~~~~~ head input : | sieve (transform input )  |  |
       |                  |                           |  |
       |                  \___________________________|  |
       |                                                 |
       \_________________________________________________|

In an imperative pseudocode, we could write it as

sieve input =
    while (True) :
        yield (head input)           -- wait until it's yanked, and then
        input := transform input     -- advance and loop

This pattern of repeated applications is known as iteration:

iterate f x = loop x
  where
    loop x = x : loop (f x)   -- [x, f x, f (f x), f (f (f x)), ...]

So that

sieve xs = map head ( iterate transform xs )

Naturally, head element of each transformed sequence on each step will be a prime, since we've removed all the multiples of the preceding primes on previous steps.

Haskell is lazy, so transformations won't be done in full on each step, far from it -- only as much of the work will be done as needed. That means producing only the first element, and "making a notice" to perform the transformation further when asked:

<--  2 ---<    [2..]
<--  3 ---<    [3..] >>= noMult 2 
<--  5 ---<   ([4..] >>= noMult 2) >>= noMult 3 
<--  7 ---<  (([6..] >>= noMult 2) >>= noMult 3) >>= noMult 5
<-- 11 ---< ((([8..] >>= noMult 2) >>= noMult 3) >>= noMult 5) >>= noMult 7
            ...............

Incidentally, this should give us an idea: 3 needn't really be tested by 2; 4..8 needn't really be tested by 3, let alone 5 or 7; 9..24 shouldn't really be tested by 5; etc. What we want is the following:

<-- 2
<-- 3         --<  2(4),    [3..]
<-- 5,7       --<  3(9),    [4..]  >>= noMult 2 
<-- 11,...,23 --<  5(25),  ([9..]  >>= noMult 2) >>= noMult 3 
<-- 29,...,47 --<  7(49), (([25..] >>= noMult 2) >>= noMult 3) >>= noMult 5
                   ......................

i.e. we want the creation of each >>= noMult p filter postponed until p*p is reached in the input.

Will Ness
  • 62,652
  • 8
  • 86
  • 167