8

I was reading on different sieving algorithms when I stumbled upon a kind of improved version of the Sieve of Eratosthenes called Euler's Sieve. According to Wikipedia there is an implementation of an slightly different version of the idea (called Turner's Sieve) in Haskell.

Now I am trying to understand what exactly the code snippet given does and I think I've got it but now I wanted to translate the code into F# and have really no idea where to start. My main concern is that there doesn't seem to be a function to "substract" two sequences.

Here's the code:

import Data.OrdList (minus)

primes = euler [2..]
euler (p : xs) = p : euler (xs `minus` map (*p) (p : xs))

How would this be implemented in F#? Is it even possible?

Will Ness
  • 62,652
  • 8
  • 86
  • 167
chrischu
  • 2,931
  • 3
  • 25
  • 43
  • 2
    Perhaps also of interest: http://fsharpnews.blogspot.com/2010/02/sieve-of-eratosthenes.html – kvb Feb 24 '10 at 17:38
  • ... Or the "sieve of Atkin": http://blog.hoomla.se/post/The-sieve-of-Atkin-in-F.aspx – Johan Kullbom Feb 24 '10 at 18:24
  • The above 2 links aren't Turner's (or Euler's) Sieve but the Sieve of Eratosthenes. Although they are quite fast up to a limited range of primes as compared to naive implementations of sieves, both aren't much faster than functional versions in spite of being imperative. Johan Kulboon's version isn't even as fast as he thinks it is as he times **primes to 10 million** not the first 10 million primes, and the first link blows up at about the first 30 million primes. Try http://stackoverflow.com/a/17668738/549617 for a functional version about the same speed as the above links. – GordonBGood Nov 01 '13 at 02:50

4 Answers4

9

If you want to calculate things like merges/differences of infinite lists like Haskell does, the LazyList type (found inside the F# power pack) springs to mind.

It makes for very verbose code, like the translation below:

#r "FSharp.PowerPack.dll"

//A lazy stream of numbers, starting from x
let rec numsFrom x = LazyList.consDelayed x (fun () -> numsFrom (x+1))

//subtracts L2 from L1, where L1 and L2 are both sorted(!) lazy lists
let rec lazyDiff L1 L2 =
    match L1,L2 with
        | LazyList.Cons(x1,xs1),LazyList.Cons(x2,xs2) when x1<x2 ->
            LazyList.consDelayed x1 (fun () -> lazyDiff xs1 L2)
        | LazyList.Cons(x1,xs1),LazyList.Cons(x2,xs2) when x1=x2 ->
            lazyDiff xs1 L2
        | LazyList.Cons(x1,xs1),LazyList.Cons(x2,xs2) when x1>x2 ->
            lazyDiff L1 xs2
        | _ -> failwith "Should not occur with infinite lists!"

let rec euler = function
    | LazyList.Cons(p,xs) as LL ->
        let remaining = lazyDiff xs (LazyList.map ((*) p) LL)
        LazyList.consDelayed p (fun () -> euler remaining)
    | _ -> failwith "Should be unpossible with infinite lists!"

let primes = euler (numsFrom 2)

with

> primes |> Seq.take 15 |> Seq.toList;;
val it : int list = [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47]

Note that I added two failwith clauses to keep the compiler from complaining about an incomplete match, even if we know that all lists in the calculation are (lazily) infinite.

cfern
  • 5,616
  • 2
  • 22
  • 22
2

First, it must be said that the Euler's sieve for prime numbers is not a "improved version of the Sieve of Eratosthenes" as its performance in every sense is much worse than any version of the Sieve of Eratosthenes: Haskell Wiki on Prime Number algorithms - Euler

Next, it should be said that @cfem's code using LazyList's is a faithful although verbose translation of the version of Euler's sieve that you gave, although it lacks the slight improvement of only sieving odd numbers as per the link above.

It should be noted that there isn't really any point in implementing the Euler sieve, as it is more complex and slower than finding primes by Trial Division Optimized (TDO) as to only doing divisions by found primes up to the square root of the candidate number tested for prime as per: Haskell Wiki on Prime Number algorithms - TDO.

This Trial Division Optimized (TDO) sieve can be implemented in F# using LazyList's (with a reference to FSharp.PowerPack.dll) as:

let primesTDO() =
  let rec oddprimes =
    let rec oddprimesFrom n =
      if oddprimes |> Seq.takeWhile (fun p -> p * p <= n) |> Seq.forall (fun p -> (n % p) <> 0)
      then LazyList.consDelayed n (fun() -> oddprimesFrom (n + 2))
      else oddprimesFrom (n + 2)
    LazyList.consDelayed 3 (fun() -> oddprimesFrom 5)
  LazyList.consDelayed 2 (fun () -> oddprimes)

It can be implemented using sequences in the same form as:

let primesTDOS() =
  let rec oddprimes =
    let rec oddprimesFrom n =
      if oddprimes |> Seq.takeWhile (fun p -> p * p <= n) |> Seq.forall (fun p -> (n % p) <> 0)
      then seq { yield n; yield! oddprimesFrom (n + 2) }
      else oddprimesFrom (n + 2)
    seq { yield 3; yield! (oddprimesFrom 5) } |> Seq.cache
  seq { yield 2; yield! oddprimes }

The sequence version is slightly faster than the LazyList version because it avoids some overhead in calling through since LazyList's are based on cached sequences. Both use an internal object which represents a cached copy of the primes found so far, automatically cached in the case of LazyList's, and by the Seq.cache in the case of sequences. Either can find the first 100,000 primes in about two seconds.

Now, the Euler sieve can have the odd number sieving optimization and be expressed using LazyList's as the following, with one match case eliminated due to knowing that the input list parameters are infinite and the compare match simplified, as well I've added an operator '^' to make the code more readable:

let primesE = //uses LazyList's from referenced FSharp.PowerPack.dll version 4.0.0.1
  let inline (^) a ll = LazyList.cons a (LazyList.delayed ll) //a consd function for readability
  let rec eulers xs =
    //subtracts ys from xs, where xs and ys are both sorted(!) infinite lazy lists
    let rec (-) xs ys =
      let x,xs',ys' = LazyList.head xs,LazyList.tail xs,LazyList.tail ys
      match compare x ( LazyList.head ys) with
        | 0 -> xs' - ys' // x == y
        | 1 -> xs - ys' // x > y
        | _ -> x^(fun()->(xs' - ys)) // must be x < y
    let p = LazyList.head xs
    p^(fun() -> (((LazyList.tail xs) - (LazyList.map ((*) p) xs)) |> eulers))
  let rec everyothernumFrom x = x^(fun() -> (everyothernumFrom (x + 2)))
  2^(fun() -> ((everyothernumFrom 3) |> eulers))

However, it must be noted that the time to calculate the 1899th prime (16381) is about 0.2 and 0.16 seconds for the primesTDO and primesTDOS, respectively, while it is about 2.5 seconds using this primesE for a terrible performance for the Euler sieve at over ten times the time even for this small range. In addition to terrible performance, primeE cannot even calculate primes to 3000 due do even worse memory utilization as it records a rapidly increasing number of deferred execution functions with increasing found primes.

Note that one must be careful in repeated timing of the code as written since the LazyList is a value and has built-in memorization of previously found elements, thus a second identical scan will take close to zero time; for timing purposes it might be better to make the PrimeE a function as in PrimeE() so the work starts from the beginning each time.

In summary, the Euler sieve implemented in any language including F# is only an interesting intellectual exercise and has no practical use as it is much slower and hogs memory much worse than just about every other reasonably optimized sieve.

GordonBGood
  • 3,017
  • 30
  • 38
  • your conclusion might not hold in C/C++, for contiguous array-based (i.e. *bounded*) sieves. – Will Ness Jun 21 '13 at 14:54
  • also, would be interesting to see how the tree-merging Eratosthenes performs in F#, or whether it can be implemented at all ... (or merging on a sliding array segment ...). Eratosthenes is supposed to be faster than TD, maybe it's worth a try. :) – Will Ness Jun 21 '13 at 15:15
  • @Will Ness regarding your first comment, I don't think one can implement the Euler sieve imperatively other than for a bounded range and that becomes the same as the Sieve of Eratosthenes (SoE) because by definition the Euler Sieve cancels composites of all remaining numbers starting at the square of found primes, which will result in the definition of SoE in that the composites will be cancelled by multiples of the found prime spans: [link](http://www.haskell.org/haskellwiki/Prime_numbers#Unbounded_Euler.27s_sieve). The inefficiencies of the Euler Sieve are in the unbound definition. – GordonBGood Jun 28 '13 at 04:50
  • @Will Ness, about implementing the much faster Tree Merging and Map based versions of prime sieves, I am working on it, and find that there isn't anything that Haskell can do that F# can not, even using pure functional code, although perhaps more verbose. The SoE will indeed be faster than TD for larger ranges of Primes, as these versions of TD start to run out of steam at about 100,000 primes and definitely at 1,000,000 primes due to O(n^1.5) time. Unbounded Euler is about O(2) time or worse (due to high memory use) and no true implementation of SoE will be worse than TD with many better. – GordonBGood Jun 28 '13 at 05:03
  • yes, unbounded Euler's is `~ n^2` or worse; btw [here's something similar](http://lambda-the-ultimate.org/node/3127#comment-45669). :) Actually, [there *are* slower SoEr impls](http://stackoverflow.com/q/11718067/849891) than _even suboptimal_ TD ([also this one](http://stackoverflow.com/q/16866849/849891)). As long as it generates the multiples by counting, it's *"true"* in *my* book. :) – Will Ness Jun 28 '13 at 10:30
  • @Will Ness, your first link (the something similar) leads to the Richard Bird version of EoSr (recursive?), which is indeed the EoS algorithm but runs at about O(n^1.5) just as does the optimized TD. I agree that the first of the slow (EoS?) links is a true EoSr as it marks all numbers from "found prime" in increments of prime by advancing up the remaining list one by one, but it is slow for the reasons as pointed out. The Python algorithm is actually bounded but still extremely slow as you point out in your answer there. But we digress, as this question is about the Euler Sieve. – GordonBGood Jun 29 '13 at 00:33
  • right; just a slight correction, that's *not* Richard Bird's version; his is much simpler, it doesn't try to avoid generating duplicate multiples, as the sieve of Er's also doesn't. But that version is similar to Euler's sieve in that they both avoid generation of duplicate multiples - i.e. they generate/remove 45 only once (as 3*15), but Er's and Bird's - twice (as 9+36 and 25+20). :) – Will Ness Jun 29 '13 at 01:25
  • @Will Ness, I stand corrected but now notice the extra code to eliminate the repeat composites, which the Euler Sieve does as well. The problem with that is it adds a lot of "recursiveness" to the algorithm to reduce the actual number of operations performed hardly at all. As expressed in Haskell: – GordonBGood Jun 29 '13 at 02:56
  • continued ...Haskell: primesEU = 2 : eulers [3,5..] where eulers (p:xs) = p : eulers (xs `minus` map (p*) (p:xs)) -- eratos (p:xs) = p : eratos (xs `minus` [p*p, p*p+2*p..]) One can see that one can change the map conversion line in the above code to: – GordonBGood Jun 29 '13 at 05:27
  • continued -- p^(fun() -> (((LazyList.tail xs) - (numFromBy (p*p) (p+p))) |> eulers)) with let rec numFromBy p i = p^(fun() -> (everyothernumFrom (x + i))) to have much better performance using EoS, poor as it may be compared to using some algorithms. – GordonBGood Jun 29 '13 at 05:38
  • @Will Ness, further discussions about functional versions of SoEr (r = recursive, meaning the incremental versions of the Sieve of Erotasthenes) should likely be on another question at [stackoverflow about EoS algorithms in F#](http://stackoverflow.com/questions/4629734/the-sieve-of-eratosthenes-in-f). – GordonBGood Jul 01 '13 at 02:03
  • I used SoEr for "sieve of Eratosthenes", as E could also be Euler. :) btw ``map (p*) (p:xs)`` vs ``[p*p,p*p+2*p..]`` is exactly the difference between Euler's and Eratosthenes'. Thnks for the link. :) – Will Ness Jul 01 '13 at 05:41
2

You can do it with seq. And as you got minus done, euler itself is same as in Haskell.

let rec minus xs ys =
  seq {
    match Seq.isEmpty xs, Seq.isEmpty ys with
    | true,_ | _,true -> yield! xs
    | _ ->
       let x,y = Seq.head xs, Seq.head ys
       let xs',ys' = Seq.skip 1 xs, Seq.skip 1 ys
       match compare x y with
       | 0 -> yield! minus xs' ys'
       | 1 -> yield! minus xs ys'
       | _ -> yield x; yield! minus xs' ys
  }

let rec euler s =
  seq {
    let p = Seq.head s
    yield p
    yield! minus (Seq.skip 1 s) (Seq.map ((*) p) s) |> euler
  }

let primes = Seq.initInfinite ((+) 2) |> euler
ssp
  • 1,692
  • 9
  • 10
  • Unfortunately, recursive sequences using `Seq.skip` are very inefficient. Although this is an elegant solution, it performs much, much worse than cfern's. – kvb Feb 24 '10 at 17:36
  • This seems to be related to the fact that sequences don't have built-in memoization and using Seq.cache as suggested by @toyvo doesn't work because it causes stack overflow, as does having every new sequence end with a "|> Seq.cache" (although much much faster up to that point) due to the recursive functions no longer being tail recursive. The advantage of using LazyList's for this algorithm is that they support built in memoization and thus the tail call recursion still works. – GordonBGood Jun 30 '13 at 04:35
2

With sequences, you get a lot more competitive by persisting the sequence:

let rec euler s =
    seq {
        let s = Seq.cache s
        let p = Seq.head s
        yield p
        yield! minus (Seq.skip 1 s) (Seq.map ((*) p) s) |> euler
    }
t0yv0
  • 4,678
  • 16
  • 35
  • Unfortunately, using Seq.cache as you do means quicker than usual stack overflow as a new cache is built for every recursive call/loop of the euler function, which can't be alleviated since as written the euler function receives a patentially different sequence every call as returned by the minus function. – GordonBGood Jun 21 '13 at 01:54