
so the function would be something like `primesearch :: Int -> [Int]`. For example, `primesearch 4 = [2,3,5,7]`. Would you need to use the sieve function somehow, or is there another way of doing it?

Will Ness
user1253510

4 Answers


To generate the first k prime numbers, or the primes <= n, I recommend a sieve. Which kind of sieve depends on how many primes you want. For small numbers of primes, a monolithic Eratosthenes bit sieve is simple and fast. But if you want large numbers of primes, a monolithic sieve needs too much memory, so a segmented sieve is the better option. For very small numbers of primes (the primes <= 100000, say), trial division is even easier, and still sufficiently fast.
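For illustration, the trial-division approach can be sketched in a few lines (a minimal sketch; the names `primes` and `primesTo` are just illustrative, testing each candidate only against the primes up to its square root):

```haskell
-- Trial-division sketch: each odd candidate is tested only
-- against the already-found primes up to its square root.
primes :: [Int]
primes = 2 : [x | x <- [3,5..], noDivisors x]
  where
    noDivisors x = all (\p -> x `mod` p /= 0)
                       (takeWhile (\p -> p*p <= x) primes)

-- All primes <= n.
primesTo :: Int -> [Int]
primesTo n = takeWhile (<= n) primes
```

For example, `primesTo 30` yields `[2,3,5,7,11,13,17,19,23,29]`. Laziness keeps this from looping: each candidate only forces primes up to its square root, which have already been produced.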

If you want to earnestly use primes, there are already packages on Hackage that provide prime generators, for example arithmoi and NumberSieves. There are others, but as far as I know, all of them are significantly slower.

If it's for homework or similar, the most appropriate method depends on what the exercise is meant to teach.

Daniel Fischer

I'm learning and playing around with Haskell, specifically learning list comprehensions, so I tried to come up with one that represents "all" the prime numbers.

I came up with this:

[x | x <- 2:[3,5..], and( map (\y -> 0 /= x `mod` y) (take (x`quot`4-1) [3,5..]))]

And the first n primes can be retrieved with the `take` function:

take n [x | x <- 2:[3,5..], and( map (\y -> 0 /= x `mod` y) (take (x`quot`4-1) [3,5..]))]

which works without infinite loops, thanks to lazy evaluation of the infinite ranges.
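Naming the comprehension and wrapping it with `take` gives the exact function shape from the question (a sketch; `primesearch` is the question's name, the rest is the comprehension from above):

```haskell
-- The comprehension above, given a name...
primes :: [Int]
primes = [x | x <- 2:[3,5..],
              and (map (\y -> 0 /= x `mod` y) (take (x `quot` 4 - 1) [3,5..]))]

-- ...and wrapped to match the signature asked for in the question.
primesearch :: Int -> [Int]
primesearch n = take n primes
```

so `primesearch 4` yields `[2,3,5,7]`.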

The meaning is: it is a list comprehension over all "odd numbers, plus the number 2"

[x | x <- 2:[3,5..], ... ]

With the condition that, for each number in the list, it "shall always be true"...

and( ... )

...that "the remainder is not 0" when the number is divided by each of...

map (\y -> 0 /= x `mod` y)

...roughly the "first half" of those less than the candidate number...

take (x`quot`4-1)

...from the set of "all odd numbers".

 [3,5..]

I'm too much of a beginner in Haskell to judge its efficiency for real applications (it is probably good only for small numbers), but I find it cool that in Haskell it is possible to express the list of primes this way.
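One possible refinement, sketched below: cutting the trial divisors off at the square root of the candidate (rather than at about half of it) is all that is needed to detect a composite, and makes the same idea considerably faster:

```haskell
-- The same comprehension, but the trial divisors stop at sqrt x:
-- any composite x must have a divisor no larger than its square root.
primes' :: [Int]
primes' = [x | x <- 2:[3,5..],
               all (\y -> x `mod` y /= 0)
                   (takeWhile (\y -> y*y <= x) [3,5..])]
```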


Another fun article is http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf. It is referenced by qrl's link, but is worth checking out on its own. It provides better explanations than qrl's link, but does not provide nearly as many implementations.

Retief

Here's the fastest of the simplest, in the low ranges of up to a million primes or so:

{-# OPTIONS_GHC -O2 #-}
import Data.Array.Unboxed

primesToA m = sieve 3 (array (3,m) [(i,odd i) | i<-[3..m]]
                       :: UArray Int Bool)
  where
    sieve p a 
      | p*p > m   = 2 : [i | (i,True) <- assocs a]
      | a!p       = sieve (p+2) $ a//[(i,False) | i <- [p*p, p*p+2*p..m]]
      | otherwise = sieve (p+2) a

(Thanks to Daniel Fischer for adding this little thing called an explicit type signature here, thus making it work on unboxed arrays.) The kicker is, there seems to be a destructive update going on behind the scenes. (Apparently not; see the comments below.)

As for the JFP article, it misses the key reason for the inefficiency of David Turner's sieve code (the sqrt thing) -- in fact it dismisses it as irrelevant -- and, alongside its sound and enlightening mathematical analysis, offers some pretty confusing musings about the sieve.


edit: this was in response to your title, but from the text it seems you want to generate a set number of primes, not the primes up to a given value. The upper limit value is easy to (over-)estimate, so that

nPrimes n | n > 3 =
  let 
    x = fromIntegral n
    m = ceiling $ x*(log x + log (log x))
  in
    take n $ primesToA m
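The estimate rests on a known bound (Rosser's theorem: p_n < n*(log n + log (log n)) for n >= 6), so sieving up to m is guaranteed to find at least n primes. As a quick sanity check, here is the estimate on its own (the name `bound` is just for illustration):

```haskell
-- The upper-limit overestimate used in nPrimes, on its own.
-- By Rosser's theorem this exceeds the n-th prime for n >= 6.
bound :: Int -> Int
bound n = let x = fromIntegral n :: Double
          in  ceiling (x * (log x + log (log x)))
```

For n = 1000 this gives `bound 1000 = 8841`, comfortably above the 1000th prime, 7919.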

update: an efficient list-based genuine sieve of Eratosthenes, using library functions from the data-ordlist package:

import qualified Data.List.Ordered as O

primes = 2 : 3 : [5, 7..] `O.minus`
                   O.unionAll [[p*p, p*p+2*p..] | p <- tail primes]
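If data-ordlist isn't installed, `minus` and `unionAll` can be sketched directly. This naive `unionAll` is correct for an infinite list of increasing lists whose heads are increasing (as is the case here), though the library's tree-shaped merge is faster:

```haskell
-- Ordered-list difference: drop from xs everything in ys.
-- Both lists must be increasing.
minus :: Ord a => [a] -> [a] -> [a]
minus xs@(x:xt) ys@(y:yt) = case compare x y of
  LT -> x : minus xt ys
  EQ ->     minus xt yt
  GT ->     minus xs yt
minus xs _ = xs

-- Ordered union of two increasing lists, without duplicates.
union :: Ord a => [a] -> [a] -> [a]
union xs@(x:xt) ys@(y:yt) = case compare x y of
  LT -> x : union xt ys
  EQ -> x : union xt yt
  GT -> y : union xs yt
union xs [] = xs
union [] ys = ys

-- Naive union of an infinite list of increasing lists with
-- increasing heads: the head of the first list is smallest.
unionAll :: Ord a => [[a]] -> [a]
unionAll ((x:xs):t) = x : union xs (unionAll t)

primes :: [Int]
primes = 2 : 3 : [5,7..] `minus`
                   unionAll [[p*p, p*p+2*p..] | p <- tail primes]
```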
Will Ness
  • @is7s the array-update (//) in tail-recursive call. I deduce this from its *low-ish* empirical run-time cpxty of `~k^1.3` (log of ratio of run times in base of ratio of sizes), and from the fact that with the explicit signature removed, it runs at O(k^1.8) and above (in `k` primes produced), and orders of magnitude slower. (trial division runs at `k^1.45` and good PQ or tree-folding code at `k^1.2`) – Will Ness Mar 07 '12 at 16:44
  • I think you are mistaken. Such an optimisation is not possible since a `UArray` is immutable. The fact that it's relatively fast doesn't mean that it uses destructive updates. – is7s Mar 07 '12 at 16:52
  • @is7s run-time cpxty is my evidence. for a sieve up to `m` producing `k` primes, we have `pi(sqrt m)` steps for each prime `p_i` below it, each generating and removing `n_i = O(m/p_i)` multiples. Generating `n_i` multiples obviously takes `n_i` time. If each is removed at `O(1)` time we get theoretical cpxty of `O(k*log k*log(log k))`, which is empirically below `O(k^1.2)`. – Will Ness Mar 07 '12 at 16:58
  • @is7s or maybe not fully destructive (for it does run above what it should if that were the case) but at least it seems to update the whole list of multiples at once. I interpret the increased cpxty of boxed arrays as each removal step for each prime taking much longer, cpxty-wise as well. I don't remember exactly but memory consumption should be drastically smaller as well for the unboxed, which again I'd interpret as evidence towards destructive update. – Will Ness Mar 07 '12 at 17:03
  • @is7s and it **is** possible because `a` is an accumulator parameter going out of local scope, in tail position, which makes it a perfect candidate for call frame reuse which is what TCO *is* (and not just jumps). – Will Ness Mar 07 '12 at 17:05
  • There's not destructive updates at all, actually there *cannot* be any! `a // xs` returns a whole new array with each recursive call. The speed here comes from the fact that the data is unboxed and that there are no unneeded thunks since `UArray`s are strict. – is7s Mar 07 '12 at 17:12
  • @is7s it don't have to (return a new I mean). It goes out of scope, no-body has a handle on it, it is *uniquely* accessed here, sits in tail position thus a perfect candidate for re-use, and if it doesn't get destructive update I consider it a compiler's deficiency. – Will Ness Mar 07 '12 at 17:16
  • Actually I will consider it a deficiency if it does the destructive update. What will happen if I want to reuse the array later on? – is7s Mar 07 '12 at 17:20
  • @is7s running both up to 100,000, boxed version consumes 18x more memory, and runs 2.4x slower. Up to 200,000 - 17x/3.7x ; 500,000 - 30x space, 7x time. – Will Ness Mar 07 '12 at 17:24
  • @is7s you can't reuse that array as you have no way to refer to it. It is created in the intial call, not named. And if it were created named, then simply it should get copied when first entering the TCO-optimized code. **One** copy, then destructively updated - **reused** fully, as per TCO. – Will Ness Mar 07 '12 at 17:27
  • What you're saying is not valid. You are just assuming it does so when in fact it can't. The benchmarks you've showed in the previous comment are due to the fact that unboxed data is *strict* and is much more memory efficient than it's boxed counterpart. That said, I think you need to see the source code for the (//) function in the `array` package to make sure for yourself. – is7s Mar 07 '12 at 17:34
  • @is7s but that's hypothetical. Here, even if created *named* in `primesToA` prior to being passed into the `sieve` call, it is still perfectly visible that it **isn't** used by anything - `sieve` being a *local* function. I don't demand *full-program optimization* here, just optimization of a function (`primesToA`), which calls its own inner function (`sieve`) to express a simple **loop**, nothing more. Loop's state vars should get **fully reused** in any language. – Will Ness Mar 07 '12 at 17:38
  • @is7s I get what you're saying, and it might not be doing what I thought it did, but it's about optimizations by the GHC. I'll look into the source, sure, thanks for the pointers. :) – Will Ness Mar 07 '12 at 17:41
  • I can assure you that GHC does not do such optimisations. I asked a GHC expert about it before and he confirmed it. There's also a very similar question on SO [here](http://stackoverflow.com/questions/6665821/are-new-vectors-created-even-if-the-old-ones-arent-used-anymore). – is7s Mar 07 '12 at 17:49