5

Is the following code for generating primes pythonic?

def get_primes(n):
    primes=[False,False]+[True]*(n-1)
    next_p=(i for i,j in enumerate(primes) if j)
    while True:
        p=next(next_p)
        yield p
        primes[p*p::p]=[False]*((n-p*p)//p+1)

Note that next(next_p) will eventually throw a StopIteration error which somehow ends the function get_primes. Is that bad?

Also note that next_p is a generator which iterates over primes, however primes changes during iteration. Is that bad style?

adding the following if statement gets it under 0.25 seconds for the first million primes:

if p*p<=n:
    primes[p*p::p]=[False]*((n-p*p)//p+1)
robert king
  • 14,265
  • 8
  • 84
  • 107
  • 1
    you can save one line if you want using `primes=[False,False]+[True]*(n-1)`, also adding complexity you can optimize to use half a sieve, skip even numbers. see http://stackoverflow.com/a/3035188/464543 – ChessMaster Feb 15 '12 at 03:59
  • 1
    test your code for 0,1,2,3 without the line `if p*p<=n:`... in my machine that line is not needed – ChessMaster Feb 15 '12 at 04:20
  • Thanks, I forgot [False]*(non positive) = [] – robert king Feb 15 '12 at 04:40
  • See [this question](http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python) for many examples in python for fast prime generation – Kimvais Feb 15 '12 at 09:00
  • Thanks Kimvais, I tested mine against the code there and mine is about 30x slower, but I wanted the emphasis on "pythonic" instead of speed. – robert king Feb 15 '12 at 09:05

3 Answers3

3

It's not bad that next(next_p) throws a StopIteration error -- that's what a generator always does when it runs out of items!

Changing the length of a list while iterating over it is a bad idea. But there's nothing wrong with simply changing the contents. Overall, I think this is a rather elegant, if basic, seive.

One small observation: when you "cross out" the multiples of prime numbers, you'll find, if you think about it for a bit, that you don't have to start with p * 2. You can skip ahead to p ** 2.

senderle
  • 125,265
  • 32
  • 201
  • 223
2

There is nothing wrong with the StopIteration, indeed that is the expected behaviour for generators.

I would say this implementation is more pythonic (not necessarily more efficient):

def get_primes(n):
  """Generates prime numbers < n"""
  return (x for x in xrange(2,n) if all(x % i for i in xrange(2,x)))

Pythonic to me means clear, concise, readable, and using the strengths of the language. While I can see your implementation is some sort of sieve, I only know that from prior experience with those kind of algorithms. The implementation above I can read directly as a straight-forward test of divisibility.


Note: there is a minor difference in the interface, your implementation would yield primes <= n whereas my implementation would yield primes < n. Obviously this can be changed easily and trivially (just change n to n+1 in the function body), but I feel it is more pythonic to generate primes up-to-but-not including n to be more consistent with the way, say, range() builtin works.


EDIT: JUST FOR FUN

Here is a least pythonic implementation, and probably pretty inefficient too :)

def get_primes(n):
  import re
  return (x for x in xrange(2,n) if re.match(r'^1?$|^(11+?)\1+$', '1' * x) is None)

I call this the least pythonic because you would be scratching your head for some days to figure out how it works if you haven't seen this trick before!!

wim
  • 266,989
  • 79
  • 484
  • 630
  • 1
    Thanks mate. Yeah this will get really slow once n gets much bigger than 1000, but is easy to read. – robert king Feb 15 '12 at 04:57
  • 1
    Note: I am only trying to answer about _pythonic_, if you are interested in the _fastest_ way in python, this question has been well covered [here](http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python). – wim Feb 15 '12 at 05:02
  • yup, I would write something like this if i quickly needed some small primes, or if i had 0 memory. I have used your idea to submit another answer which is not quite as pythonic but slightly more efficient. – robert king Feb 15 '12 at 08:55
0

Here is another somewhat pythonic solution motivated by @wim, however you can see it is slightly slower than the first method.

def get_primes2(n):
    primes=[]
    for i in range(2,n+1):
        small_primes=(p for p in primes if p*p<=i)
        if all(i%p for p in small_primes):
            yield i
            primes.append(i)

import timeit
print(timeit.timeit("list(get_primes(10**5))",number=5,setup="from __main__ import get_primes")/5.0)
"0.0350940692182945"
print(timeit.timeit("list(get_primes2(10**5))",number=5,setup="from __main__ import get_primes2")/5.0)
"8.226938898658908"
robert king
  • 14,265
  • 8
  • 84
  • 107
  • which imp is `get_primes` referring to here? – wim Feb 16 '12 at 03:22
  • the one in my question. If you add if p*p<=n, you get 0.025. wim, I didn't try your one, i assumed it would be very slow :) – robert king Feb 16 '12 at 04:44
  • 1
    BTW instead of strings you can pass function objects to `timeit`, it's quicker to write because you can omit "setup". Apply arguments using `fnuctools.partial` or a lambda expression. – Kos Jun 28 '12 at 04:29