2

I'm trying to implement prime number generator using Sieve of Eratosthenes algorithm. I do it just to try using recursive iterator merging to implement sifter.

What I do is this:

from itertools import count,islice,groupby
from heapq import merge


def primes3():
    p = 2
    yield p
    sifter = (i*p for i in count(p))
    s = next(sifter)
    for p in count(p+1):
        if p==s: # this p is sieved out
            print('s: {}'.format(s))
            s = next(sifter)
        else:
            yield p # this is prime
            print('p: {}'.format(p))
            sifter = (k for k, g in groupby(merge(sifter,(i*p for i in count(p))))) #add this serie to the sifter: p*p, p*(p+1), p*(p+2), ...


print(list(islice(primes3(),10)))

The output is:

p: 3
s: 4
p: 5
p: 6
p: 7
p: 8
p: 9
p: 10
p: 11
s: 12
[2, 3, 5, 6, 7, 8, 9, 10, 11, 13]

The first number to be sieved out is 4. But the next is 12, not 6 as it should be. Why is that? I checked it with the following code:

>>> sifter = (i*2 for i in count(2))
>>> next(sifter)
4
>>> sifter = (k for k, g in groupby(merge(sifter,(i*3 for i in count(3)))))
>>> print(list(islice(sifter,20)))
[6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 26, 27, 28, 30, 32, 33, 34]

So, as we may see, in test conditions sifter yields the correct results.

Where am I making a mistake? I think it must be something trivial that I just don't see.

Community
  • 1
  • 1
ovgolovin
  • 12,091
  • 4
  • 38
  • 75

1 Answers1

2

I have to agree, this stuff can sometimes be very confusing (but a nice puzzle!).

Turns out that your sifter iterator changes when the value of p changes (by the way, I'm using python 2.6.5 to test this, not python 3, so print syntax is a bit different):

>> p = 2
>> sifter = (i*p for i in count(p))
>> for x in range(3):
>>    print next(sifter)
4
6
8
>>> # now lets see what happens when we change p
>>> p = 3
>>> for x in range(3):
>>>     print next(sifter)
15
18
21

So, the i*p part of the iterator is evaluated with the new of p as soon as p has been updated. An p is indeed updated in your mainloop, which is why you don't get the expected result.

There is an easy way to solve this: create a function to generate the sifter iterator such that the iterator isn't bounded to p:

def gensifter(x):
    return (i*x for i in count(x))

and put this in your code (again, I converted it to python 2.6.5):

def primes3():
    p = 2
    yield p
    sifter = gensifter(p) # <-- changed!
    s = next(sifter)
    for p in count(p+1):
        #print '(testing p = %d\ts = %d)' % (p, s)
        if p==s: # this p is sieved out
            print 's:', s
            s = next(sifter)
        else:
            yield p # this is prime
            print 'p:', p
            sifter = (k for k, g in groupby(merge(sifter,gensifter(p)))) # <-- changed!

Let's see the result now:

>>> print list(islice(primes3(), 10))
p: 3
s: 4
p: 5
s: 6
p: 7
s: 8
s: 9
s: 10
p: 11
s: 12
p: 13
s: 14
s: 15
s: 16
p: 17
s: 18
p: 19
s: 20
s: 21
s: 22
p: 23
s: 24
s: 25
s: 26
s: 27
s: 28
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Primenumbers galore!

catchmeifyoutry
  • 6,793
  • 1
  • 26
  • 25
  • Thank you! Without your help I doubt I would find what's wrong! – ovgolovin Aug 23 '12 at 16:53
  • One more question. Is there is another way that I can create local scope for p apart from creating a special function (as function invocation are somewhat slow, as I know of)? – ovgolovin Aug 23 '12 at 16:54
  • 2
    I found another solution in addition to `gensifter` function: `count(p*p,p)` (as the first parameter is a starting value, and the second is the step. – ovgolovin Aug 23 '12 at 17:02
  • Good question, I don't really know what the most efficient way would be, but using an extra function is not that much of an overhead. The real bottleneck in your algorithm is the recursive nesting of iterators over time, which eventually results in a `RuntimeError: maximum recursion depth exceeded`. I got in no time to prime 2221 before this occurred – catchmeifyoutry Aug 23 '12 at 17:02
  • The approach to generate primes I took turned out to be much less efficient than those that have already been implemented here: http://code.activestate.com/recipes/117119/ Also, my solution results in stackoverflow on quite little values (1000 suffices it to cause overflow): http://ideone.com/bIZzn – ovgolovin Aug 23 '12 at 17:06
  • @ovgolovin right, you have a major flow, [explained here](http://stackoverflow.com/a/8871918/849891). You're in a good company too, many do that. The thing is, you should start adding multiples of 3 to your sifter only when 9 is seen in the input; of 5 - when 25, etc. Check this out!. – Will Ness Aug 23 '12 at 17:44