6

I'm solving some problems on Project Euler and had to generate 2 million primes to solve a problem. My implementation of the Sieve of Eratosthenes turned out to be EXTREMELY slow, but I don't quite know why. Could someone please explain the major problems with this implementation? I thought it was so pretty, and then I figured out that it is utterly terrible :(. I found another implementation of it online and it was so much faster than mine.

def generatePrimes(upperBound):
    numbers = range(2, upperBound + 1)
    primes = []

    while numbers:
        prime = numbers[0]
        primes.append(prime)
        # Keep only the numbers not divisible by the prime just found.
        numbers = filter(lambda x: x % prime, numbers)
    return primes

EDIT: Thanks for all the answers! The conclusion is that the filter is the problem, because it goes through every element (instead of only those that are to be marked as not prime) and because it creates a new list every time. I rewrote it with good old for loops and one round of filtering, and it works much faster. New code:

def generatePrimes(upperBound):
    numbers = range(2, upperBound + 1)

    for i in xrange(len(numbers)):
        if numbers[i] != 0:
            # Zero out every multiple of numbers[i] further along the list.
            for j in xrange(i + numbers[i], len(numbers), numbers[i]):
                numbers[j] = 0

    primes = filter(lambda x: x, numbers)
    return primes
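
A quick sanity check of the edited function (Python 2, since it relies on `xrange` and a list-returning `filter`):

print generatePrimes(30)
# prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]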
Alex
  • is this python2 or python3? – Adam Smith May 29 '15 at 20:47
  • 10
    For one thing, it's not the Sieve of Eratosthenes. It's [trial division](http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#Trial_division). – Fred Larson May 29 '15 at 20:49
  • I suspect it's a sieve, first of all – therealprashant May 29 '15 at 20:51
  • If you want a sieve http://stackoverflow.com/a/23423821/2141635 – Padraic Cunningham May 29 '15 at 20:51
  • And if I had to suggest an improvement, it would be in the filter step, because we don't need to access every element of numbers, only a few – therealprashant May 29 '15 at 20:52
  • 2
    I think this technically is a sieve, but you're performing large allocations on each iteration with the filter function. Most sieves are optimized to perform only one large array allocation, while yours performs O(n) allocations of sizes starting at upperBound. – dustyrockpyle May 29 '15 at 21:01
  • 2
    @dustyrockpyle: No, there's a significant difference between this and the Sieve of Eratosthenes. The SoE processes each number once when the outer loop reaches it and once for every prime in its prime factorization; this function processes a number once for every prime up to its first prime factor. That turns out to look at every number a lot more times, on average. – user2357112 supports Monica May 29 '15 at 21:13
  • @PadraicCunningham it's been answered here many times, my own favorite is http://stackoverflow.com/a/9302299/5987 – Mark Ransom May 29 '15 at 21:17
  • 1
    One thing that would help is to break out of the loop once the prime you're testing is greater than sqrt(upperBound), because anything remaining is guaranteed to be prime. – Mark Ransom May 29 '15 at 21:27
  • @MarkRansom, nice, never thought of using sets. – Padraic Cunningham May 29 '15 at 22:23
  • Oh yes, of course it is the filter. I'm assigning new lists all the time. I looked at my C++ implementation of this using good old loops and it is real fast. I guess this is what happens when you try to write in a semi-functional style just because it looks nicer ;) – Alex May 29 '15 at 22:38
  • many good solutions at: [Fastest way to list all primes below N](http://stackoverflow.com/q/2068372/4279) – jfs May 30 '15 at 17:11
  • just stop at the square root, it'll be fine then. – Will Ness Jun 19 '15 at 14:58
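
As Mark Ransom and Will Ness suggest in the comments above, the marking loop only needs to run while the current prime is at most sqrt(upperBound): every unmarked number left after that point is already prime. A minimal sketch of that cutoff applied to the edited code (a hypothetical variant, not from the thread; Python 2):

def generatePrimesWithCutoff(upperBound):
    numbers = range(2, upperBound + 1)

    for i in xrange(len(numbers)):
        prime = numbers[i]
        if prime == 0:
            continue   # already marked composite
        if prime * prime > upperBound:
            break      # every remaining non-zero entry is prime
        for j in xrange(i + prime, len(numbers), prime):
            numbers[j] = 0

    return filter(lambda x: x, numbers)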

2 Answers

6

The Sieve of Eratosthenes looks like this:

def sieve(n):
    primality_flags = [True] * (n + 1)
    primality_flags[0] = primality_flags[1] = False
    primes = []
    for i, flag in enumerate(primality_flags):
        if flag:
            primes.append(i)
            # Mark every multiple of i (from 2*i up to n) as composite.
            for j in xrange(2 * i, n + 1, i):
                primality_flags[j] = False
    return primes

It processes each number once when the outer loop reaches it, and once for every prime that divides it. About 1/2 the numbers are divisible by 2, about 1/3 are divisible by 3, and so on; asymptotically speaking, the average number of times each number will be processed is 1 + the sum of the reciprocals of the primes up to n. This sum is about log(log(n)), so the sieve has asymptotic time complexity O(n*log(log(n))), assuming arithmetic is constant time. This is really good.


Your function doesn't do that. Your filter goes over every element in numbers, regardless of whether it's divisible by prime. Each element is processed for every prime up until the first prime that divides it, and processing the prime p removes about 1/p of the elements of numbers. Letting the sequence of primes be p[0], p[1], p[2], etc., and letting the sequence of sizes of numbers be n[0], n[1], n[2], etc., we have the following approximate recurrence:

n[0] = upperBound - 1
n[1] = n[0] * (p[0]-1)/p[0]
n[2] = n[1] * (p[1]-1)/p[1]
...
n[k+1] = n[k] * (p[k]-1)/p[k]

and your algorithm takes time roughly proportional to the sum of the n values up until numbers is empty. I haven't analyzed the behavior of that series, but calculations show the growth is much worse than O(n*log(log(n))). (EDIT: An analysis I didn't come up with when composing this answer says it's O((n/log(n))^2).)
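
As a rough empirical check of that sum, the following hypothetical helper (not part of the original answer) counts how many elements the filter passes touch in total and compares that against n*log(log(n)):

from math import log

def filter_work(upper_bound):
    # Total number of elements examined across all filter passes
    # of the question's algorithm.
    numbers = range(2, upper_bound + 1)
    total = 0
    while numbers:
        total += len(numbers)   # each pass touches every remaining element
        prime = numbers[0]
        numbers = [x for x in numbers if x % prime]
    return total

for n in (1000, 5000, 20000):
    print n, filter_work(n), int(n * log(log(n)))

The middle column (actual work) grows far faster than the right-hand column (n*log(log(n))), matching the analysis above.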

user2357112 supports Monica
  • you could use `yield i` to avoid thinking about the time complexity of `primes.append(i)` – jfs May 30 '15 at 17:05
  • @J.F.Sebastian: As far as I can see, it wouldn't help. `yield` and `append` have the same amortized time complexity. – user2357112 supports Monica May 30 '15 at 18:35
  • 1
    it is one of those cases when it is said: *"In theory there is no difference between theory and practice. In practice there is."* – jfs May 30 '15 at 18:54
  • it's all in the *square root*. If you analyse your progression, summed, for the two cases (working up to an upper limit `m` or only up to `sqrt(m)`), the [empirical orders of growth](http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth) are: for m = 1K...100K, m^1.64...1.80; vs. for m = 1K...100K...1mln, m^1.17...1.27...1.34. – Will Ness Jun 19 '15 at 14:23
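
For reference, a sketch of the `yield`-based variant jfs suggests in the comments (same asymptotic cost, as noted in the discussion, but it produces primes lazily instead of building the full list first):

def sieve_gen(n):
    primality_flags = [True] * (n + 1)
    primality_flags[0] = primality_flags[1] = False
    for i in xrange(n + 1):
        if primality_flags[i]:
            yield i   # emit the prime instead of appending it to a list
            for j in xrange(2 * i, n + 1, i):
                primality_flags[j] = False

list(sieve_gen(30)) then evaluates to [2, 3, 5, 7, 11, 13, 17, 19, 23, 29].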
2

Running cProfile shows that most of the time is spent in the filter. Replacing the filter with a list comprehension speeds things up by about a factor of 2.

numbers = [n for n in numbers if n%prime != 0]

But this doesn't really fix the main problem, which is that you are recreating the list of numbers with each iteration, and that is slow. The faster implementations (e.g. http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d) just mark the non-primes by replacing them with 0 or similar.
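
To reproduce the cProfile measurement mentioned above, a minimal run (assuming the question's original `generatePrimes` is defined at module level):

import cProfile
cProfile.run('generatePrimes(10000)')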

James K
  • 1
    if you see `n%prime` in the code, it is not the Sieve of Eratosthenes. – jfs May 30 '15 at 17:07
  • agreed. It is quicker to "generate the composites than to test for them" (quoting wikipedia) – James K May 30 '15 at 18:27
  • @JamesK yes, *because* you generate each composite just from its prime factors (below its sqrt), but when testing, each candidate is tested by all the primes (below its sqrt), not just its *prime factors*. – Will Ness Dec 31 '18 at 08:46