0

Problem: While the following code works, it takes far too long to be of any use in finding the sum of all primes below the number 2,000,000.

Past attempts: I've attempted implementing while loops, counters, and a number of other tools to modify the code, but they end up modifying my results as well. Previously, I'd simply added the numbers to an existing variable instead of appending them to a list, but the result was the same.

I believe that a generator function/expression will solve the problem, but I've had trouble implementing either the function, expression, or both.

# Prime number determiner
def is_prime(x):
    for i in range(2, x - 1):
        if x % i == 0:
            return False
    return True

# Summing all prime numbers between 2 and 2,000,000
primes = []
for i in range(2, 2000000):
    if is_prime(i):
        primes.append(i)
result = sum(primes)
print(result)

Previous attempt at generator expressions/functions:

#Generator version of above
def is_prime_gen(x):
     yield (i for i in range(2, x-1) if x % i == 0)
sum_prime += (j for j in range(2, 2000000) if is_prime_gen(j))

Expected results: I don't need the result to process super-fast, but I would like it to process within a minute or two.

Bonus: For anyone responding, it would be helpful to me if you could also explain how you came to your conclusions (and while "experience" is a valid explanation, it isn't a helpful one).

cdlane
  • 33,404
  • 4
  • 23
  • 63
alofgran
  • 347
  • 4
  • 14
  • I don't know that your question is a duplicate exactly, but there's a great list of prime generating functions [in this previous question](https://stackoverflow.com/q/2068372/1405065). – Blckknght Aug 30 '19 at 00:36
  • @Blckknght - ya, I saw that one before I posted my question. I think it's helpful when it comes to optimization in a general sense, but I didn't see anywhere in the post that utilizes generators in the solution. Maybe it's a matter of my misunderstanding of generators, but this seems to be a perfect situation that could benefit from their implementation. – alofgran Aug 30 '19 at 00:50
  • I think you need to rephrase your question if you want emphasis on generator functions for some reason. Your question puts emphasis on performance (since your current code is too slow), and the first code block you show doesn't use a generator function at all. – Blckknght Aug 30 '19 at 01:16
  • Updated @Blckknght. – alofgran Aug 30 '19 at 01:21
  • 1
    Wanting to get the performance down to a couple of minutes is not very ambitious. I don't see why this should take any more than a few seconds by using a sieve-based method. Generators don't seem terribly relevant. The lack of generators is not the reason that your code is lagging. – John Coleman Aug 30 '19 at 01:32
  • @John Coleman - if you actually ran the code above on a rig like mine, you would see that reducing code that takes 30-40 minutes to a single minute is worthwhile. It would also be helpful if you actually provided some support instead of criticism only. I've looked at the sieve-based method, but that's not my question. If generators aren't relevant, read my other comments and you'll learn more about my understanding of generators. If I misunderstand them, then correct me. Leaving things at 'the lack of generators is not the reason...' is barely worth stating (esp. at this point). – alofgran Aug 30 '19 at 05:21

8 Answers

2

Your focus on making a generator function is an example of the XY problem. You've decided that the solution to your code's performance problems is to use a generator, but that's not actually correct. When you get non-generator-related answers, you think they're not helpful, and the rest of us are just a bit confused about why generators are relevant in the first place.

Let's examine why you're having performance issues. The main problem is that your code takes O(n) time to determine whether each number n is prime. You have to do this for each number from two up to whatever your limit is. This means the whole algorithm takes O(N**2) time, where N is the largest number to check (e.g. two million). For a large N, your code will take a very long time.
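To make that quadratic growth concrete, here's a hypothetical counter (trial_division_ops is not part of anyone's answer) that mirrors the trial divisions the original loop performs, with the same early exit on the first divisor found:

```python
def trial_division_ops(N):
    """Count the divisions the original is_prime-style loop performs for all x < N."""
    ops = 0
    for x in range(2, N):
        for i in range(2, x - 1):
            ops += 1
            if x % i == 0:
                break  # mirrors the early `return False` for composites
    return ops

# Doubling N roughly quadruples the total work, since primes p cost p-3
# divisions each and the sum of primes below N grows like N**2 / (2 ln N).
print(trial_division_ops(1000), trial_division_ops(2000))
```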

Using a generator for your primes won't, by itself, improve that. It will still take just as long to figure out if each candidate value is prime, and you still need to check all the same numbers if you stick with your current algorithm. At best it would be as good as adding the prime numbers immediately to a running sum, rather than putting them into a list and summing at the end. That is, it could save you memory, but not time.

The real way to greatly improve your performance is to use a smarter algorithm that does less work. There are a bunch of good ways to find primes in less time. Some can be implemented as generators, but in this situation, computing all the primes at once and using extra memory in exchange for better performance is probably a reasonable trade-off.

That's because your computer can hold billions of integers in memory at one time. Numbers less than a few billion use about 28 bytes each in Python, so two million of them take around 56 MB, plus about 18 MB more for the list data structure. So you can use a memory-intensive algorithm without needing to worry about your memory usage.
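You can check these sizes yourself with `sys.getsizeof` (a quick illustration; the exact byte counts vary a little by platform and Python version, the figures below assume 64-bit CPython):

```python
import sys

# One smallish Python int is ~28 bytes on 64-bit CPython.
int_size = sys.getsizeof(1_000_000)
print(int_size)

# A list adds roughly 8 bytes per slot (a pointer) on top of the ints,
# so two million entries cost ~16 MB of list structure alone.
lst = list(range(2_000_000))
print(sys.getsizeof(lst))
```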

Here's a very fast implementation of the Sieve of Eratosthenes algorithm for computing all of the primes less than N in pure Python. The implementation was originally by Robert Williams Hanks in this answer, but this version was tweaked a bit by Bruno Astrolino to work a little more efficiently in Python 3.6+ in this answer.

from itertools import compress

def rwh_primes1v1(n):
    """ Returns  a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

You would want to run sum(rwh_primes1v1(2_000_000)). On my computer that takes about 30 milliseconds, compared to your code which takes 30 seconds (1000 times longer) for N=100_000 (a bound twenty times less). I wasn't willing to wait for the three hours or so the inefficient algorithm would need for N=2_000_000.
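If you want a sanity check that is independent of any particular implementation, a plain boolean-list sieve (simple_sieve_sum is a hypothetical helper, slower than the bytearray version above but easy to audit) reproduces the well-known Project Euler #10 answer for two million, 142913828922, and `timeit` is less error-prone for benchmarking than manual clock calls:

```python
from timeit import timeit

def simple_sieve_sum(n):
    """Sum all primes < n with a straightforward boolean-list sieve."""
    sieve = [True] * n
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Knock out every multiple of i starting at i*i.
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return sum(i for i, is_p in enumerate(sieve) if is_p)

print(simple_sieve_sum(2_000_000))  # 142913828922
print(timeit(lambda: simple_sieve_sum(2_000_000), number=1))  # elapsed seconds
```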

Note that if you really do want a generator that yields the primes for some other reason, there are some good implementations of infinite prime generators in the answers to this question. It's unlikely that using any of them for your summing problem is going to result in faster code than what I provided above until you get to such a large N that you can't fit the whole sieve in memory at once (and only some of the generators will help with that, some have significant memory overheads themselves).

Blckknght
  • 85,872
  • 10
  • 104
  • 150
  • I think that this should be the accepted answer. It doesn't simply ignore the generator part of the question but instead explains how it is somewhat misplaced, and then gives a solution which blows the competitors (including my hybrid sieve/generator) out of the water. – John Coleman Aug 30 '19 at 11:15
  • @Blckknght I can see that I was probably heading into 'XY' territory. Until your answer though, I wasn't seeing answers citing why a generator wouldn't be as beneficial as I thought it would be (with respect to the documentation that seemed to show otherwise). I appreciate your thorough explanation and agree with John Coleman. – alofgran Aug 30 '19 at 17:08
1

I think the key problem is how to find all the prime numbers quickly and correctly, and there are many answers about that. I found one as follows:

def isprime(n):
    """Returns True if n is prime."""
    if n == 2:
        return True
    if n == 3:
        return True
    if n % 2 == 0:
        return False
    if n % 3 == 0:
        return False

    i = 5
    w = 2

    while i * i <= n:
        if n % i == 0:
            return False

        i += w
        w = 6 - w

    return True

total = 0
for n in range(2, 20000):
    if isprime(n):
        total += n
print(total)

For range(2, 10000), the time cost is:

0.0043639220002660295  # this answer
0.25401434600007633  # your answer

For range(2, 100000), the time cost is:

0.1730230279999887  # this answer
19.639503588000025  # your answer
ToughMind
  • 767
  • 4
  • 21
  • thanks for the suggestion. While I agree that there are many alternative solutions, my question is specifically with regard to the implementation of generators. – alofgran Aug 30 '19 at 01:05
1
import time

def is_prime(num):
    if num == 2:
        return True
    if num == 3:
        return True
    if num % 2 == 0:
        return False
    if num % 3 == 0:
        return False

    i = 5
    w = 2

    while i * i <= num:
        if num % i == 0:
            return False

        i += w
        w = 6 - w

    return True

start = time.perf_counter()
prime = (i for i in range(2, 2000000) if is_prime(i))
print(sum(prime))
print(time.perf_counter() - start)

I'm no expert, but I think this should work and be quite simple to understand. I used the improved function that ToughMind shared. It takes my system 15.5 seconds to calculate the sum.

DarkLeader
  • 533
  • 5
  • 13
  • That makes sense, and this solution does speed things up. Any ideas on how to generator-ify the rest of the code, in the form of what I have in my question above but, unlike my original code, in a way that actually works? – alofgran Aug 30 '19 at 01:31
1

The answer to this query comes down to what you mean by optimize. A generator can be used to optimize space usage. Where you waste space is in this logic in your main code:

primes.append(i)

Your is_prime() function doesn't waste space. Generators only save time when the sequence computation can be aborted early, rather than completely created and then partially used. That isn't the case here.

Here's a simple rework that cleans up your is_prime() implementation time-wise and uses a generator expression to avoid creating a list of primes:

def is_prime(number):
    if number <= 2 or number % 2 == 0:
        return number == 2

    for divisor in range(3, int(number ** 0.5) + 1, 2):
        if number % divisor == 0:
            return False

    return True

result = sum(number for number in range(2, 2_000_000) if is_prime(number))

print(result)

This completes the task in about 10 seconds, well within your minute or two limit, and doesn't take much code. It isn't optimal time-wise, just better time-wise, and reasonably optimal space-wise.
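For contrast, here is the early-abort case where a generator's laziness does pay off (a standalone illustration, not part of the answer's solution; it reuses the same is_prime as the rework above so it runs on its own). Only the first few primes past a bound are wanted, so candidates beyond the third hit are never tested at all:

```python
from itertools import islice

def is_prime(number):
    if number <= 2 or number % 2 == 0:
        return number == 2
    for divisor in range(3, int(number ** 0.5) + 1, 2):
        if number % divisor == 0:
            return False
    return True

# islice stops pulling from the generator after 3 hits, so the
# remaining ~999,997 candidates are never examined.
lazy_primes = (n for n in range(1_000_000, 2_000_000) if is_prime(n))
first_three = list(islice(lazy_primes, 3))
print(first_three)
```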

REVISIT

There is another way a generator can provide time improvement beyond what I describe above. Unlike is_prime(), which can be passed any number at any time, a generator can guarantee that it will be working with ascending numbers so it could make simplifying assumptions. Similarly, it can maintain state between calls, unlike is_prime() as implemented. Let's rework this problem by generating the primes:

def prime_generator(limit):
    yield 2

    number = 3

    while number <= limit:
        for divisor in range(3, int(number ** 0.5) + 1, 2):
            if number % divisor == 0:
                break
        else:  # no break
            yield number

        number += 2

print(sum(prime_generator(2_000_000)))

Playing with various arrangements of this approach, it provides a 5% speed up at best over my original solution.

THE SIEVE

Finally, let's solve this problem using a sieve. This approach uses more space than the solutions above to gain performance time-wise:

def sum_primes(limit):  # assumes limit > 1
    sieve = [False, False, True] + [True, False] * ((limit - 1) // 2)
    number = 3
    result = 2

    while number <= limit:
        if sieve[number]:
            result += number

            for i in range(number * number, limit + 1, number):
                sieve[i] = False
        number += 2

    return result

print(sum_primes(2_000_000))

This sums the primes in less than 1 second on my system. It's 15x faster than the previous generator-based solution.

cdlane
  • 33,404
  • 4
  • 23
  • 63
  • My understanding of generators is that they effectively put a portion of the loop on hold while yielding a result to be utilized. So if I'm attempting to cycle through a list of 2 mil. prime #s, and use each number in a calculation (i.e. sum), this would seem an ideal opportunity. I agree with you that the list option I presented is not the best (just happened to be what I last implemented - hence why I mentioned the alternatives I'd used). What I'm understanding by your solution, is that the two if statements and generator statement above would be faster than two generator statements? – alofgran Aug 30 '19 at 01:38
  • @onhamae, it may be an opportunity for using a generator, but it won't necessarily be optimized by the use of a generator. I threw in a generator expression where it would do some good. I did not convert the function to a generator as it wouldn't make a difference. If you really want to optimize the code, use a *sieve* instead. – cdlane Aug 30 '19 at 01:45
  • There is a generator in this code, though it's not a function. The expression `number for number in range(2, 2_000_000) if is_prime(number)` is a generator expression (though you'd need to add parentheses if you were declaring it at top level, not as the only argument to a function call). – Blckknght Aug 30 '19 at 04:34
  • In case anyone needed documentation, [here](https://www.python.org/dev/peps/pep-0289/) is part of my rationale for focusing on generators (expressions, in particular). Cited benefits are memory conservation, "particularly with the use of functions like sum()...that reduce an iterable input to a single value". – alofgran Aug 30 '19 at 05:38
  • 1
    @onhamae, I've added a couple of additional solutions to my answer -- a generator-based one along the lines I believe you desired, and a sieve-based one for speed comparison. – cdlane Aug 30 '19 at 06:25
1

Here is a generator which uses a hybrid bootstrap approach. It uses a (not particularly efficient) sieve to identify the primes below the square root, storing them as it yields them, and then uses these for trial division of the remaining odd numbers below n. For n = 2_000_000, it never stores more than around 700 numbers, so it has a smallish memory footprint (at the cost of more processing time):

import math

def primes(n):
    k = 1 + int(math.sqrt(n))
    #phase 1: sieve to k
    if n >= 2:
        yield 2
        small_primes = [2]
        candidates = [2*i + 1 for i in range(1,(k+1)//2)]
        while len(candidates) > 0:
            p = candidates[0]
            small_primes.append(p)
            candidates = [x for x in candidates if x % p != 0]
            yield p
    #at this stage we have all primes below k
    #loop through remaining odd numbers
    #dividing by these primes
    if k%2 == 0: k +=1
    while k <= n:
        if all(k%p != 0 for p in small_primes): yield k
        k += 2

I didn't bother to time it carefully, but sum(primes(2_000_000)) takes around 3 seconds. The reason I didn't is that I didn't want to embarrass it when compared to the code of Blckknght -- which shows just how fast a non-generator optimized sieve approach can be.

John Coleman
  • 46,420
  • 6
  • 44
  • 103
1

Here is a very fast pure Python prime generator created by Willy Good, found in a comment here. This may be overkill in performance and complexity for your particular use case, but I don't think many of the people answering Python prime questions on Stack Overflow are aware of it.

def primes235(limit):
    yield 2; yield 3; yield 5
    if limit < 7: return
    modPrms = [7,11,13,17,19,23,29,31]
    gaps = [4,2,4,2,4,6,2,6,4,2,4,2,4,6,2,6] # 2 loops for overflow
    ndxs = [0,0,0,0,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5,5,5,6,6,7,7,7,7,7,7]
    lmtbf = (limit + 23) // 30 * 8 - 1 # integral number of wheels rounded up
    lmtsqrt = (int(limit ** 0.5) - 7)
    lmtsqrt = lmtsqrt // 30 * 8 + ndxs[lmtsqrt % 30] # round down on the wheel
    buf = [True] * (lmtbf + 1)
    for i in range(lmtsqrt + 1):
        if buf[i]:
            ci = i & 7; p = 30 * (i >> 3) + modPrms[ci]
            s = p * p - 7; p8 = p << 3
            for j in range(8):
                c = s // 30 * 8 + ndxs[s % 30]
                buf[c::p8] = [False] * ((lmtbf - c) // p8 + 1)
                s += p * gaps[ci]; ci += 1
    for i in range(lmtbf - 6 + (ndxs[(limit - 7) % 30])): # adjust for extras
        if buf[i]: yield (30 * (i >> 3) + modPrms[i & 7])

A speed comparison to Robert William Hanks' best pure Python solution, which is more compact and easier to understand:

$ time ./prime_rwh2.py 1e7
664579 primes found < 1e7

real    0m0.883s
user    0m0.266s
sys     0m0.047s
$ time ./prime_wheel.py 1e7
664579 primes found < 1e7

real    0m0.285s
user    0m0.234s
sys     0m0.063s

Willy Good's solution is a mod 30 wheel sieve which avoids using/storing multiples of 2, 3, and 5 except to manually yield them to make it complete. It works great for me up to about 2.5e9 where the 8G of RAM in my laptop is totally used up and the system thrashes.
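The reason the wheel needs exactly 8 flags per 30 numbers: any prime above 5 is coprime to 30 = 2·3·5, so mod 30 it can only land on one of eight residues. A standalone check (not part of the sieve above; note the sieve's modPrms list uses 31 in place of 1 because its wheel starts counting at 7):

```python
from math import gcd

# The eight residues mod 30 that are coprime to 2, 3 and 5 -- the only
# positions where a prime > 5 can land, hence 8 sieve flags per wheel.
residues = [r for r in range(1, 30) if gcd(r, 30) == 1]
print(residues)  # [1, 7, 11, 13, 17, 19, 23, 29]
```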

Greg Ames
  • 46
  • 4
0

The result of get_sum_of_primes_in_range(0, constant_max_value) will be a constant, which can be pre-computed.

The result of get_sum_of_primes_in_range(0, n + x) can be computed as get_sum_of_primes_in_range(0, n) + get_sum_of_primes_in_range(n, n + x).

By combining these things; you can have a table of pre-computed results for selected values of n and only use processing time to find the get_sum_of_primes_in_range(n, x) part.

Basically; instead of doing get_sum_of_primes_in_range(0, x); you can do k = x / 100 and n = k * 100, result = table[k] + get_sum_of_primes_in_range(n, x) and skip a massive amount of work; where the amount of work you expect to be able to skip (on average) depends on how large you want to make that table of pre-computed results.
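A minimal sketch of that table idea (all names here -- primes_below, table, sum_primes_upto -- are hypothetical, the chunk size of 100 follows the answer's example, and the primality flags come from a simple sieve built once up front):

```python
def primes_below(n):
    """Return a primality-flag list for 0..n-1 via a simple sieve."""
    sieve = [True] * n
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return sieve

CHUNK = 100
LIMIT = 10_000
flags = primes_below(LIMIT)

# table[k] = sum of all primes below k * CHUNK, precomputed once.
table = [0]
running = 0
for k in range(1, LIMIT // CHUNK + 1):
    running += sum(i for i in range((k - 1) * CHUNK, k * CHUNK) if flags[i])
    table.append(running)

def sum_primes_upto(x):
    """Table lookup covers the bulk; only the short tail needs work."""
    k = x // CHUNK
    tail = sum(i for i in range(k * CHUNK, x) if flags[i])
    return table[k] + tail

print(sum_primes_upto(2_000))
```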

For get_sum_of_primes_in_range(n, x) you want something based on "Sieve of Eratosthenes" (see https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes ). Note that Sieve of Eratosthenes can be started from an arbitrary value using modulo and needn't start from 0.

Brendan
  • 26,293
  • 1
  • 28
  • 50
  • This seems to be describing an answer to a different question than what was asked here. There's no need to compute the sums of some subset of the primes, just all primes up to a (maybe constant?) number. – Blckknght Aug 30 '19 at 04:37
  • So, plugging in some real numbers, here's what I'm understanding: If I'm looking to get the sum of primes in a range of 02 - 2 mil., then k = 2000000/100, and n = 2000000? Is my math off? Because that means something doesn't seem to be adding up... – alofgran Aug 30 '19 at 05:30
  • @onhamae: looks right to me - in that case you'd use the table to find `get_sum_of_primes_in_range(0, 2000000)` instead of doing any real work, then you'd find `get_sum_of_primes_in_range(2000000, 2000000)` which also does nothing; so you'd get the result almost instantly. If you want the sum of primes from 0 to 2000123, then `k = 20001` and `n = 2000100`, so you'd use the table to find `get_sum_of_primes_in_range(0, 2000100)` with no real work, then do `get_sum_of_primes_in_range(2000100, 2000123)`, which is only 23 values, so it won't be quite instant (but won't be slow either). – Brendan Aug 30 '19 at 10:56
  • @onhamae: Of course if Blckknght is correct and you only care about finding the sum of primes up to 2000000, then you only need a single `print precomputed_constant` and it will always be almost instant (even if you use your existing code to determine the constant once and "cut & paste" the value). – Brendan Aug 30 '19 at 11:04
-1

First, I have a math background. Second, this uses Fermat's Little Theorem (though I am unsure of the name; I had forgotten it). I just Googled and spent a lot of time coding and debugging. Here it is!

def is_prime():
    a = int(input("Enter any number: "))

    for p in range(2, a + 1):
        if a % p == 0:
            isprime = 1
            for j in range(2, p // 2 + 1):
                if p % j == 0:
                    isprime = 0
                    break

            if isprime == 1:
                print(" %d is a Prime Factor of %d" % (p, a))

is_prime()

Have a great day!

Ismail Ahmed
  • 41
  • 2
  • 6
  • Does your math background explain why this is not even wrong? It doesn't solve the problem the OP describes. – cdlane Aug 30 '19 at 01:40
  • 1
    How is this Fermat's Little Theorem? It doesn't use modular exponentiation. It just seems to be brute-force trial division algorithm, one that does much more division than needed. – John Coleman Aug 30 '19 at 03:46