I wrote two functions with almost the same structure:

def prime_gen1(Limit = 10000):
    List = [2,]
    for x in range(3,Limit):
        for y in List:
            if not x%y:
                break
        if not x%y:
            continue
        else:
            List.append(x)
            yield x

def prime_gen2(Limit = 10000):
    from math import floor
    for x in range(3,Limit):
        for y in range(2, floor(x**0.5)+2):
            if not x%y:
                break
        if not x%y:
            continue
        else:
            yield x

>>> from time import time
>>> list(prime_gen1(20000)) == list(prime_gen2(20000))
>>> def time1(number):
    st = time()
    list(prime_gen1(number))
    end = time()
    return end - st

>>> def time2(number):
    st = time()
    list(prime_gen2(number))
    end = time()
    return end - st

Each does the same work as the other, but the latter actually runs much faster. I'm wondering why this happens.

Logically (or perhaps illogically), I expected that checking only against primes would outdo the other approach, which checks against every number up to the square root of the number. But timing showed the opposite: checking against all the numbers is much faster, by a factor of about 5. The gap in performance grows as the limit increases:

>>> time1(200000)
>>> time2(200000)

The second method outperforms the first. What makes the difference?

  • Try using `List = {2}` rather than `List = [2,]` and let us know. – fcracker79 Jan 03 '19 at 11:38
  • Side note: you can just use `int()` to get the floor of a positive float number. Also, testing up to `int(x ** 0.5) + 1` is enough, no need to go to `int(x ** 0.5) + 2`. – Martijn Pieters Jan 03 '19 at 11:44
  • @fcracker79: the list is only traversed, so a `set` does change nothing. – Daniel Jan 03 '19 at 12:02
  • Your first checks each prime by all primes below it, but the second checks each prime by all numbers below its square root. That is a much stronger optimization: primes buy you a log factor, but the square root cutoff saves you a square root factor in complexity. `n^2` is much worse than `(n log n)^1.5` (in `n` primes produced). You can detect this by measuring [empirical orders of growth](http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth) (as `log(t2/t1) / log(n2/n1)`), or by drawing a log-log plot of run times vs problem sizes. – Will Ness Jan 03 '19 at 13:09
  • @Daniel if sets are enumerated in no particular order, then using a set there is actually worse, as any arbitrary number is much more likely to have a smaller prime factor, so factors must be tested in increasing order. – Will Ness Jan 03 '19 at 13:13
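
The empirical order of growth mentioned in the comments can be estimated with a short script. This is only a sketch: the helper names (`count_checks_primes`, `count_checks_sqrt`, `empirical_order`) are made up for illustration, and it counts modulo checks instead of wall-clock time so that the result is reproducible. The problem size `n` is taken to be the number of primes found, following the comment.

```python
from math import log

def count_checks_primes(limit):
    """Trial division by every prime found so far (no square-root cutoff)."""
    primes = [2]
    checks = 0
    for x in range(3, limit):
        for p in primes:
            checks += 1
            if x % p == 0:
                break
        else:  # no divisor found, so x is prime
            primes.append(x)
    return len(primes), checks

def count_checks_sqrt(limit):
    """Trial division by every integer from 2 up to the floored square root."""
    primes_found = 1  # accounts for 2
    checks = 0
    for x in range(3, limit):
        for y in range(2, int(x ** 0.5) + 1):
            checks += 1
            if x % y == 0:
                break
        else:
            primes_found += 1
    return primes_found, checks

def empirical_order(counter, limit1, limit2):
    """Estimate b in cost ~ n**b as log(c2/c1) / log(n2/n1)."""
    n1, c1 = counter(limit1)
    n2, c2 = counter(limit2)
    return log(c2 / c1) / log(n2 / n1)

order_primes = empirical_order(count_checks_primes, 2000, 8000)
order_sqrt = empirical_order(count_checks_sqrt, 2000, 8000)
print(order_primes, order_sqrt)
```

A larger exponent for the all-primes variant than for the square-root variant is what the comment predicts; the exact values depend on the two limits chosen.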

2 Answers


The list version does far more checks than the version that only goes up to the square root of the number.

For a limit of 200000, the square root is ~447, while the generator finds 17983 primes below 200000.

Just add a counter for how many times you do the `x%y` check, like this:

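def prime_gen1(Limit = 10000):
    List = [2,]
    modulo_checks = 0
    for x in range(3,Limit):
        for y in List:
            modulo_checks += 1
            if not x%y:
                break
        if not x%y:
            continue
        else:
            List.append(x)
            yield x
    print(modulo_checks)  # total x%y checks, printed once the generator is exhausted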

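def prime_gen2(Limit = 10000):
    from math import floor
    modulo_checks = 0
    for x in range(3,Limit):
        for y in range(2, floor(x**0.5)+2):
            modulo_checks += 1
            if not x%y:
                break
        if not x%y:
            continue
        else:
            yield x
    print(modulo_checks)  # total x%y checks, printed once the generator is exhausted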

Now, for a limit of 200000, version 1 does 162416226 checks and version 2 does 7185445.

If you add an early break to the loop over the list, the list version becomes significantly faster, about twice as fast: 1799767 checks in 0.24s versus 7185445 checks in 0.64s.

    sq_root = floor(x ** 0.5) + 2
    for y in List:
        modulo_checks += 1
        if not x % y or y > sq_root:
            break

And remove the `math` import from the function body if you want to compare algorithm speeds.
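
For reference, the early-break fragment above could be assembled into a complete generator roughly as follows. This is a sketch only: the name `prime_gen1_early` is made up for illustration, the counter is left out, and the import is placed at module level.

```python
from math import floor

def prime_gen1_early(limit=10000):
    """prime_gen1 with an early exit once the prime being tried exceeds the cutoff."""
    primes = [2]
    for x in range(3, limit):
        sq_root = floor(x ** 0.5) + 2
        for y in primes:
            # stop on a divisor, or once y is past the square-root cutoff
            if not x % y or y > sq_root:
                break
        if not x % y:
            continue  # the loop stopped on a divisor, so x is composite
        primes.append(x)
        yield x

print(list(prime_gen1_early(30)))
```

Like the original generators, this one never yields 2, so its output can be compared directly with `prime_gen1` and `prime_gen2`.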

  • Rather than use `math.floor()` you can use `int()`, and the limit can be set to `int(x ** 0.5) + 1`; the largest possible divider we care about is the floored square root, not the next number up. – Martijn Pieters Jan 03 '19 at 12:13
  • Next, you can remove the `if not x%y: continue` lines altogether, and attach the `else:` block to the `for` loop. It'll execute only when the `for` loop didn't see a `break`. – Martijn Pieters Jan 03 '19 at 12:14
  • Yes, but that is off-topic. The code style is also not correct, refactoring could be done, a prime sieve algorithm is faster, etc. The question was simply "why is this code faster than this one", so I try to make the smallest possible change to the original to answer it, as renaming and refactoring can be distracting, especially for inexperienced programmers. – Derte Trdelnik Jan 03 '19 at 12:16
  • Sure, I'm just pointing out there are more (smaller) inefficiencies that Python syntax could help out with. – Martijn Pieters Jan 03 '19 at 12:19
  • no worries, I just tried to explain why I will not incorporate your suggestions into my answer – Derte Trdelnik Jan 03 '19 at 12:30
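
The suggestions in these comments (`int()` instead of `floor()`, an upper bound of `int(x ** 0.5) + 1`, and a `for`/`else` in place of the repeated `if not x%y` test) could be combined roughly like this. It is a sketch, and the name `prime_gen2_forelse` is made up for illustration.

```python
def prime_gen2_forelse(limit=10000):
    """Trial division up to the floored square root, using for/else."""
    for x in range(3, limit):
        for y in range(2, int(x ** 0.5) + 1):
            if x % y == 0:
                break  # divisor found, x is composite
        else:
            # runs only when the loop finished without hitting break
            yield x

print(list(prime_gen2_forelse(30)))
```

The `else` clause on a `for` loop is skipped whenever the loop exits through `break`, which is why the second modulo test is no longer needed.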

Some better timings

%timeit list(prime_gen1(10**5))
2.77 s ± 204 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit list(prime_gen2(10**5))
219 ms ± 10.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

You take a number of optimisation steps in your second algorithm but you don't in your first: here are some problems with your first algorithm.

def prime_gen1(Limit = 10000):
    List = [2,]
    for x in range(3,Limit):  # we're checking all even numbers too
        for y in List:  # only need to check up to sqrt(x)!!
            if not x%y:
                break
        if not x%y:  # why are we checking again? Use a for-else construct
            continue
        else:
            List.append(x)  # just return the list at the end
            yield x  # when wrapped in list just copies List

Here is an optimised version of the first algorithm (not a generator because a generator from a list is just pointless):

def memeff_primelist(n):
    if n <= 2:
        return []
    primes = []  # Add 2 in at the end, don't need to check non-even
    for i in range(3, int(n), 2):
        for p in primes:
            if i % p == 0:  # non-prime
                break
            if p * p > i:  # no factors bigger than sqrt(i)!!!
                primes.append(i)
                break
        else:
            primes.append(i)  # only for i == 3
    primes.insert(0, 2)
    return primes

%timeit memeff_primelist(10**5)
88.9 ms ± 16.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
  • See [Fastest way to list all primes below N](//stackoverflow.com/q/2068372) for better timings still; [Robert William Hanks' `primes2()`](https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n/3035188#3035188) can beat your version by a wide margin (your sieve comes to 75.7 ms ± 1.61 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) on my machine, and `primes2()` makes it 3.07 ms ± 84.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each), 20 times faster). The `numpy` versions measure in the microseconds. – Martijn Pieters Jan 03 '19 at 12:21
  • sqrt is an optimization, yes, but using the Sieve of Eratosthenes instead of Trial Division is a completely different algorithm. – Will Ness Jan 03 '19 at 12:24
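
To illustrate the distinction drawn in the last comment, here is a minimal Sieve of Eratosthenes. It is a sketch under its own assumptions: the name `sieve_primes` is made up, and this is not the `primes2()` function linked above. Instead of testing each candidate for divisibility, it crosses out the multiples of each prime.

```python
def sieve_primes(n):
    """Return all primes below n using a basic Sieve of Eratosthenes."""
    if n <= 2:
        return []
    is_prime = bytearray([1]) * n  # is_prime[i] == 1 means "i not crossed out yet"
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # cross out i*i, i*i + i, ... ; smaller multiples were handled earlier
            is_prime[i * i::i] = bytes(len(range(i * i, n, i)))
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve_primes(30))
```

Unlike the generators in the question, this function includes 2 in its result.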