I was solving some problems on Project Euler and I wrote identical functions for problem 10...

The thing that amazes me is that the C solution runs in about 4 seconds while the Python solution takes about 283 seconds. I am struggling to explain to myself why the C implementation is so much faster than the Python implementation. What is actually happening to make it so?


#include <stdio.h>
#include <time.h>
#include <math.h>

/* Trial division: try every divisor from round(sqrt(num)) down to 2. */
int is_prime(int num)
{
    int sqrtDiv = lround(sqrt(num));
    while (sqrtDiv > 1) {
        if (num % sqrtDiv == 0) {
            return 0;
        } else {
            sqrtDiv--;
        }
    }
    return 1;
}

int main ()
{
    clock_t start = clock();

    long sum = 0;
    for ( int i = 2; i < 2000000; i++ ) {
        if (is_prime(i)) {
            sum += i;
        }
    }
    printf("Sum of primes below 2,000,000 is: %ld\n", sum);

    clock_t end = clock();
    double time_elapsed_in_seconds = (end - start)/(double)CLOCKS_PER_SEC;
    printf("Finished in %f seconds.\n", time_elapsed_in_seconds);

    return 0;
}


from math import sqrt
import time

def is_prime(num):
    div = round(sqrt(num))
    while div > 1:
        if num % div == 0:
            return False
        div -= 1
    return True

start_time = time.clock()

tsum = 0
for i in range(2, 2000000):
    if is_prime(i):
        tsum += i

print tsum
print('finished in:', time.clock() - start_time, 'seconds')
  • If you're using Python 2.7, `range(2, 2000000)` actually builds an in-memory list of about 2000000 integers. You aren't doing the equivalent in C. Try `xrange()` instead, or switch to Python 3, where `range()` is a lazy iterator. – Akshat Mahajan Jul 19 '16 at 00:20
  • Static type declarations and possibly using a memory-inefficient iterator vs. a generator in python – Dan Jul 19 '16 at 00:22
  • `div` is float in your python code, but `sqrtDiv` is int in your C code. – Paul Hankin Jul 19 '16 at 00:22
  • `round(sqrt(num)) -> int(sqrt(num) + 1)` gives a 2.5x speed increase. I don't think range vs xrange makes any difference in this case. – Paul Hankin Jul 19 '16 at 00:28
  • @PaulHankin After testing an `xrange()` version, I am forced to agree. – Akshat Mahajan Jul 19 '16 at 00:29
  • Could [this answer](http://stackoverflow.com/a/3033379/4520911) be relevant? – iRove Jul 19 '16 at 00:32
  • Fixing the int/float error I mentioned above leaves the python version 21 times slower than the C. That's about in the right ballpark for code that's doing lots of sums on small ints. – Paul Hankin Jul 19 '16 at 00:35
  • `int(round(sqrt(num)))` is the correct replacement, not what I wrote above. – Paul Hankin Jul 19 '16 at 00:38
  • you should really use numpy for numerical code in python. Also why would you expect python to be as fast as C? – s952163 Jul 19 '16 at 00:38
  • @PaulHankin Actually, doing `int(sqrt(num))` is sufficient, as `int` will always end up effectively rounding down, which is what is wanted for primality testing. – Akshat Mahajan Jul 19 '16 at 00:44
  • @AkshatMahajan I don't think it's guaranteed that `math.sqrt(x)` returns an exact int when the input is a square, but perhaps it is. If it's not, then `int(sqrt(p*p))` could evaluate to `p-1`, and cause `p*p` to be identified as prime. – Paul Hankin Jul 19 '16 at 00:53
  • @PaulHankin - Python says " It provides access to the mathematical functions defined by the C standard." As long as the underlying C library conforms to IEEE-754 (which it should), and the output is representable in a double it will. – TLW Jul 19 '16 at 03:43
  • @PaulHankin - Note that even `int(round(sqrt(num)))` can and will break on large numbers. E.g. `int(round(sqrt((10**200))))`. – TLW Jul 19 '16 at 03:47
  • You really should be using numpy arrays instead of lists (better batch performance) and a sieve more like these: http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n?noredirect=1&lq=1 – HAL 9001 Jul 19 '16 at 04:47
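
The int/float point raised in the comments can be sketched as follows. This is a minimal illustration, not the asker's code: it assumes Python 3.8 or later (for `math.isqrt`), and the names `is_prime_float`, `is_prime_int`, and `sum_primes` are made up for the example. The float version forces the divisor to a float to mimic Python 2, where `round()` returns a float; the integer version uses `isqrt`, which stays exact even for very large inputs.

```python
from math import isqrt, sqrt


def is_prime_float(num):
    # Mimics the question's loop under Python 2, where round() returns a float,
    # so every % and -= in the loop is a float operation.
    div = float(round(sqrt(num)))
    while div > 1:
        if num % div == 0:
            return False
        div -= 1
    return True


def is_prime_int(num):
    # Same trial division, but the divisor is an exact integer square root.
    div = isqrt(num)
    while div > 1:
        if num % div == 0:
            return False
        div -= 1
    return True


def sum_primes(limit, test):
    # Sum every n in [2, limit) that the given primality test accepts.
    return sum(n for n in range(2, limit) if test(n))


if __name__ == "__main__":
    print(sum_primes(1000, is_prime_float))
    print(sum_primes(1000, is_prime_int))
```

Both functions should agree on the result; only the arithmetic type inside the loop differs, so timing the two (for example with `timeit`) is one way to check the speed-up reported above on a given interpreter.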

1 Answer


It's CPython (the implementation) that's slow in this case, not necessarily Python the language. CPython has to interpret bytecode, which will almost always be slower than compiled C code; it simply does more work than the equivalent C code. For example, each call to `sqrt` in principle requires looking that function up by name at run time, rather than just calling a known address.
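
One way to see the name lookup described above is to inspect the bytecode with the standard `dis` module. This is a rough sketch: the helper `root` is a made-up function for illustration, and the exact instruction listing varies between CPython versions.

```python
import dis
from math import sqrt


def root(num):
    # 'sqrt' is a module-level name here, so CPython resolves it at run time.
    return sqrt(num)


# Collect the names that the function body loads from the global namespace.
global_loads = [
    ins.argval
    for ins in dis.get_instructions(root)
    if ins.opname == "LOAD_GLOBAL"
]

if __name__ == "__main__":
    print(global_loads)
    dis.dis(root)  # full listing; output differs across CPython versions
```

The `LOAD_GLOBAL` instruction is executed on every call to `root`, whereas a C compiler would typically emit a direct call to `sqrt` (or inline it).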

If you want comparable speed from Python, you could either annotate the source with types and compile it with Cython, or run it under PyPy to benefit from JIT compilation.
