Let's consider a list of large integers, for example one given by:
def primesfrom2to(n):
# http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
""" Input n>=6, Returns a array of primes, 2 <= p < n """
sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)
sieve[0] = False
for i in xrange(int(n**0.5)/3+1):
if sieve[i]:
k=3*i+1|1
sieve[ ((k*k)/3) ::2*k] = False
sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False
return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]
primesfrom2to(2000000)
I want to calculate the sum of that, and the expected result is 142913828922. But if I do:
sum(primesfrom2to(2000000))
I get 1179908154, which is clearly wrong. The problem is that I have an int overflow, but I don't understand why. Let's me explain.Consider this testing code:
a=primesfrom2to(2000000)
b=[float(i) for i in a]
c=[long(i) for i in a]
sumI=0
sumF=0
sumL=0
m=0
for i,j,k in zip(a,b,c):
m=m+1
sumI=sumI+i
sumF=sumF+j
sumL=sumL+k
print sumI,sumF,sumL
if sumI<0:
print i,m
break
I found out that the first integer overflow is happening at a[i=20444]=225289
If I do:
>>> sum(a[:20043])+225289
-2147310677
But if I do:
>>> sum(a[:20043])
2147431330
>>> 2147431330+225289
2147656619L
What's happening? Why such a different behaviour? Why can't sum switch automatically to long type and give the correct result?