My question is two-fold:

  1. Is there a way to loop over an array efficiently (using enumerate, for example) while modifying both the array and the loop index at the same time?
  2. Are there any memory-optimized versions of arrays in Python? (like NumPy creating smaller arrays with a specified type)

I have written an algorithm that finds the prime numbers in the range 2 to rng using the Sieve of Eratosthenes.

Note: There is no problem when searching for primes between 2 and 1,000,000 (total runtime is under 1 second too). In the tens and hundreds of millions this starts to hurt. After changing the table from all natural numbers to odd numbers only, the largest range I have been able to search is roughly 400 million (200 million odd numbers).

Using while loops instead of for loops decreases performance, at least with the current algorithm.
NumPy can create smaller arrays with a specified type, but it takes roughly twice as long to run the same code, the only change being

oddTable = np.int8(np.zeros(size))

in place of

oddTable = [0] * size

and integers instead of strings marking "prime" and "not prime", so that the array keeps its type.
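For reference, the standard library also has compact containers that need no type conversion. A minimal sketch, assuming 64-bit CPython, that compares the footprint of a plain list with `array.array` and `bytearray` (the names `as_list`, `as_array` and `as_bytes` are only illustrative):

```python
import sys
from array import array

size = 1_000_000

as_list = [0] * size                 # one pointer per slot (8 bytes on 64-bit CPython)
as_array = array("b", bytes(size))   # signed 1-byte integers, zero-filled
as_bytes = bytearray(size)           # unsigned 1-byte integers, zero-filled

for name, table in (("list", as_list),
                    ("array('b')", as_array),
                    ("bytearray", as_bytes)):
    # getsizeof reports the container itself, which is what differs here
    print(f"{name:12} {sys.getsizeof(table):>12,} bytes")
```

Both one-byte containers hold small integers only, so the marks would have to be numbers such as 0, 1 and 2 rather than "P" and "_".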

Using pseudo-code, the algorithm would look like this:

oddTable = [0] * size    # Array representing odd numbers excluding 1 up to rng

for item in oddTable:
    if item == 0:        # Prime, since not product of any previous prime
        set item to "prime"
        set every multiple of item in oddTable to "not prime"
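The pseudo-code above can be made runnable roughly as follows. This is only a sketch under a few assumptions: the function name `odd_sieve` is made up, the table is a `bytearray` with 0 for "possibly prime" and 1 for "not prime", and marking starts at p*p instead of at the first multiple, which is a common shortcut and not part of the pseudo-code.

```python
def odd_sieve(rng):
    """Return all primes below rng, sieving odd numbers only."""
    if rng <= 2:
        return []
    size = rng // 2 - 1              # table index i stands for the odd number 2*i + 3
    composite = bytearray(size)      # 0 = possibly prime, 1 = not prime
    primes = [2]
    # The table is modified while enumerate walks it; that is safe here
    # because the length never changes and only later indexes are written.
    for index, flag in enumerate(composite):
        if flag:
            continue
        p = 2 * index + 3
        primes.append(p)
        start = (p * p - 3) // 2     # table index of p*p
        for multiple in range(start, size, p):   # index step p = value step 2*p
            composite[multiple] = 1
    return primes


print(odd_sieve(30))
```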

Python is a neat language, particularly when looping over every item in a list, but since the index in, say,

for i in range(1000)

can't be manipulated inside the loop, I had to convert the range a few times to produce an iterable I could use. In the code, "P" marks prime numbers, "_" marks non-primes and 0 marks numbers not yet checked.
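To illustrate the remark about the loop index: reassigning the variable inside a for loop only rebinds a local name, so the usual options are a stepped `range` or a while loop. A small sketch (the names `visited`, `stepped` and `jumped` are illustrative only):

```python
# Changing i inside the body does not affect what range() yields next.
visited = []
for i in range(6):
    visited.append(i)
    i += 2                     # rebinds the name only; the next iteration overwrites it

# A fixed stride can be requested from range() directly.
stepped = list(range(1, 10, 3))

# A variable stride needs a while loop that owns the index.
jumped = []
i = 0
while i < 6:
    jumped.append(i)
    i += 3 if i == 1 else 1    # skip ahead once, otherwise advance by one

print(visited, stepped, jumped)
```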

num = 1                  # Primes found (2 is prime)
size = rng // 2 - 1      # Size of table required to represent odd numbers
oddTable = [0] * size    # Table of odd numbers without 1: [3, 5, 7, 9...]

new_rng = (size - 1) // 3        # To go through every 3rd item
for i in range(new_rng):         # Eliminate multiples of 3
    oddTable[i * 3] = "_"
oddTable[0] = "P"                # Set 3 to prime
num += 1

def act(x):              # The actual integer that index x in the table refers to
    return (x + 1) * 2 + 1

# Multiples of 2 and 3 are eliminated, so all remaining primes are 6k + 1 or 6k + 5.
# In oddTable, the remaining primes sit at indexes 3*i + 1 or 3*i + 2.
# new_rng loops over exactly 1/3 of the table length -> touches every item once.
for i in range(new_rng):
    j = 3*i + 1                    # 3*i + 1
    if oddTable[j] == 0:
        num += 1
        oddTable[j] = "P"
        k = act(j)
        multiple = j + k    # The odd multiple indexes of act(j)
        while multiple < size:
            oddTable[multiple] = "_"
            multiple += k
    j += 1                         # 3*i + 2
    if oddTable[j] == 0:
        num += 1
        oddTable[j] = "P"
        k = act(j)
        multiple = j + k
        while multiple < size:
            oddTable[multiple] = "_"
            multiple += k
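The two inner while loops above do the same job, so they could be folded into one helper. The sketch below is an assumption-laden variant, not the code above: it supposes the table is a `bytearray` with 1 meaning "not prime", and the helper name `mark_multiples` is made up. It replaces the Python-level loop with a single extended slice assignment.

```python
def mark_multiples(table, start, step):
    """Set table[start], table[start + step], ... to 1 in one slice assignment."""
    count = len(range(start, len(table), step))   # number of slots the slice covers
    table[start::step] = b"\x01" * count          # right-hand side must match that length


table = bytearray(20)
mark_multiples(table, 3, 5)
print([i for i, flag in enumerate(table) if flag])
```

Whether this is faster than the while loop for a given range is something to measure rather than assume.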
Felix
  • Use the array standard library. – eguaio Oct 29 '16 at 16:59
  • SO isn't an advice forum. Please clarify your question. If you have a two-fold question, please ask two separate questions on SO so they can be answered easily. – Soviut Oct 29 '16 at 17:00
  • Split your algorithm in smaller chunks. – fralau Oct 29 '16 at 17:00
  • @Soviut I felt the need to give some context when asking the question. The questions would be harder to answer independently, don't you think? Thanks for the suggestion though! – Felix Oct 29 '16 at 17:12
  • `numpy` can store values more compactly, but is fast only when you use the fast compiled numpy methods. Iterative stuff is likely to be slower. – hpaulj Oct 29 '16 at 17:37
  • Python primes were explored extensively in http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n – hpaulj Oct 29 '16 at 17:39

1 Answer

To make your code more pythonic, split your algorithm into smaller chunks (functions), so that each chunk can be grasped easily.

My second comment might astound you: Python comes with "batteries included". In order to program your Sieve of Eratosthenes, why do you need to manipulate arrays explicitly and pollute your code with it? Why not create a function (e.g. is_prime) and use the standard memoize decorator that was provided for that purpose? (If you insist on using Python 2.7, see also memoization library for python 2.7).

The result of the two pieces of advice above might not be the "most efficient", but it will (as I experienced with that exact problem) work well enough, while allowing you to quickly create sleek code that will save your programmer's time (both for creation and maintenance).
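A memoized is_prime along those lines might look like the sketch below. It assumes that the "standard memoize decorator" means `functools.lru_cache` (Python 3), and the trial-division body is only an illustration, not a sieve.

```python
from functools import lru_cache


@lru_cache(maxsize=None)       # remember every result; memory grows with distinct arguments
def is_prime(n):
    """Trial division by odd numbers up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:                  # 2 and 3
        return True
    if n % 2 == 0:
        return False
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


print([n for n in range(30) if is_prime(n)])
```

The cache only pays off when the same numbers are queried repeatedly, and it keeps one entry per distinct argument, so it is unlikely to suit ranges in the hundreds of millions.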

fralau