3

I want to iterate through every number < N and know each number's prime factorization. What is the best way to do this?

I am aware that I can use trial division to find the prime factorization of a given number and just repeat that for every number less than N, but that is inefficient and takes longer than generating each number from the known prime factors. Below is my implementation, which generates every number less than N from the primes less than N. Is there a faster way to do this? Since I need the factorization of every number less than N, I am trying to exploit that fact to save computation time compared with running trial division on each number separately.
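For reference, the trial-division baseline mentioned above could look roughly like the sketch below. The function name `trial_division` and the (prime, exponent) output format are assumptions made for illustration; they are not part of the benchmarked code.

```python
def trial_division(n):
    """Return the prime factorization of n as a list of (prime, exponent) pairs."""
    factors = []
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:      # strip every copy of d from n
            n //= d
            exponent += 1
        if exponent:
            factors.append((d, exponent))
        d += 1
    if n > 1:                  # whatever remains is a single prime factor
        factors.append((n, 1))
    return factors

print(trial_division(360))     # [(2, 3), (3, 2), (5, 1)]
```

Calling this once per number repeats the same divisibility checks over and over, which is the cost the approaches below try to avoid.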

The goal that I am trying to accomplish: I have an algorithm which I want to run on every number that is less than N. For this algorithm, I need the prime factorization of each number, and I want to obtain those factorizations in the minimum time. I don't actually need to store the prime factorizations; I just need to run my algorithm with each one. (The algorithm is solve(curNum, curFactors) in the code.)

I wrote a Python 3 program to recursively generate each number with knowledge of its prime factors, but it's quite slow. (It takes ~58 seconds of processing time when N = 10^7. The function solve does nothing for this benchmark.)

curFactors is a list where every even-indexed element is the index (into primelist) of a prime in the factorization, and the odd-indexed element that follows it is that prime's exponent. I flattened it from a list of lists to save computation time. primeStartIndex is used to ensure that I don't double count numbers. Currently solve does nothing, just so I can benchmark this function.
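To make the flattened layout concrete, here is a small sketch that decodes such a list back into (prime, exponent) pairs. The helper `decode_factors` and the tiny `primes` list are hypothetical and only serve as illustration.

```python
def decode_factors(flat_factors, primes):
    """Turn [index0, exp0, index1, exp1, ...] into [(prime, exponent), ...]."""
    return [(primes[flat_factors[i]], flat_factors[i + 1])
            for i in range(0, len(flat_factors), 2)]

primes = [2, 3, 5, 7]          # stand-in for primelist
flat = [0, 3, 1, 2, 2, 1]      # 360 = 2**3 * 3**2 * 5**1
print(decode_factors(flat, primes))  # [(2, 3), (3, 2), (5, 1)]
```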

def iterateThroughNumbersKnowingFactors(curNumber, curFactors, primeStartIndex):
    #Generate next set of numbers
    #Handle multiplying by a prime already in the factorization separately.
    for i in range(primeStartIndex+1,lenPrimes):
        newNum = curNumber * primelist[i]
        if(newNum > upperbound):
            break
        newFactors = curFactors[:]
        newFactors.append(i)
        newFactors.append(1)
        #Do something with this number and its factors
        solve(newNum, newFactors)
        #Go get more numbers
        iterateThroughNumbersKnowingFactors(newNum,newFactors,i)
    if(primeStartIndex > -1):
        newNum = curNumber * primelist[primeStartIndex]
        if(newNum > upperbound):
            return
        currentNumPrimes = len(curFactors)
        curFactors[currentNumPrimes-1] += 1
        #Do something with this number and its factors
        solve(newNum, curFactors)
        #Go get more numbers
        iterateThroughNumbersKnowingFactors(newNum,curFactors,primeStartIndex)

import time

upperbound = 10**7

#https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n
primelist = primesfrom2to(upperbound+1)
lenPrimes = len(primelist)

#time.clock() was removed in Python 3.8; time.process_time() measures CPU time
t0 = time.process_time()
iterateThroughNumbersKnowingFactors(1,[],-1)
print(str(time.process_time() - t0) +" seconds process time")

Does anyone know of a better way to do this?

martineau
Ninja_Coder
  • Can you add a clearer explanation of what exactly you're trying to accomplish? You want to generate a list of lists of integers which contain the prime factorization of the value of the index of that list up to some limit? – Sean Pianka Jan 21 '17 at 19:58
  • My apologies for being unclear. I have an algorithm which I want to run on every number that is less than N. For this algorithm, I need the prime factorization of each number. My question is what is the fastest way to get the prime factorization of every number less than N. I don't actually need to store the prime factorizations in memory, I just need to run the algorithm with the prime factorization. – Ninja_Coder Jan 21 '17 at 20:01
  • So, for example: `primes = []; for i in range(upper_bound): primes.append(prime_factorization(i))` is, in essence, what you're trying to do? So, you're looking for an optimized Python implementation of prime factorization? – Sean Pianka Jan 21 '17 at 20:04
  • Yes! I am trying to look for a more optimized way to do it. An answer in any language is great, I can convert it to Python. I don't actually know of a more optimized way than what I am doing. – Ninja_Coder Jan 21 '17 at 20:06
  • [Python Finding Prime Factors](https://stackoverflow.com/questions/15347174/python-finding-prime-factors) and [Prime factorization - list](http://stackoverflow.com/a/16996439/4562156) – Sean Pianka Jan 21 '17 at 20:16
  • Possible duplicate of [Python Finding Prime Factors](http://stackoverflow.com/questions/15347174/python-finding-prime-factors) – Sean Pianka Jan 21 '17 at 20:16
  • I am trying to use the fact that I am finding the prime factorization of all numbers less than N to save computation time. Hence generating every number by combining the primes saves more time than just doing the trial division method over all numbers. Using the trial division method that you are proposing over all numbers < 10**7 takes far longer. – Ninja_Coder Jan 21 '17 at 20:34
  • Do you need to iterate over all the numbers in order? – rici Jan 22 '17 at 02:06
  • No I don't, order does not matter! – Ninja_Coder Jan 22 '17 at 04:41

3 Answers

5

If you've already got the Sieve of Eratosthenes implemented and the performance of that is acceptable, then you can modify it to store prime factors.

The basic approach is this: whenever you would "cross off" a number or remove it from the list for being a multiple of a prime, instead check how many times you can divide it by the prime without remainder (use integer division and the remainder operator, `//` and `%` in Python 3). That will give you a (prime, exponent) pair representing that component of the prime factorization. Store those pairs in a list or dictionary associated with the original number. When the Sieve finishes, each list will describe the prime factorization of the corresponding number.
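A minimal Python sketch of that idea might look like the following. The function name `factorization_sieve` and the list-of-lists layout are assumptions chosen for illustration, not a tuned implementation.

```python
def factorization_sieve(limit):
    """Return a list where entry n holds the (prime, exponent) pairs of n."""
    factors = [[] for _ in range(limit + 1)]
    for p in range(2, limit + 1):
        if factors[p]:
            continue                      # p already has a smaller prime factor
        for multiple in range(p, limit + 1, p):
            remaining, exponent = multiple, 0
            while remaining % p == 0:     # count how often p divides the multiple
                remaining //= p
                exponent += 1
            factors[multiple].append((p, exponent))
    return factors

table = factorization_sieve(12)
print(table[12])  # [(2, 2), (3, 1)]
```

Note that this keeps all factorizations in memory at once, so for large limits the memory cost is the main trade-off.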

hnau
  • My mistake, thanks for pointing that out. I've edited the answer to correct it. I think the approach I described requires the careless, N log N implementation. – hnau Jan 21 '17 at 21:23
  • No need to divide; iterate on two variables in the sieve's inner loop. One incremented by the prime, the other by one. – Yann Vernier Jan 22 '17 at 09:10
1

If you wanted to have a little fun, you could use Bach's algorithm to generate a random number in the interval [1,N], together with its factorization, in O(log(N)) time. Repeating this until you find the prime factorization of all numbers less than N could take theoretically infinite time, but the expected running time of the algorithm would be O(log^2(N)).

This might be a bit of a loss efficiency-wise, but if you want a fun algorithm that doesn't iterate in a linear order then this might be your best bet :)
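As a rough illustration of sampling a random number together with its factorization, here is a sketch of Kalai's simpler method ("Generating Random Factored Numbers, Easily") rather than Bach's algorithm itself, swapped in because it fits in a few lines. The names `is_prime` and `random_factored_number` are hypothetical, and the trial-division primality test is only a placeholder suitable for small limits.

```python
import random

def is_prime(n):
    """Placeholder primality test by trial division (fine for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def random_factored_number(limit, rng=random):
    """Return (r, primes) with r uniform in [1, limit] and primes multiplying to r."""
    while True:
        primes = []
        s = limit
        while s > 1:                      # non-increasing random sequence down to 1
            s = rng.randint(1, s)
            if is_prime(s):
                primes.append(s)
        r = 1
        for p in primes:
            r *= p
        # accept r with probability r / limit, otherwise restart
        if r <= limit and rng.random() < r / limit:
            return r, sorted(primes)

print(random_factored_number(100))
```

Each call returns one number with its prime factors (repeated according to multiplicity), so covering every number below N this way relies on repeated sampling.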

Martin Brisiak
Milo Moses
1

Using some inspiration from the sieve of Eratosthenes, you can build the list of factors by propagating prime numbers to a list of prime factor lists up to N:

To only know which primes are present:

def primeFactors(N):
    result = [[] for _ in range(N+1)]  # lists of factors from 0..N
    for p in range(1,N+1,2):
        if p<2: p=2 
        if result[p]: continue         # empty list is a prime 
        for k in range(p,len(result),p):
            result[k].append(p)    # propagate it to all multiples
    return result

print(*enumerate(primeFactors(10)))
# (0, []) (1, []) (2, [2]) (3, [3]) (4, [2]) (5, [5]) (6, [2, 3]) (7, [7]) (8, [2]) (9, [3]) (10, [2, 5])

To get every instance of each prime in the factorisation:

def primeFactorsAll(N):
    result = [[] for _ in range(N+1)]
    for p in range(1,N+1,2):
        if p<2: p=2
        if result[p]: continue
        pn = p
        while pn <= N:                         # <= so a power equal to N is included
            for k in range(pn,len(result),pn): # propagate to multiples of
                result[k].append(p)            # each power of p
            pn *= p
    return result

print(*enumerate(primeFactorsAll(10)))
# (0, []) (1, []) (2, [2]) (3, [3]) (4, [2, 2]) (5, [5]) (6, [2, 3]) (7, [7]) (8, [2, 2, 2]) (9, [3, 3]) (10, [2, 5])

For a large N, this should run much faster than a division approach. For N= 10^7, on my laptop, primeFactors(N) takes 8.1 seconds and primeFactorsAll(N) takes 9.7 seconds.
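If (prime, exponent) pairs are more convenient than a list with repeated primes, one possible conversion is sketched below. The helper name `to_prime_powers` is an assumption, and the input is written out by hand in the same shape that primeFactorsAll produces for a single number.

```python
from collections import Counter

def to_prime_powers(repeated_primes):
    """Collapse e.g. [2, 2, 2, 3, 3, 5] into [(2, 3), (3, 2), (5, 1)]."""
    return sorted(Counter(repeated_primes).items())

print(to_prime_powers([2, 2, 2, 3, 3, 5]))  # [(2, 3), (3, 2), (5, 1)]
```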

Alain T.