0

There is a python script that does prime factorization. It's very fast and runs in less than a second. But there are some functions for php that run very slow. It take one parameter (a long integer) like 1278426847636566097 and calculate prime factorization. Returns an array with 2 indexes. The result for this number is:

Array ( [0] => 1233387599 [1] => 1036516703 )

The python script: (getpq.py)

#!/usr/bin/env python
from __future__ import print_function
import prime
import sys
import json

def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)


pq = prime.primefactors(long(sys.argv[1]))

sys.stdout.write(json.dumps(pq))
sys.stdout.flush()

prime.py:

# NOTICE!!! This is copied from https://stackoverflow.com/questions/4643647/fast-prime-factorization-module

import random

def primesbelow(N):
    # http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    #""" Input N>=6, Returns a list of primes, 2 <= p < N """
    correction = N % 6 > 1
    N = {0:N, 1:N-1, 2:N+4, 3:N+3, 4:N+2, 5:N+1}[N%6]
    sieve = [True] * (N // 3)
    sieve[0] = False
    for i in range(int(N ** .5) // 3 + 1):
        if sieve[i]:
            k = (3 * i + 1) | 1
            sieve[k*k // 3::2*k] = [False] * ((N//6 - (k*k)//6 - 1)//k + 1)
            sieve[(k*k + 4*k - 2*k*(i%2)) // 3::2*k] = [False] * ((N // 6 - (k*k + 4*k - 2*k*(i%2))//6 - 1) // k + 1)
    return [2, 3] + [(3 * i + 1) | 1 for i in range(1, N//3 - correction) if sieve[i]]

smallprimeset = set(primesbelow(100000))
_smallprimeset = 100000

def isprime(n, precision=7):
    # http://en.wikipedia.org/wiki/Miller-Rabin_primality_test#Algorithm_and_running_time
    if n == 1 or n % 2 == 0:
        return False
    elif n < 1:
        raise ValueError("Out of bounds, first argument must be > 0")
    elif n < _smallprimeset:
        return n in smallprimeset


    d = n - 1
    s = 0
    while d % 2 == 0:
        d //= 2
        s += 1

    for repeat in range(precision):
        a = random.randrange(2, n - 2)
        x = pow(a, d, n)

        if x == 1 or x == n - 1: continue

        for r in range(s - 1):
            x = pow(x, 2, n)
            if x == 1: return False
            if x == n - 1: break
        else: return False

    return True

# https://comeoncodeon.wordpress.com/2010/09/18/pollard-rho-brent-integer-factorization/
def pollard_brent(n):
    if n % 2 == 0: return 2
    if n % 3 == 0: return 3

    y, c, m = random.randint(1, n-1), random.randint(1, n-1), random.randint(1, n-1)

    g, r, q = 1, 1, 1
    while g == 1:
        x = y
        for i in range(r):
            y = (pow(y, 2, n) + c) % n

        k = 0
        while k < r and g==1:
            ys = y
            for i in range(min(m, r-k)):
                y = (pow(y, 2, n) + c) % n
                q = q * abs(x-y) % n

            g = gcd(q, n)
            k += m
        r *= 2
    if g == n:
        while True:
            ys = (pow(ys, 2, n) + c) % n
            g = gcd(abs(x - ys), n)
            if g > 1:
                break

    return g

smallprimes = primesbelow(10000) # might seem low, but 1000*1000 = 1000000, so this will fully factor every composite < 1000000
def primefactors(n, sort=False):
    factors = []

    limit = int(n ** .5) + 1
    for checker in smallprimes:
        if checker > limit: break
        while n % checker == 0:
            factors.append(checker)
            n //= checker
            limit = int(n ** .5) + 1
            if checker > limit: break

    if n < 2: return factors

    while n > 1:
        if isprime(n):
            factors.append(n)
            break
        factor = pollard_brent(n) # trial division did not fully factor, switch to pollard-brent
        factors.extend(primefactors(factor)) # recurse to factor the not necessarily prime factor returned by pollard-brent
        n //= factor

    if sort: factors.sort()

    return factors

def factorization(n):
    factors = {}
    for p1 in primefactors(n):
        try:
            factors[p1] += 1
        except KeyError:
            factors[p1] = 1
    return factors

totients = {}
def totient(n):
    if n == 0: return 1

    try: return totients[n]
    except KeyError: pass

    tot = 1
    for p, exp in factorization(n).items():
        tot *= (p - 1)  *  p ** (exp - 1)

    totients[n] = tot
    return tot

def gcd(a, b):
    if a == b: return a
    while b > 0: a, b = b, a % b
    return a

def lcm(a, b):
    return abs(a * b) // gcd(a, b)

As I don't know python, is there any way to do this is php?

NOTE: I found a very bad algorithm, implemented in php, takes about 600 seconds:

public function primefactor($num) {
    $sqrt = sqrt($num);
    for ($i = 2; $i <= $sqrt; $i++) {
        if ($num % $i == 0) {
            return array_merge($this->primefactor($num/$i), array($i));
        }
    }
    return array($num);
}
Vahid Najafi
  • 2,703
  • 7
  • 24
  • 53
  • This works like python code , but as it optimized in C it's way to much faster , but I'm not sure there is a better way to find prime factors , it's a general solution. – ᴀʀᴍᴀɴ Feb 05 '17 at 20:46
  • 1
    Factoring is a vast topic which has been much studied in recent decades. There are numerous algorithms of varying degrees of sophistication that have been implemented in most common programming languages. The php one you give is a naïve trial-division, which is one of the worst possible algorithms. Look at the Wikipedia page on factoring. This question is too broad for Stack Overflow (or is off-topic to the extent that it could be read as asking for a php library similar to the Python `prime` module). – John Coleman Feb 05 '17 at 20:49

1 Answers1

1

Of source there is. You need to research "prime factorization" algorithms. If you add PHP as a requirement to the search, you can find code for better implementations than the brute-force method above. Start here.

Note that the Python method doesn't show anything about the algorithm: the prime package has built-in algorithms, and the code you gave merely invokes that. I've read that the Python package uses probabilistic methods (which are tested to be 100% accurate through something like 300-digit numbers). Keep an eye out for those when you do your research. There will be implementations available in several general-purpose languages; I know of at least Python, Java, and C++.

Prune
  • 72,213
  • 14
  • 48
  • 72
  • Yes, I forgot to mention about **prime** package. I edited my post. But about your mentioned link, I've found it before, but the max `n` is *2^31-1 = 2147483647* as mentioned there. – Vahid Najafi Feb 05 '17 at 20:59
  • That's a restriction from the data type. For your application, you'll need to change over to a long-integer type. – Prune Feb 05 '17 at 22:11