What is the space complexity of a prime sieve with data in proportion to number of primes?

Question

I'm practicing writing algorithms optimized for space or time complexity. With prime sieves, at minimum you must store a list of all primes found. It seems data in proportion to the number of primes found is the least amount of space such an algorithm may use.

Is this rationale valid?
How would the space complexity for this algorithm be evaluated?

From Wikipedia about the sieve of Atkin: what I'm unsure about is how a sieve can use O(n^(1/2)) space when the number of primes exceeds this. This is why it seems that, at minimum, the space must be proportional to the number of primes. Am I confusing countable numbers with space complexity?

In this paper on the sieve of Atkin, their algorithm prints "the prime numbers up to N...Here “memory” does not include the paper used by the printer." This seems like an unfair calculation of space.

I would appreciate clarification on how this should be / is actually measured objectively.

def prime_sieve(limit):
    factors = dict()
    primes = []
    factors[4] = (2)
    primes.append(2)
    for n in range(3, limit + 1):
        if n not in factors:
            factors[n * 2] = (n)
            primes.append(n)
        else:
            prime = factors.get(n)
            m = n + prime
            while m in factors:
                m += prime
            factors[m] = (prime)
            del factors[n]
    return primes

Please don't add new questions into the question - it invalidates existing answers. Each question should contain only a single question; if you have more, ask more. You already have two answers which address different parts of the question, so how can you decide which one should be "the answer"? — jonrsharpe, Oct 20 '14 at 11:17
That's a good point. Idk if it's appropriate here to break off the question about the Atkin paper...? @larsmans explained that my algorithm is fundamentally different from the paper's, which is why they have different complexities. — 12345678910111213, Oct 20 '14 at 11:42
It would be a bit difficult at this point, although you could ask *larsmans* if they'd mind duplicating their answer on a new question. — jonrsharpe, Oct 20 '14 at 12:32
[this answer](http://stackoverflow.com/a/10733621/849891) demonstrates a sieve of Eratosthenes with space complexity of O( sqrt(N)/log(N) ) - not counting the produced primes of course. — Will Ness, Oct 20 '14 at 21:01
N above is the upper limit, so N ~= n log(n) and the complexity in that answer is O( sqrt(n log(n))/log(n)) = O( sqrt( n/log(n) )) for *n* primes. — Will Ness, Oct 20 '14 at 21:12

score 2 · Accepted Answer · edited May 23 '17 at 12:20


The space complexity for this algorithm is len(numbers) + len(primes): the size of the dictionary plus the size of the list. (numbers here is the dictionary that the question's code calls factors.)

In this case, the algorithm is worse than you'd expect for a naive prime sieve, which needs limit entries. Here len(numbers) + len(primes) > limit because, e.g., for prime_sieve(100) the following irrelevant numbers are stored in numbers:

{129: 43, 134: 67, 141: 47, 142: 71, 146: 73, 158: 79, 166: 83, 178: 89, 194: 97, 102: 17, 104: 2, 105: 3, 106: 53, 110: 11, 111: 37, 112: 7, 114: 19, 115: 23, 116: 29, 117: 13, 118: 59, 120: 5, 122: 61, 123: 41, 124: 31}
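
These sizes can be checked directly. The following is a minimal sketch, assuming that numbers in this answer is the dictionary the question's code calls factors. The function name sieve_sizes and the keep_visited flag are hypothetical, added only to compare the code with and without the del factors[n] line mentioned in the comments.

```python
def sieve_sizes(limit, keep_visited=False):
    """Run the question's sieve and return (len(factors), len(primes)).

    keep_visited=True skips the `del factors[n]` step, so every composite
    that was visited stays in the dictionary (hypothetical flag, for
    comparison only).
    """
    factors = {4: 2}   # composite -> one of its prime factors
    primes = [2]
    for n in range(3, limit + 1):
        if n not in factors:
            factors[n * 2] = n
            primes.append(n)
        else:
            prime = factors[n]
            m = n + prime
            while m in factors:   # skip multiples that are already keys
                m += prime
            factors[m] = prime
            if not keep_visited:
                del factors[n]
    return len(factors), len(primes)


for keep in (True, False):
    dict_size, list_size = sieve_sizes(100, keep_visited=keep)
    print(f"keep_visited={keep}: dict={dict_size}, "
          f"list={list_size}, total={dict_size + list_size}")
```

Without the deletion the total exceeds limit; with it, the dictionary in this run holds one pending multiple per prime found, so the total tracks the number of primes instead.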

There are several prime number sieves, with varying time and space complexity; see e.g. Wikipedia and questions like "How do I reduce the space complexity in Sieve of Eratosthenes to generate prime between a and b?"

Also, note that there's no need for prime = numbers.get(n) - you've already checked if n not in numbers, so you can just use prime = numbers[n].

answered Oct 20 '14 at 10:15

jonrsharpe


From Wikipedia about the sieve of Atkin - What I was unsure about was how a sieve can use O(n^1/2) space when the number of primes exceeds this. This is why it seems at minimum the space must be proportional to the number of primes. Am I confusing countable numbers with space complexity? – 12345678910111213 Oct 20 '14 at 10:30

@mattkaeo I don't know the details, but the [Wiki page for Sieve of Atkin](http://en.wikipedia.org/wiki/Sieve_of_Atkin) links to [this paper](http://www.ams.org/journals/mcom/2004-73-246/S0025-5718-03-01501-1/S0025-5718-03-01501-1.pdf). – jonrsharpe Oct 20 '14 at 10:35
In the paper, it says they do not include the primes themselves in their space calculations..."print the prime numbers up to N...Here “memory” does not include the paper used by the printer." This seems unfair – 12345678910111213 Oct 20 '14 at 10:58
@mattkaeo that is only unfair if that *isn't* how other algorithms are measured for comparison. – jonrsharpe Oct 20 '14 at 11:00
I see what you're saying about the duplicate numbers. I forgot to add the del factors[n] line when I posted the question. Changed now – 12345678910111213 Oct 20 '14 at 13:01

Fred Foo · Answer 2 · 2014-10-20T11:17:23.153

The space complexity measurement is perfectly fair. If you replace primes.append(n) by yield n, and you process the primes one by one in a consumer routine without storing them all, for example to find a prime with a particular property, then the storage space required for the primes themselves is O(1), measured in the number of primes.

(yield is the Python way of constructing a generator, a type of co-routine that emits values to the caller and saves the state of the function so it can be re-entered.)
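
As a rough illustration of that substitution, here is a sketch of the question's sieve turned into a generator, with a small consumer that stops at the first prime matching a condition. The names prime_sieve_gen and first_prime_where are hypothetical, and the early return for limit < 2 is an addition that is not in the question's code.

```python
def prime_sieve_gen(limit):
    """Generator variant of the question's sieve: yields each prime
    instead of appending it to a list."""
    if limit < 2:
        return  # added guard: no primes below 2
    factors = {4: 2}  # composite -> one of its prime factors
    yield 2
    for n in range(3, limit + 1):
        if n not in factors:
            factors[n * 2] = n
            yield n
        else:
            prime = factors.pop(n)  # look up and delete in one step
            m = n + prime
            while m in factors:
                m += prime
            factors[m] = prime


def first_prime_where(predicate, limit):
    """Consume primes one at a time; nothing is stored by the consumer."""
    for p in prime_sieve_gen(limit):
        if predicate(p):
            return p
    return None


# First prime above 50 whose last digit is 7.
print(first_prime_where(lambda p: p > 50 and p % 10 == 7, 100))
```

Note that only the output list disappears here. The factors dictionary inside the generator is working storage and still grows as the sieve advances.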

score 1 · Answer 3 · edited May 23 '17 at 12:27

"With prime sieves, at minimum you must store a list of all primes found."

Incorrect. You only need the primes up to and including the square root of your upper limit in order to sieve for all the primes in that range.
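
One way to see this is a segmented sieve, sketched below under my own assumptions: the function names small_primes and segmented_primes and the choice of segment size (about sqrt(limit)) are illustrative, not taken from any of the linked answers. Only the primes up to the square root and one segment of flags are held at a time; the primes produced are yielded, not stored.

```python
from math import isqrt


def small_primes(n):
    """Primes <= n by a plain sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def segmented_primes(limit):
    """Yield the primes <= limit, sieving one segment at a time."""
    root = isqrt(limit)
    core = small_primes(root)      # the only primes kept in memory
    yield from core
    size = max(root, 1)            # segment length (an arbitrary choice)
    low = root + 1
    while low <= limit:
        high = min(low + size - 1, limit)
        segment = [True] * (high - low + 1)
        for p in core:
            # first multiple of p inside the segment, but not below p*p
            start = max(p * p, ((low + p - 1) // p) * p)
            for j in range(start, high + 1, p):
                segment[j - low] = False
        for offset, flag in enumerate(segment):
            if flag:
                yield low + offset
        low = high + 1


print(list(segmented_primes(50)))
```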

And if your sieve is incremental and unbounded, you only need the primes up to and including the square root of the current production point.

How is that possible? By using a separate primes supply for the "core" primes (those below the sqrt), which can perfectly well be computed by the same function, recursively. For an example, see this answer.

And it's perfectly legit not to count the produced primes - you can indeed send them to the printer, or an external file, etc. So, the space complexity of such a sieve will be O( sqrt(N)/log(N) ) ~ O( sqrt( n/log(n) )) for n primes up to N ~= n * log n.
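
A minimal sketch of such an incremental sieve with a recursive core-primes supply follows. It is written from the description above and is not copied from the linked answer; the name postponed_sieve and the choice to store a step per dictionary entry are my own assumptions. A prime p enters the dictionary only when the candidate reaches p*p, so the dictionary holds one entry per odd core prime.

```python
from itertools import islice


def postponed_sieve():
    """Unbounded prime generator; a prime p is added to the dictionary
    only once the candidate reaches p*p."""
    yield 2
    yield 3
    yield 5
    yield 7
    sieve = {}               # odd composite -> step (2 * its core prime)
    ps = postponed_sieve()   # separate, recursive supply of core primes
    next(ps)                 # skip 2: only odd candidates are tested
    p = next(ps)             # current core prime, starts at 3
    q = p * p                # its square, starts at 9
    c = 9                    # current odd candidate
    while True:
        if c in sieve:       # c is a multiple of an active core prime
            step = sieve.pop(c)
        elif c < q:          # below the next square and not marked: prime
            yield c
            c += 2
            continue
        else:                # c == q: start sieving with p from here on
            step = 2 * p
            p = next(ps)
            q = p * p
        m = c + step         # move the entry to its next free odd multiple
        while m in sieve:
            m += step
        sieve[m] = step
        c += 2


print(list(islice(postponed_sieve(), 15)))
```

The produced primes go straight to the caller, so what the sieve itself keeps is the dictionary plus the (much smaller) state of the nested generators.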

Also, don't go near the sieve of Atkin. Word on the street is, it's impossible to beat the properly wheel-erized sieve of Eratosthenes with it (search for answers by GordonBGood on the subject, like this one).
