0

for this assignment, I have to create a program that finds and returns a queue of all prime numbers up to integer n using the Sieve of Eratosthenes. While I got it to work, I am still pondering about the big O for my code. It seems like I've nested too many loops into it making it run too slow when calculating very large numbers. For instance, calculating:

Queue<Integer> q = PrimeQueue.getPrimes(1234567);

Takes nearly three minutes! What I'm asking is: Is such a long run-time expected for a task such as this? Is there is a more process-friendly way to do this within the confines of this assignemnt(While still using two queues and returning one of them)? What is the complexity class of this program? Thanks!

Here's my code:

public static Queue<Integer> getPrimes(int n) {
    if (n < 2) { // If illegal argument
        throw new IllegalArgumentException("n can't be less than 2.");
    }
    Queue<Integer> numbers = new LinkedList<Integer>();
    Queue<Integer> primes = new LinkedList<Integer>();

    // fills queue w/ numbers from 2 to n inclusive // throw exception if n < 2
    for (int i = 2; i <= n; i++) {
        numbers.add(i);
    }

    do { // do while b/c it will always execute once
        int prime = numbers.remove(); // gets first prime number from first value of numbers
        primes.add(prime); // adds to the prime 
        int numbersSize = numbers.size(); // used to keep track when looping over the numbers queue
        for (int i = 0; i < numbersSize; i++) { // goes through each number to elim multiples of prime
            int numTemp = numbers.remove(); // will add to back of numbers if not divisible by prime (otherwise deleted)
            if (numTemp % prime != 0) {
                numbers.add(numTemp); // put back into back of queue
            }
        }
    } while (!numbers.isEmpty()); // Keeps running as long as there is a value in numbers
    return primes;
}
Jack L.
  • 385
  • 1
  • 11
  • 30

2 Answers2

1

"Is such a long run-time expected for a task such as this?" No.

"Is there a more process-friendly way to do this within the confines of this assignment (using two queues and returning one of them)?" Yes, here's how.

You can stop much earlier, and thus make your code run much much faster:

int numTemp = 0;
int prime = 0;
do { 
    prime = numbers.remove(); 
    numTemp = prime;  // in case numbers are now empty
    primes.add(prime); 
    int numbersSize = numbers.size(); 
    for (int i = 0; i < numbersSize; i++) { 
        numTemp = numbers.remove(); 
        if (numTemp % prime != 0) {
            numbers.add(numTemp); 
        }
    }
} while (prime*prime < numTemp); // numTemp is the last number in numbers

// now all numbers left in 'numbers' are prime!!
.... ;  // add them into your 'primes' queue, and then
return primes;

It's OK to stop this early, because if n == ab and a <= b, then aa <= ba == n, i.e. aa <= n.

You can assess the run-time orders of growth empirically, by calculating a = log(t2/t1) / log(n2/n1) for a coefficient a in time ~ n^a, after measuring run times t2, t1 at different "problem sizes" n2, n1.

I suspect your code is running at ~ n2, and with my correction – at about ~ n1.5 (or just below that, according to this artice), in n primes produced (n ~= N / log N, N ~= n log n), which should manifest itself as a huge speedup, in practice (I'd hazard a guess of at least 100x speedup for your problem of primes up to N = 1234567, so, not much than a second or two instead of your 3 minutes).

I also suspect that removing and re-adding the numbers (which just implements a filter operation) from/to a queue should have no performance impact whatever in itself, because queues are supposed to support both operations in O(1) time (you remove each number, and add some of them back at queue's end - such that pass a test).

And for your algorithm (which is a trial division one) the queues are perfectly OK, you test each number separately anyway. Random access is only needed if you switch to a much more efficient Sieve of Eratosthenes which enumerates the multiples by repeated addition, and marks them through direct memory access using number's value as an address.

(see also: sieve of Eratosthenes in Python).

Community
  • 1
  • 1
Will Ness
  • 62,652
  • 8
  • 86
  • 167
-1

I suspect the fact that you're removing all composite numbers and re-adding them each time is causing major slowdown. You're effectively iterating over the numbers queue O(n^2) times (maybe closer to O(n*log(n)) I suppose) and repeatedly allocating and freeing queue elements. That's a lot of wasted cycles right there.

My suggestion would be that you're using the wrong data structure. You need something with random or associative access. My suggestion is to either allocate a Boolean array and mark all numbers that are known composites with each pass, or use a Set<Integer> which allows near-constant time access and quicker iteration at the end at the cost of slower removals each pass.

EDIT: and as pointed out by Will Ness, another major problem is you are iterating from 2 to n when you only need to iterate to sqrt(n). I missed that one-character detail because I was caught up in the other design issues, my mistake.

Platinum Azure
  • 41,456
  • 9
  • 100
  • 130
  • I know that using the queues are very limiting, but I'm confined to just using two queues—one to store integers from 2 to n, and one to store and return all primes from 2 to n. – Jack L. Jul 26 '14 at 04:45
  • If you're required to use the queues, consider storing a different object type containing the integer and a "known composite" variable and don't repeatedly remove and insert items. That should help your performance too. – Platinum Azure Jul 26 '14 at 13:48