4

Any suggestions as to how I can get this program to work for n = 1 trillion (aside from making upgrades / buying a new computer)?

The error is as follows: after building, the program being executing (the command line style output window pops up) and then quickly closes out, and I get the following error "ProjectPrimes.exe has stopped working (Windows is looking for a solution to this problem." I suspect this has to do with memory issues as I first encountered it at n = 20 million but that was before I chose to malloc/free the 'sieve' array (i.e., my 'sieve' array is the big array of dimensions n x 1 and with each element consisting of a 1 or 0).

The program takes ~35 seconds to run through the first 300 million integers (16,252,325 primes) so it's okay but nothing spectacular. As I'd mentioned, the goal is to be able to generate primes below 1 trillion so am a long ways off...

If relevant, here are my machine specs (in case the goal happens to be unreasonable on this machine): 2.40ghz i5, 4gb RAM, 64bit Windows 7.

An overview of methodology, for those who aren't familiar: we use the Sieve of Sundaram approach. Without getting into the proof, we first eliminate all odd non-primes below an integer "n" using the sieve function: [2*(i+j+2*i*j)+1 | i <- [1..n/2], j <- [i..an optimized upper bound]]. We then cross off the even numbers (excluding two, of course). Which leaves us with the primes.

Why does the prime function return (a pointer to an array containing) the complete set of primes below n? Well, the goal is to be able to identify both (i) the number of primes below n as well as (ii) list the primes below n. This is also why I chose to pass a pointer for the count of primes below n as an argument.

Here's the not-so-exciting 'main' function:

int main() {
    long ceiling = 300*1000*1000;
    long *numPrimes;
    long *primes;

    primes = primesToSS(ceiling+1, numPrimes);
    printf("\n\nThere are %d primes below %d.\n\n",*numPrimes,ceiling);

    free(primes);
    return 0;
}

And here's the meat:

//n represents the ceiling, i.e., the integer below which we will generate primes
//cnt* is a pointer which will point the number of primes below n

long* primesToSS( long n, long* cnt ) {

    //initialize sieve by setting all elements equal to 1 (except for 0 and 1)
    long *sieve = malloc(n*sizeof(long));
    initArray(sieve, n, 1);
    sieve[0] = 0; sieve[1] = 0;

    //eliminate all odd composite numbers
    for (int i = 1; i <= n/2; ++i)
        for (int j = i; j <= (n-2*i)/(2*(2*i+1)); ++j)
            sieve[ 2*(i+j+2*i*j)+1 ] = 0;

    //eliminate all even numbers greater than two 
    //and count total number of primes below n
    long numPrimes = 1;
    for (int i = 3; i < n; ++i) {
        if (i % 2 == 0) sieve[i] = 0;
        numPrimes += sieve[i];
    }
    *cnt = numPrimes;

    //create array of primes
    long *primes = malloc(numPrimes*sizeof(int));
    long counter = 0;
    for (int i = 0; i < n; ++i) {
        if (sieve[i] == 1) {
            primes[counter] = i;
            counter++; } 
    }
    free(sieve);
    return primes;
}

void initArray( int* arr, int len, int n ) {
    for( int i = 0; i < len; ++i) arr[i] = n; }
iceman
  • 1,880
  • 1
  • 13
  • 21
  • What kind of error/exception occurs when your code crashes? – Fumu 7 Oct 06 '14 at 07:16
  • 6
    Please try to provide a minimal, compilable, working example that still shows the problem. Other than that, obviously **malloc(n*sizeof(long))** will return NULL for a large enough **n**. – Thomas Padron-McCarthy Oct 06 '14 at 07:29
  • I doubt you will be able to malloc a trillion longs unless you have 8 terabytes of ram – Atuos Oct 06 '14 at 07:30
  • 4
    Aren't your **sieve** array an array of flags, that is, zero or one? Why use a **long** to store a single bit? – Thomas Padron-McCarthy Oct 06 '14 at 07:33
  • With malloc-king memory of that size, you must have available c.a. 1GB of free AND continuous memory, to allocate it properly. Did you check for null being returned from malloc? No? Then do it and see yourself if that's the problem. – zubergu Oct 06 '14 at 07:33
  • 1
    The pointer `numPrimes` is uninitialized when you pass it to `primesToSS()`. Inside that function the line `*cnt = numPrimes;` will write to a random memory location. – Blastfurnace Oct 06 '14 at 07:33
  • Sry for having initially posted posting uncompilable code, code is now updated to include the initArray function which was previously missing. Also, I've elaborated on the nature of the error, as requested. Reading/processing your comments now. Thanks again, all – iceman Oct 06 '14 at 07:35
  • @DipakC: as long as you don't check the value returned by malloc, your question is rather uninteresting. – Mat Oct 06 '14 at 07:36
  • "Aren't your sieve array an array of flags, that is, zero or one? Why use a long to store a single bit?" @ThomasP-C *facepalm* great catch. edit: update. surprisingly, this didn't have much of an effect (350 million vs. 300 million)... – iceman Oct 06 '14 at 07:39
  • 2
    You should be using `%ld` in your formats to print `long` values. It also seems a bit excessive to use 64-bit `long` values to store 0 or 1 values. If you use single bits, you'll be able to store many more values in the sieve. You could also think about not bothering with storing any even numbers at all; the array of bits could simply encode the odd numbers (but it allows you to get 128 numbers into each element of the array where you currently only get 1). – Jonathan Leffler Oct 06 '14 at 07:50
  • @JonathanLeffler As I note in my answer, the OP mixes up int and long willy-nilly. – Jim Balter Oct 06 '14 at 07:51
  • @JimBalter: yup, but you didn't ping that one specifically. – Jonathan Leffler Oct 06 '14 at 07:52
  • I can't understand why "array of long" while your sieve requires only two kind of info - "yes" or "no". I would use i.e. uint8_t - you can store here 8bits of information. – soerium Oct 06 '14 at 07:55
  • @JonathanLeffler I didn't say otherwise ... I was merely noting that the error you pointed out (and I didn't catch) is one of several of the same sort. I suspect that the OP started with `int` and changed it to `long` in an attempt to handle larger primes but didn't do a complete or careful job of it. – Jim Balter Oct 06 '14 at 08:16
  • `numPrimes` is both a pointer *pointing to nothing* (in main()) and a long int (elsewhere) – joop Oct 06 '14 at 08:21

3 Answers3

5

Let's make some observations:

  • As people have mentioned in the comments and answers, you only need 1 bit to store whether a number is a prime. Therefore, you can pack the data for 8 numbers into 1 byte. Implement your own bitset data structure to make the code cleaner.
  • Also mentioned in the comments, since all prime numbers larger than 2 are odd numbers, there is no need to store data for even numbers. This allow you to pack 16 numbers into 1 byte.
  • Following the idea of wheel factorization, you can further pack 24 numbers into 1 bytes, if you store only the bits for numbers congruent to 1 or 5 modulo 6. Higher modulo will save slightly more space (with diminishing return), but the logic to access the bitset data structure may slow down the algorithm.
  • For the program to work with 1 trillion (1012) numbers, as you sieve, you only need to collect the list of prime numbers less than 1 million (106). Larger numbers are unnecessary, since the other factor in the compound numbers less than 1 trillion would already have been covered by the list.
  • Packing numbers at 16 numbers/byte (the most basic setup you should do), you would require ~58 GiB of RAM. However, there is no need to allocate for the whole range. Just allocate for a smaller range of a few hundred million to a few billion numbers (try to vary the number for highest performance), which translate to at most a few hundred MiB, then use the list of prime number less than 1 million to sieve the ranges.

    This step can be done in parallel - you only need to share the list of prime numbers less than 1 million.

    By the way, this technique is called segmented sieve.

nhahtdh
  • 52,949
  • 15
  • 113
  • 149
  • Care to explain the 4th step? As I understand the code in the question, it doesn't try to find all prime *factors* < 10^12 but all primes (so he's really looking for 11 digit primes). – Aaron Digulla Oct 06 '14 at 10:19
  • @AaronDigulla: Yes. If you implement an efficient sieve, you will start marking from from p^2, since 1, 2, ..., (p-1) in p*1, p*2, ..., p*(p-1) have been covered in earlier sieving for prime numbers less than p. So we will do 2 steps here: 1) Find all primes to sqrt(max) 2) Do the sieving (after sieving, just loop through the bitset to obtain the prime numbers). The 2nd step is explained in the 5th bullet. – nhahtdh Oct 06 '14 at 11:06
  • So this doesn't even explain why the code is crashing, yet it's accepted as the answer? How bizarre. – Jim Balter Oct 06 '14 at 18:52
  • @JimBalter: I answer the question on how to find primes under 1 trillion, which is the final goal of the OP. – nhahtdh Oct 07 '14 at 04:30
3
long *numPrimes;
...
primes = primesToSS(ceiling+1, numPrimes);
printf("\n\nThere are %d primes below %d.\n\n",*numPrimes,ceiling);

As Blastfurnace noted, you're dereferencing (and storing via) an uninitialized pointer. This should be

long numPrimes;
...
primes = primesToSS(ceiling+1, &numPrimes);
printf("\n\nThere are %d primes below %d.\n\n", numPrimes, ceiling);

There may well be other errors ... especially not checking for malloc failure. Oh, here's one:

long *sieve;
initArray(sieve, n, 1);

...

void initArray( int* arr, int len, int n ) {
    for( int i = 0; i < len; ++i) arr[i] = n; }

int and long are different types. And here you do it again:

long *primes = malloc(numPrimes*sizeof(int));

Best practice is to use sizeof the element, not the type, which avoids this sort of thing:

long *primes = malloc(numPrimes * sizeof *primes);

Check your code for other obvious mistakes.

Style note: putting a } at the end of a line is poor practice and begging for a bug when you add another line later.

Jim Balter
  • 15,014
  • 3
  • 38
  • 62
2

Some memory considerations: For the first 1 trillion primes, you need 1 trillion/8 bytes if you represent every number with a single bit. That's 125000000000 = 119209 MB = 116 GB of RAM.

While you could ask your OS to create a 200GB SWAP partition, the performance will probably be bad. If you want to keep the same algorithm, then you need to look at memory mapped files.

That will still be very slow but at least you'll be able to allocate an array which is big enough.

The next step will then be to look at different prime algorithms or you need to research memory compression algorithms. In your case, sparse matrices won't work well since your start matrix is all 1 and you eventually remove almost all elements.

Aaron Digulla
  • 297,790
  • 101
  • 558
  • 777
  • 2
    The OP's program is crashing, which this doesn't address. While it is an answer to the question in the first sentence, that doesn't seem to be the OP's actual focus. P.S. The Sieve of Atkin is a better algorithm for addressing memory issues. – Jim Balter Oct 06 '14 at 08:09
  • @JimBalter: Other answers already explained the bugs. I tried to answer the more general issues that OP should be aware of. – Aaron Digulla Oct 06 '14 at 10:14