Is there an easy way to determine the number of digits a GMP integer has? I know you can determine it through a log, but I was wondering if there was something built into the library that I'm missing. The only thing I've found in the manual is:

_mp_size The number of limbs, or the negative of that when representing a negative integer. Zero is represented by _mp_size set to zero, in which case the _mp_d data is unused.

But I'm under the impression that is quite different than what I'm looking for.

For example:

124839 = 6 digits.

Darkenor

1 Answer

You can use size_t mpz_sizeinbase (mpz_t op, int base) to get the number of digits needed to write the number as a string in a given base. From the GMP manual:

size_t mpz_sizeinbase (mpz_t op, int base)

Return the size of op measured in number of digits in the given base. base can vary from 2 to 62. The sign of op is ignored, just the absolute value is used. The result will be either exact or 1 too big. If base is a power of 2, the result is always exact. If op is zero the return value is always 1.

This function can be used to determine the space required when converting op to a string. The right amount of allocation is normally two more than the value returned by mpz_sizeinbase, one extra for a minus sign and one for the null-terminator.

So something along the lines of:

size_t sz = mpz_sizeinbase (myNum, 10);

should be a good start.

If you want the exact size, you can use that value to allocate a big enough buffer, print the number into that buffer, then call strlen to get the actual length, something like:

size_t sz = mpz_sizeinbase (myNum, 10) + 1; // allow for sign
char *buff = malloc (sz + 1);               // allow for `\0`
if (buff != NULL) {
    gmp_sprintf (buff, "%Zd", myNum);
    sz = strlen (buff);
    free (buff);
}

Note that this isn't the most efficient approach, since it allocates a buffer every time you want the length. If the allocation fails, sz keeps the conservative estimate (mpz_sizeinbase plus one for the sign), which may be larger than the actual length.

Another possible way is to use gmp_snprintf, which is safer because it never writes more than the given size, and which returns the number of characters that would have been written (excluding the terminating null):

char oneChar;
int sz = gmp_snprintf (&oneChar, 1, "%Zd", myNum);

I haven't tested that specifically but it's a trick I've used for "regular" C-style printing before.
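For reference, the "regular" C version of that trick can be sketched as below. It relies on C99 snprintf accepting a NULL buffer with size 0; the helper name decimal_length is hypothetical, and the sketch does not assume that gmp_snprintf accepts a NULL buffer in the same way.

```c
#include <stdio.h>

/* Hypothetical helper: number of characters needed to print `value`
 * in decimal, counting a leading '-' but not the terminator. */
int decimal_length(long value)
{
    /* With a NULL buffer and size 0, C99 snprintf writes nothing and
     * returns the length the complete output would have had. */
    return snprintf(NULL, 0, "%ld", value);
}
```

As with the GMP variant, the count includes the minus sign for negative input, so decimal_length(-42) counts three characters, not two digits.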

Note that both of those "exact size" solutions count an optional sign at the front. If you want to truly count the digits rather than the characters, you should adjust for that (subtracting one from the size if the number is less than zero, for example).

paxdiablo
  • Thanks, paxdiablo. Kind of unfortunate that GMP doesn't have a way of getting the exact answer. I was able to use it to get the answer to my problem, but in a sort of ghetto way. Perhaps I could use it as a way to test if I'm "almost" there and then convert the result to a string or stream and take the length of it? Seems rather inelegant, but that is the only thing I can think of doing. – Darkenor Feb 22 '11 at 13:22
  • Yeah, it's mostly used for just figuring out how big a string needs to be, so the possibility of being one too many isn't so bad. From the source code, they said you could do a statistical analysis on the first few limbs (either 10000.. or 99999..) but they deemed it unnecessary for the purposes it was designed for. Perhaps you should just bite the bullet, allocate space based on mpz_sizeinbase, then print the value to it and do strlen. – paxdiablo Feb 22 '11 at 13:53
  • ***Note the sentence:*** `The result will be either exact or 1 too big`. Translation, you cannot trust this value. I have been burned by the *off by one error* inherent in this function... – recursion.ninja Sep 21 '13 at 16:35
  • If you've been burnt by this function, it's not because the _function_ is wrong, it's because you misunderstood (or didn't bother to read) the docs :-) It does not _have_ an "off by one error" inherent because it explicitly states that's what it gives you (a bug is when the function doesn't perform as described, anything else is an implementation choice), and you _can_ trust this value if you use it for its intended purpose. If you really want the _exact_ size, use `sizeinbase()+2` to allocate enough space, output the value into that buffer, and do `strlen()`. – paxdiablo Jan 26 '15 at 01:05
  • @Kundor, the one is a limit of bytes written, *including* the terminator, so I believe the code is correct as is. This is also true for `snprintf`. The fact that you're getting a core dump for both situations supports this - it's almost certainly something wrong with your code. If you'd like to post a question with the code, you'll get more help than by leaving a (necessarily truncated) comment. – paxdiablo Apr 28 '16 at 01:38
  • @paxdiablo: You're right, I did something stupid. (You might even say embarrassingly boneheaded: I named the character `char c`, thus hiding the `mpz_t c` from an outer scope, then did `gmp_snprintf(&c, 1, "%Zd", c)`...) – Nick Matteo Apr 28 '16 at 02:03
  • @Kundor, if you're not making one serious boneheaded move a year, your career has stagnated. Suffice to say my long history in IT is *far* from stagnant :-) – paxdiablo Apr 28 '16 at 02:11