5

EDIT: After some discussion in the comments, it turned out that, because of a lack of knowledge about how floating-point numbers are implemented in C, I asked something different from what I meant to ask.
I wanted to use (do operations with) integers larger than those I can have with unsigned long long (which for me is 8 bytes), possibly without resorting to arrays or bigint libraries. Since my long double is 16 bytes, I thought it might be possible by just switching type. It turned out that even though it is possible to represent larger integers, you can't do operations with these larger long double integers without losing precision. So it's not possible to achieve what I wanted to do. Actually, as stated in the comments, it is not possible for me. But in general, whether it is possible or not depends on the floating-point characteristics of your long double.

// end of EDIT

I am trying to understand what the largest integer is that I can store in a long double.
I know it depends on the environment in which the program is built, but I don't know exactly how. I have sizeof(long double) == 16, for what it's worth.

Now in this answer they say that the maximum value for a 64-bit double should be 2^53, which is around 9 x 10^15, and exactly 9007199254740992.
When I run the following program, it just works:

#include <stdio.h>

int main() {

    long double d = 9007199254740992.0L, i;
    
    printf("%Lf\n", d);
    
    for(i = -3.0; i < 4.0; i++) {
        
        printf("%.Lf) %.1Lf\n", i, d+i);
    }
    
    return 0;
}

It works even with 11119007199254740992.0L, which is the same number with four 1s added at the start. But when I add one more 1, the first printf works as expected, while all the others show the same number as the first print.
So I tried to get the largest value of my long double with this program:

#include <stdio.h>
#include <math.h>

int main() {

    long double d = 11119007199254740992.0L, i;
    
    for(i = 0.0L; d+i == d+i-1.0; i++) {
        
        if( !fmodl(i, 10000.0L) ) printf("%Lf\n", i);
    }
    
    printf("%.Lf\n", i);
    
    return 0;
}

But it prints 0.
(Edit: I just realized that I needed the condition != in the for)

In the same answer, they say that the largest possible value of a double is DBL_MAX, or approximately 1.8 x 10^308.
I have no idea what that means, but if I run

printf("%e\n", LDBL_MAX);

I get a different value every time, always around 6.9 x 10^(-310).
(Edit: I should have used %Le, getting as output a value around 1.19 x 10^4932)
I took LDBL_MAX from here.

I also tried this one

printf("%d\n", LDBL_MAX_10_EXP);

That gives the value 4932 (which I also found in this C++ question).

Since we have 16 bytes for a long double, even if all of them were used for the integer part of the type, we would be able to store numbers up to 2^128, which is around 3.4 x 10^38. So I don't get what 308, -310 and 4932 are supposed to mean.

Can someone tell me how I can find out what the largest integer is that I can store as a long double?

Sheik Yerbouti
  • 1
    That depends on whether the integer has any `2` factors, which will be taken up by the exponent. The simple answer is the number of specified significand bits (plus 1) for loss-less storage of an integer value. The exponent dictates the range of all values that can be approximately represented. – Weather Vane Jan 15 '21 at 16:48
  • 2
    In `printf("%e\n", LDBL_MAX);` did you mean to use `%Le`? – Weather Vane Jan 15 '21 at 16:59
  • @WeatherVane oh you are right! It should be `%Le`. The output is now around 1.19 x 10^4932. But again, 4932 is too big for 16 bytes, what does it mean? – Sheik Yerbouti Jan 15 '21 at 17:04
  • 1
    There are several ways to interpret the question, all of which yield different results. For instance, *all* of the largest values that a `long double` can store are integers, and in particular, `LDBL_MAX` is an integer. But there is no built-in integer data type that can represent that value in most implementations. – John Bollinger Jan 15 '21 at 17:06
  • 1
    The `4932` isn't *directly* related to the 16 bytes of the floating point type, but `10^4932` comes from the number of bits in the exponent part. – Weather Vane Jan 15 '21 at 17:06
  • 1
    Perhaps instead you are looking for the largest value representable as a (say) `unsigned long long int` that is also exactly representable as a `long double`. Or perhaps you are looking for the maximum of the range of contiguous integer values, starting at 0, that can be represented by a `long double`. Again, all of these are different. You'll need to tell us which you actually want. – John Bollinger Jan 15 '21 at 17:10
  • Please see [Quadruple-precision floating-point format](https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format) – Weather Vane Jan 15 '21 at 17:13
  • What I was looking for was to use integers larger than those I can have with `long long`, without recurring to arrays or bigint libraries. Since my `long long` is 8 bytes while my `long double` is 16, I thought it could've been possible by just switching type. I don't need to use any decimal number, just the integer part of the `long double` – Sheik Yerbouti Jan 15 '21 at 17:16
  • `LDBL_MAX` should be printed with `printf("%Le\n", LDBL_MAX);`, not `printf("%e\n", LDBL_MAX);`. If your compiler is not warning you about that, enable more warning options. If it still does not warn you about it, do not use that compiler as a tool for learning about C. – Eric Postpischil Jan 15 '21 at 17:16
  • @EricPostpischil I am compiling from the phone, it is a good enough tool xD – Sheik Yerbouti Jan 15 '21 at 17:17
  • Note that 4932 ≈ 2^14 * log 2. There are 15 bits of exponent (one for its sign). – Weather Vane Jan 15 '21 at 17:18
  • 1
    You asked the wrong question. In any floating-point format, all values represented above b^p are integers, where b is the base used in the format and p is the precision (the number of digits in the significand). This is because multiplying a significand (d.ddd…ddd) by b^p effectively shifts the last digit to be left of the decimal point, and therefore there are no digits representing a fractional part, so the value represented must be an integer. What you actually meant to ask is what is the end of the consecutively representable integers—where does the first non-representable integer occur. – Eric Postpischil Jan 15 '21 at 17:20
  • 1
    The first non-representable integer is b^p+1. For IEEE-754 binary64, b is 2 and p is 53, and that is why you heard about 2^53. In binary64, all integers in [−2^53, +2^53] are representable, and 2^53+1 is not representable. Intel’s 80-bit floating-point format has a 64-bit significand, so its range in which all integers are representable is [−2^64, 2^64]. However, you cannot just “switch type.” Divisions will produce different results, and the compiler will complain about bit operators and the remainder operator. – Eric Postpischil Jan 15 '21 at 17:24
  • @EricPostpischil whoa thanks for the lesson. So the largest integer should be 2^64, but I can't just use `long double` instead of integers, even if the operations are between two `long doubles` – Sheik Yerbouti Jan 15 '21 at 17:28
  • 2
    "So it's not possible to achieve what I wanted to do." --> is incorrect. It depends on the floating point characteristics of `long double`. With `LDBL_MANT_DIG == 64`, code can effect `int65_t` like operations. – chux - Reinstate Monica Jan 15 '21 at 19:01
  • "I wanted to use (do operations with) integers larger than those I can have" --> perhaps best to ask a question directly to that goal rather than "how do I use a FP to achieve the (unstated) operations?" – chux - Reinstate Monica Jan 15 '21 at 19:05
  • @chux-ReinstateMonica oh thank you for the clarification, I am going to re-edit. About asking for the goal: I thought I isolated my problem to finding out what's the biggest integer I could have in `long double`. I had no idea that I couldn't use that largest integer and that what I actually needed to find out was what's the largest integer I can have and use as `long double` – Sheik Yerbouti Jan 15 '21 at 19:13
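
The points made in the comments above can be checked with a short sketch. It assumes FLT_RADIX == 2 (asserted in the code) and default round-to-nearest arithmetic. The helper names consecutive_integer_limit, next_integer_is_lost and decimal_exponent_estimate are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: b^p with b == 2, i.e. 2^LDBL_MANT_DIG.
   Every integer from 0 up to this value is exactly representable. */
long double consecutive_integer_limit(void) {
    assert(FLT_RADIX == 2);  /* assumption stated in the text above */
    return ldexpl(1.0L, LDBL_MANT_DIG);
}

/* Returns 1 if adding 1 to the limit does not produce the next integer,
   which is what the comments predict for b^p + 1. */
int next_integer_is_lost(void) {
    volatile long double limit = consecutive_integer_limit();
    volatile long double next = limit + 1.0L;  /* rounded to a representable value */
    return next - limit != 1.0L;
}

/* Rough decimal exponent of the largest finite long double:
   floor(LDBL_MAX_EXP * log10(2)). This is where numbers like 308
   or 4932 come from; they describe the exponent range, not the
   number of exactly representable integer digits. */
int decimal_exponent_estimate(void) {
    return (int)floorl(LDBL_MAX_EXP * log10l(2.0L));
}
```

On an implementation where LDBL_MANT_DIG is 64 and LDBL_MAX_EXP is 16384, the limit is 2^64 and the estimate works out to 4932, the same value reported by LDBL_MAX_10_EXP in the question. Depending on the toolchain, linking may need -lm.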

3 Answers

4

Inasmuch as you express in comments that you want to use long double as a substitute for long long to obtain increased range, I assume that you also require unit precision. Thus, you are asking for the largest number representable by the available number of mantissa digits (LDBL_MANT_DIG) in the radix of the floating-point representation (FLT_RADIX). In the very likely event that FLT_RADIX == 2, you can compute that value like so:

#include <float.h>
#include <math.h>

long double get_max_integer_equivalent() {
    long double max_bit = ldexpl(1, LDBL_MANT_DIG - 1);
    return max_bit + (max_bit - 1);
}

The ldexp family of functions scale floating-point values by powers of 2, analogous to what the bit-shift operators (<< and >>) do for integers, so the above is similar to

// not reliable for the purpose!
unsigned long long max_bit = 1ULL << (LDBL_MANT_DIG - 1);
return max_bit + (max_bit - 1);

Inasmuch as you suppose that your long double provides more mantissa digits than your long long has value bits, however, you must assume that bit shifting would overflow.

There are, of course, much larger values that your long double can express, all of them integers. But they do not have unit precision, and thus the behavior of your long double will diverge from the expected behavior of integers when its values are larger. For example, if long double variable d contains a larger value, then at least one of d + 1 == d and d - 1 == d will likely evaluate to true.
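
The loss of unit precision described above can be observed with a small sketch. It repeats the computation from get_max_integer_equivalent() so that it compiles on its own, and it assumes FLT_RADIX == 2 and round-to-nearest arithmetic. The names max_unit_precision and has_unit_steps are hypothetical, chosen only for this illustration.

```c
#include <float.h>
#include <math.h>

/* Same computation as get_max_integer_equivalent() above:
   2^LDBL_MANT_DIG - 1, built without overflowing the significand. */
long double max_unit_precision(void) {
    long double max_bit = ldexpl(1.0L, LDBL_MANT_DIG - 1);
    return max_bit + (max_bit - 1.0L);
}

/* Hypothetical check: returns 1 if d + 1 and d - 1 are both exactly
   one unit away from d, as they would be for a true integer type. */
int has_unit_steps(long double d) {
    volatile long double up = d + 1.0L;
    volatile long double down = d - 1.0L;
    return (up - d == 1.0L) && (d - down == 1.0L);
}
```

Under these assumptions, has_unit_steps(max_unit_precision()) should return 1, while has_unit_steps(max_unit_precision() + 1.0L) should return 0, because adding 1 to 2^LDBL_MANT_DIG rounds back to the same value.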

John Bollinger
  • I ran your function and it returns a long number that is 2^64 - 1. I will take my time to understand how the function works, but anyway I get that I can't use `long double` to manipulate integers larger than those I can manipulate with `unsigned long long`. Thank you for giving me a piece of code proving this fact. – Sheik Yerbouti Jan 15 '21 at 18:16
  • "But they do not have unit precision" --> Yet `get_max_integer_equivalent() +1.0` is representable. – chux - Reinstate Monica Jan 15 '21 at 18:59
  • Yes indeed, @chux-ReinstateMonica. It is the smallest `long double` that does not have unit precision. That is, the least in which the least-significant digit has place value greater than 1. The last sentence of this answer applies to it. And if it turns out that adding 1 to that number yields a different number, then the result will not be the next larger integer. – John Bollinger Jan 15 '21 at 19:29
  • As I see it, the `long double` in question can encode all the integers `INT65_MIN...INT65_MAX+1` or up to `UINT64_MAX+1`. – chux - Reinstate Monica Jan 15 '21 at 20:15
  • @chux-ReinstateMonica, it is a question, as is often the case, of how you want to use the data. If you only want values that you can rely upon to behave arithmetically the same as integers, as I explicitly take to be the OP's case, then `get_max_integer_equivalent() +1.0` is not such a number. This is in fact precisely why I expressed the function as I did, as opposed to using the simpler, but unreliable `return ldexpl(1, LDBL_MANT_DIG) - 1;`. – John Bollinger Jan 15 '21 at 20:24
1

You can print the maximum value on your machine using limits.h; the value is ULLONG_MAX.

In https://www.geeksforgeeks.org/climits-limits-h-cc/ is a C++ example.

The format specifier for printing unsigned long long with printf() is %llu; for printing long double it is %Lf.

printf("unsigned long long int: %llu ",(unsigned long long) ULLONG_MAX);

printf("long double: %Lf ",(long double) LDBL_MAX);

https://www.tutorialspoint.com/format-specifiers-in-c

This is also covered in Printing unsigned long long int Value Type Returns Strange Results.

ralf htp
  • 1
    And if `unsigned long long` is 128 bits, `ULLONG_MAX` simply can't fit into a 128-bit floating point value without loss of data. – Andrew Henle Jan 15 '21 at 16:56
  • Printing `ULLONG_MAX` gives me always a different value around 6.9 x 10^(-310). Like printing `LDBL_MAX`. However the two values are different between each other. – Sheik Yerbouti Jan 15 '21 at 17:00
  • `ULLONG_MAX` cannot be 6.9•10^−310 or near it because `ULLONG_MAX` is an integer and 6.9•10^−310 and values near it are not (except for zero). If you are getting something appearing to be near 6.9•10^−310 when you print `ULLONG_MAX`, then you are printing it incorrectly. – Eric Postpischil Jan 15 '21 at 17:12
  • edited answer, too long for comment – ralf htp Jan 15 '21 at 17:27
1

Assuming you mean "stored without loss of information", LDBL_MANT_DIG gives the number of bits used for the floating-point mantissa, so that's how many bits of an integer value can be stored without loss of information.*

You'd need 128-bit integers to easily determine the maximum integer value that can be held in a 128-bit float, but this will at least emit the hex value (this assumes unsigned long long is 64 bits - you can use CHAR_BIT and sizeof( unsigned long long ) to get a portable answer):

#include <stdio.h>
#include <float.h>
#include <limits.h>


int main( int argc, char **argv )
{
    int tooBig = 0;
    unsigned long long shift = LDBL_MANT_DIG;
    if ( shift >= 64 )
    {
        tooBig = 1;
        shift -= 64;
    }

    unsigned long long max = ( 1ULL << shift ) - 1ULL;

    printf( "Max integer value: 0x" );

    // don't emit an extraneous zero if LDBL_MANT_DIG is
    // exactly 64
    if ( max )
    {
        printf( "%llx", max );
    }

    if ( tooBig )
    {
        printf( "%llx", ULLONG_MAX );
    }

    printf( "\n" );

    return( 0 );
}

* - pedantically, it's the number of digits in FLT_RADIX base, but that base is almost certainly 2.
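
The portable variant mentioned above, using CHAR_BIT and sizeof( unsigned long long ), could look roughly like the following sketch. It assumes unsigned long long has no padding bits and that FLT_RADIX == 2. The names ull_width and ull_fits_in_long_double are hypothetical.

```c
#include <float.h>
#include <limits.h>

/* Hypothetical helper: width of unsigned long long in bits,
   assuming the type has no padding bits. */
unsigned ull_width(void) {
    return (unsigned)(CHAR_BIT * sizeof(unsigned long long));
}

/* Returns 1 when the long double significand has at least as many
   bits as unsigned long long, so every unsigned long long value
   converts to long double without loss of information. */
int ull_fits_in_long_double(void) {
    return FLT_RADIX == 2 && LDBL_MANT_DIG >= (int)ull_width();
}
```

With a 64-bit unsigned long long, this returns 1 when LDBL_MANT_DIG is 64 or more, and 0 when long double has only a 53-bit significand.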

Andrew Henle