Questions tagged [floating-point]

Floating point numbers are approximations of real numbers that can represent larger ranges than integers but use the same amount of memory, at the cost of lower precision. If your question is about small arithmetic errors (e.g. why does 0.2 + 0.1 equal 0.300000001?) or decimal conversion errors, please read the "info" page linked below before posting.

Many questions asked here concern small inaccuracies in floating point arithmetic. To use the example from the excerpt, 0.2 + 0.1 might result in 0.300000001 instead of the expected 0.3 (in IEEE 754 double precision, the result is 0.30000000000000004). Errors like these are caused by the way floating point numbers are represented in computers' memory.

Integers are stored as exact values of the numbers they represent. Floating point numbers are stored as a sign plus two values: a significand and an exponent. Because both fields have a fixed number of bits, only finitely many values can be represented exactly, and most real numbers have no matching significand-exponent pair (0.1, for example, has no finite binary expansion). As a result, some approximation and therefore inaccuracy is unavoidable.
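As a rough illustration of the representation described above, here is a minimal Python sketch. Python is an arbitrary choice, not something this page prescribes. It splits a double into its significand and exponent and checks whether 0.1 is stored exactly.

```python
import math
from fractions import Fraction

# Split a double into significand m and exponent e, with x == m * 2**e.
m, e = math.frexp(0.1)
reconstructed = math.ldexp(m, e)   # exact round trip back to the stored value

# Fraction(0.1) is the exact rational value the double actually holds;
# it is close to, but not equal to, the real number 1/10.
stored = Fraction(0.1)
is_exact = stored == Fraction(1, 10)

# The per-operand approximations surface in arithmetic.
total = 0.1 + 0.2

print(m, e)
print(is_exact)
print(total == 0.3, repr(total))
```

The comparison `total == 0.3` is false because each operand is already rounded before the addition happens.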

Two commonly cited introductory-level resources about floating point math are What Every Computer Scientist Should Know About Floating-Point Arithmetic and The Floating-Point Guide (floating-point-gui.de).

FAQs:

Why 0.1 does not exist in floating point

Floating Point Math at https://0.30000000000000004.com/

Related tags:

Programming languages where all numbers are double-precision (64-bit) floats:

13427 questions
8 votes, 3 answers

How do I find the largest integer less than x?

If x is 2.3, then math.floor(x) returns 2.0, the largest integer smaller than or equal to x (as a float). How would I get i, the largest integer strictly smaller than x (as an integer)? The best I came up with is: i = int(math.ceil(x)-1) Is there a…
pheon
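A minimal sketch of the `ceil(x) - 1` idea from the excerpt above, assuming Python 3, where `math.ceil` already returns an `int`. The name `largest_int_below` is hypothetical and does not come from the question.

```python
import math

def largest_int_below(x: float) -> int:
    """Return the largest integer strictly smaller than x (hypothetical helper)."""
    # ceil(x) is the smallest integer >= x, so ceil(x) - 1 is always < x,
    # including when x is itself a whole number.
    return math.ceil(x) - 1

print(largest_int_below(2.3))    # 2
print(largest_int_below(2.0))    # 1
print(largest_int_below(-1.5))   # -2
```

Infinities and NaN are not handled here: `math.ceil` raises an exception for them.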
8 votes, 3 answers

PHP: number_format rounding

Hi, I've been having a problem with numbers rounding to -0 instead of just 0. code: output: -0 expected output: 0 I've been looking for a solution but…
sa.lva.ge
8 votes, 4 answers

How do calculators work with precision?

I wonder how calculators work with precision. For example, the value of sin(M_PI) is not exactly zero when computed in double precision: #include <math.h> #include <stdio.h> int main() { double x = sin(M_PI); printf("%.20f\n", x); //…
zoul
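The effect described in the question above can be reproduced in Python as a rough sketch. It shows only the double-precision behaviour and says nothing about how calculators work internally. The tolerance `1e-12` is an arbitrary assumption.

```python
import math

# math.pi is the double nearest to pi, not pi itself, so the sine of it
# is a tiny nonzero number rather than exactly zero.
x = math.sin(math.pi)
print(f"{x:.20f}")

# Comparing with a tolerance instead of testing for exact zero.
print(x == 0.0)
print(math.isclose(x, 0.0, abs_tol=1e-12))
```

`math.isclose` needs an explicit `abs_tol` here because its default relative tolerance never matches when one side is zero.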
8 votes, 2 answers

How does Double.isNaN() work?

The Sun JDK implementation looks like this: return v != v; Can anyone explain how that works?
whiskeysierra
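The `v != v` test from the question above relies on a rule in IEEE 754: NaN compares unequal to every value, including itself. A minimal sketch in Python rather than Java, with a hypothetical helper named `is_nan`:

```python
import math

nan = float("nan")

def is_nan(v: float) -> bool:
    # NaN is the only floating point value for which v != v holds.
    return v != v

print(is_nan(nan), is_nan(1.0), is_nan(float("inf")))
print(math.isnan(nan))   # the standard library check agrees
```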
8 votes, 2 answers

Maximum float value in php

Is there a way to programmatically retrieve the maximum float value in PHP, akin to FLT_MAX or std::numeric_limits<float>::max() in C / C++? I am using something like the following: $minimumCost = MAXIMUM_FLOAT_VALUE??; foreach ( $objects as…
Alex Deem
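The pattern in the question above, seeding a running minimum with the largest possible float, can be sketched in Python as follows. This is an analogy, not an answer for PHP, and the `costs` list is made-up sample data.

```python
import math
import sys

largest_finite = sys.float_info.max   # largest finite double, about 1.8e308

# Seed the running minimum. math.inf also works and is larger than
# every finite value, so any real cost replaces it.
costs = [12.5, 3.75, 8.0]             # made-up sample data
minimum_cost = math.inf
for cost in costs:
    if cost < minimum_cost:
        minimum_cost = cost

print(largest_finite)
print(minimum_cost)
```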
8 votes, 5 answers

Floating point precision in Visual C++

Hi, I am trying to use the robust predicates for computational geometry from Jonathan Richard Shewchuk. I am not a programmer, so I am not even sure of what I am saying; I may be making some basic mistake. The point is the predicates should allow…
user240092
8 votes, 5 answers

Infinity in MSVC++

I'm using MSVC++, and I want to use the special value INFINITY in my code. What's the byte pattern or constant to use in MSVC++ for infinity? Why does 1.0f/0.0f appear to have the value 0? #include #include int main() { float…
bobobobo
8 votes, 3 answers

What does fpstrict do in Java?

I read the JVM specification for the fpstrict modifier but still don't fully understand what it means. Can anyone enlighten me?
Yuval Adam
8 votes, 3 answers

Are floating point operations in Delphi deterministic?

Are floating point operations in Delphi deterministic? I.e., will I get the same result from an identical floating point mathematical operation on the same executable compiled with Delphi Win32 compiler as I would with the Win64 compiler, or the OS X…
LaKraven
8 votes, 3 answers

__builtin_round is not a constant expression

In G++, various builtin math functions are constexpr under certain conditions. For example, the following compiles: static constexpr double A = __builtin_sqrt(16.0); static constexpr double B = __builtin_pow(A, 2.0); They are not always constexpr…
Ambroz Bizjak
8 votes, 3 answers

MySQL "greater than" condition sometimes returns row with equal value

I'm running into a baffling issue with a basic MySQL query. This is my table: id | rating 1 | 1317.17 2 | 1280.59 3 | 995.12 4 | 973.88 Now, I'm attempting to find all rows where the rating column is larger than a certain value. If I try the…
8 votes, 6 answers

Convert float to string without sprintf()

I'm coding for a microcontroller-based application and I need to convert a float to a character string, but I do not need the heavy overhead associated with sprintf(). Is there any elegant way to do this? I don't need too much. I only need 2 digits…
audiFanatic
8 votes, 4 answers

Hex Representation of Floats in Haskell

I want to convert a Haskell Float to a String that contains the 32-bit hexadecimal representation of the float in standard IEEE format. I can't seem to find a package that will do this for me. Does anybody know of one? I've noticed that GHC.Float…
Jeremy
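The conversion asked about above, from a float to the hex digits of its 32-bit IEEE representation, can be sketched in Python rather than Haskell using the standard `struct` module. The helper name `float32_hex` is hypothetical.

```python
import struct

def float32_hex(x: float) -> str:
    """Hypothetical helper: hex digits of x rounded to IEEE 754 binary32."""
    # '>f' packs x as a big-endian 4-byte float; bytes.hex() renders the bits.
    return struct.pack(">f", x).hex()

print(float32_hex(1.0))    # 3f800000
print(float32_hex(-2.0))   # c0000000
print(float32_hex(0.1))    # 3dcccccd
```

Python floats are doubles, so packing with `'f'` first rounds the value to single precision.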
8 votes, 2 answers

PostgreSQL round(v numeric, s int)

Which method does Postgres round(v numeric, s int) use? Round half up; round half down; round half away from zero; round half towards zero; round half to even; round half to odd. I'm looking for documentation reference.
mpapec
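To make a few of the tie-breaking rules listed in the question above concrete, here is a small sketch using Python's `decimal` module. It illustrates the rules only and makes no claim about which one PostgreSQL uses. Note that Python's `ROUND_HALF_UP` breaks ties away from zero and `ROUND_HALF_DOWN` breaks them towards zero. The helper `round_with` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_DOWN, ROUND_HALF_EVEN

def round_with(value: str, mode: str) -> Decimal:
    # Quantize to a whole number using the requested tie-breaking rule.
    return Decimal(value).quantize(Decimal("1"), rounding=mode)

# Columns: input, ties away from zero, ties towards zero, ties to even.
for v in ("0.5", "1.5", "2.5", "-2.5"):
    print(v,
          round_with(v, ROUND_HALF_UP),
          round_with(v, ROUND_HALF_DOWN),
          round_with(v, ROUND_HALF_EVEN))
```

The inputs are passed as strings so that the decimal values are exact and the ties are genuine.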
8 votes, 4 answers

Conditional tests in primality by trial division

My question is about the conditional test in trial division. There seems to be some debate on what conditional test to employ. Let's look at the code for this from RosettaCode. int is_prime(unsigned int n) { unsigned int p; if (!(n & 1)…
Z boson