Anything related to double-precision floating-point arithmetic and data types. Often used in reference to the IEEE 754 double-precision floating-point representation.
Questions tagged [double-precision]
243 questions
5 votes
6 answers
Best base type to deal with linear algebra
I'm writing a small and inadequate linear algebra library in C++ for a project (I'm sorry). I'm implementing matrices and operations using double-precision numbers. Am I doing it right? Should I implement a template class instead? Is there a more…
tunnuz
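One way to keep both options open is a class template whose scalar type defaults to `double`. The following is only a minimal sketch: the names `Matrix` and `multiply` are hypothetical and are not taken from the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type; Scalar defaults to double.
template <typename Scalar = double>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, Scalar{}) {}

    Scalar& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    const Scalar& operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    std::vector<Scalar> data_;  // row-major storage, zero-initialized
};

// Naive triple-loop product; assumes a.cols() == b.rows().
template <typename Scalar>
Matrix<Scalar> multiply(const Matrix<Scalar>& a, const Matrix<Scalar>& b) {
    Matrix<Scalar> out(a.rows(), b.cols());
    for (std::size_t i = 0; i < a.rows(); ++i)
        for (std::size_t k = 0; k < a.cols(); ++k)
            for (std::size_t j = 0; j < b.cols(); ++j)
                out(i, j) += a(i, k) * b(k, j);
    return out;
}
```

With this shape, `Matrix<>` gives the double-precision version and `Matrix<float>` reuses the same code for single precision.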
5 votes
2 answers
Does Fortran have inherent limitations on numerical accuracy compared to other languages?
While working on a simple programming exercise, I produced a while loop (DO loop in Fortran) that was meant to exit when a real variable had reached a precise value.
I noticed that due to the precision being used, the equality was never met and the…
EMiller
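The effect described in this excerpt is not specific to Fortran. A small C++ sketch (the helper names `accumulate_steps` and `nearly_equal` are hypothetical) shows a running sum that never hits its target exactly, and a tolerance-based comparison as one possible exit test:

```cpp
#include <cmath>

// Adds `step` to zero `count` times, the way a loop variable would be advanced.
double accumulate_steps(double step, int count) {
    double x = 0.0;
    for (int i = 0; i < count; ++i) x += step;
    return x;
}

// Absolute-tolerance comparison; `tol` is chosen by the caller.
bool nearly_equal(double a, double b, double tol) {
    return std::fabs(a - b) <= tol;
}
```

Adding 0.1 ten times in IEEE 754 double precision does not produce exactly 1.0, so a loop that waits for `x == 1.0` would not terminate, whereas `nearly_equal(x, 1.0, 1e-12)` would.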
5 votes
1 answer
Confusion about kinds in FORTRAN
I have been in the process of writing a FORTRAN code for numerical simulations of an applied physics problem for more than two years and I've tried to follow the conventions described in Fortran Best Practices.
More specifically, I defined a…
Toon
5 votes
4 answers
new BigDecimal(double) vs new BigDecimal(String)
Constructing a BigDecimal from a double and constructing one from a String seem to give different results.
BigDecimal a = new BigDecimal(0.333333333);
BigDecimal b = new BigDecimal(0.666666666);
BigDecimal c = new…
dardeshna
5 votes
2 answers
Is integer multiplication implemented using double precision floating point exact up until 2^53?
I ask because I am computing matrix multiplications where all the matrix values are integers.
I'd like to use LAPACK so that I get fast code that is correct. Will two large integers (whose product is less than 2^53), stored as doubles, when…
Steven Lu
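The 2^53 bound in this question can be checked directly for a single product. This is a sketch only: `product_is_exact` is a hypothetical helper, and it says nothing about how LAPACK orders its operations.

```cpp
#include <cstdint>

// True when multiplying a and b as doubles gives the exact integer product.
// Assumes a * b does not overflow std::int64_t.
bool product_is_exact(std::int64_t a, std::int64_t b) {
    const std::int64_t exact = a * b;
    const double as_double = static_cast<double>(a) * static_cast<double>(b);
    return static_cast<std::int64_t>(as_double) == exact;
}
```

For example, 94906265 squared is below 2^53 and comes out exact, while 94906267 squared is an odd number above 2^53 and cannot be represented. In a matrix product the accumulated sums also have to stay within that range.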
4 votes
1 answer
Output precision is higher than double precision
I am printing some data from a C++ program to be processed/visualized by ParaView, but I am having a problem with floating point numbers. ParaView supports both Float32 and Float64 data types. Float64 is equivalent to double with the typical limits …
iluvatar
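When doubles are written as text for another tool to read, a common approach is to print `std::numeric_limits<double>::max_digits10` significant digits (17 on IEEE 754 platforms), which is enough for the text to parse back to the same value. A minimal sketch, with `to_round_trip_string` as a hypothetical helper name:

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Formats a double with enough significant digits to recover it exactly.
std::string to_round_trip_string(double value) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
    return out.str();
}
```

The extra digits can look like noise (0.1 prints with trailing nonzero digits), but parsing the string with `std::stod` yields the original double again.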
4 votes
4 answers
C++ writing and reading doubles from a binary file
I want to perform disk I/O operations for a program that takes too much RAM.
I use matrices of doubles and think writing them to disk as bytes is the fastest way (I need to preserve the double precision).
How can I do this portably?
I found this…
Andy
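A raw byte dump of a `double` array can be sketched with standard streams as below. The function names are hypothetical, and the sketch assumes the file is read back on a machine with the same byte order and the same `double` format as the one that wrote it, so it is not portable across differing platforms.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Writes the raw bytes of `values` to `path`; returns false on I/O failure.
bool write_doubles(const std::string& path, const std::vector<double>& values) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(double)));
    return static_cast<bool>(out);
}

// Reads up to `count` doubles from `path`; the result is shorter if the file is.
std::vector<double> read_doubles(const std::string& path, std::size_t count) {
    std::vector<double> values(count);
    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(count * sizeof(double)));
    values.resize(static_cast<std::size_t>(in.gcount()) / sizeof(double));
    return values;
}
```

Because the bytes are copied unchanged, every bit of each double is preserved on a same-platform round trip.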
4 votes
2 answers
What does the "double" do in ceil(double)?
I have a number (let's say, 34), and I want to find its next multiple of ten. I can do this by:
1. Dividing the number by 10
2. Rounding it up to a whole number
3. Multiplying by 10.
After a bit of research, I discovered that this is the code for that in…
Ric Levy
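The three steps from this excerpt can be sketched in C++ as follows (the question's own code is elided, so `next_multiple_of_ten` is a hypothetical name). `std::ceil` operates on a floating-point argument, which is why the division has to happen in `double` rather than in integer arithmetic:

```cpp
#include <cmath>

// Rounds n up to the next multiple of ten (multiples of ten are returned unchanged).
int next_multiple_of_ten(int n) {
    // Dividing by 10.0 forces floating-point division; n / 10 would truncate first.
    return static_cast<int>(std::ceil(n / 10.0)) * 10;
}
```

With integer division, 34 / 10 would already be 3 before `ceil` ever saw it, and the result would be 30 instead of 40.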
4 votes
2 answers
MATLAB: Converting a uint32 (4-byte) value to the corresponding IEEE single-precision floating-point form
In MATLAB (R2009b) I have a uint32 variable containing the value 2147484101.
This number (its 4 bytes) has been extracted from a digital machine-vision camera in a grabbing process. As I understand it, it holds the single-precision form…
Ole Thomsen Buus
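The same reinterpretation (keeping the 32 bits and changing only the type) can be sketched in C++ by copying the bytes. `bits_to_float` is a hypothetical helper, and the sketch assumes `float` is a 32-bit IEEE 754 single.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets a 32-bit pattern as a single-precision float without converting the value.
float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    float value;
    std::memcpy(&value, &bits, sizeof value);  // byte copy, no numeric conversion
    return value;
}
```

The value in the question, 2147484101, is 0x800001C5, so its top bit (the sign bit of a single) is set. A plain numeric cast would instead produce roughly 2.1e9, which is a different operation.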
4 votes
1 answer
NSDate and double precision problem
Here is the code
NSDate* d = [NSDate dateWithTimeIntervalSince1970:32.4560];
double ti = [d timeIntervalSince1970];
NSLog(@"Interval: %f %f %f %f",ti,32.4560,ti*1000.0,32.4560*1000.0);
the output is
Interval: 32.456000 32.456000 32455.999970…
teerapap
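One plausible source of the small discrepancy, assuming the date is stored internally as a double offset from 1 January 2001 (an assumption, not something the excerpt states), is that shifting a small interval by a large constant and back discards low-order bits. A C++ sketch with hypothetical names:

```cpp
// Seconds between 1970-01-01 and 2001-01-01 (hypothetical constant name).
constexpr double kSecondsFrom1970To2001 = 978307200.0;

// Shifts an interval to the 2001 epoch and back, as an internal store/load might.
double round_trip_through_reference_date(double since_1970) {
    // The shifted value has a large magnitude, so its spacing between doubles is coarser.
    const double since_2001 = since_1970 - kSecondsFrom1970To2001;
    return since_2001 + kSecondsFrom1970To2001;
}
```

Passing 32.4560 through this function returns a value that is very close to, but not equal to, the input, which is the kind of error that becomes visible after multiplying by 1000.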
4 votes
2 answers
Can std::uniform_real_distribution(0,1) return a value greater than 0.99999999999999994?
From the C++11 header <random>, I was wondering if a std::uniform_real_distribution object can spit out a double that's greater than 0.99999999999999994? If so, multiplying this value by 2 would equal 2.
Example:
std::default_random_engine…
starpax
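The arithmetic side of this question can be checked without any random engine. The sketch below (`largest_below_one` is a hypothetical helper) only looks at the largest double below 1.0 and makes no claim about what a particular `std::uniform_real_distribution` implementation returns.

```cpp
#include <cmath>

// Largest double strictly less than 1.0.
double largest_below_one() {
    return std::nextafter(1.0, 0.0);
}
```

On IEEE 754 doubles the literal 0.99999999999999994 rounds to this same value, and doubling it is exact, so the product stays strictly below 2.0.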
4 votes
1 answer
Unexpected error in Julia set rendering
I am playing with Mandelbrot and Julia sets and I encountered an interesting problem. The Mandelbrot set can be rendered in double precision up to zooms of around 2^56 at any location. However, the Julia set sometimes produces artifacts much sooner like…
NightElfik
4 votes
1 answer
GLSL Double Precision Angle, Trig and Exponential Functions Workaround
In GLSL there's rudimentary support for double precision variables and operations which can be found here. However, they also mention "Double-precision versions of angle, trigonometry, and exponential functions are not supported."
Is there a…
Benjamin Cecchetto
4 votes
1 answer
GCC What's the right inline assembly constraint to operate with ARM VFP instructions?
I want to load the value of a double precision register (d8) into a C variable on an ARM platform with a toolchain (gcc-4.6) that comes with the Google NDKv8b. My ARM machine is a Samsung Galaxy S2 (it has VFPv3 and NEON). The GCC documentation says…
Juan Gómez
4 votes
2 answers
How to know if a double string is round-trip safe?
I have a text representation of a double and want to know if it's safe to round-trip it to double and back. How do I know this if I also want to accept any number style in the input? Or how do I know if any precision is lost when a…
Andreas Zita