Questions tagged [fast-math]

The `-ffast-math` (or similarly-named) compiler option, which trades precision and adherence to the IEEE 754 floating-point standard for execution speed

Most compilers have an option for turning on floating-point optimizations that sacrifice computational precision and/or adherence to the costlier corner cases of the IEEE 754 floating-point standard, in favor of better execution speed.

  • For gcc and clang, this option is named -ffast-math (and there are sub-options)
  • For nvcc, the name is --use_fast_math
  • For OpenCL compilation, the name is -cl-fast-relaxed-math

For more information: What does gcc's ffast-math actually do?

37 questions
2194
votes
12 answers

Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?

I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optimize the call pow(a,2) by compiling it into a*a, but the call pow(a,6) is not optimized and will actually call the library function pow,…
xis
  • 22,592
  • 8
  • 39
  • 55
167
votes
2 answers

What does gcc's ffast-math actually do?

I understand gcc's -ffast-math flag can greatly increase speed for float ops at the cost of going outside IEEE standards, but I can't seem to find information on what is really happening when it's on. Can anyone please explain some of the details and maybe…
Ponml
  • 2,437
  • 4
  • 17
  • 17
29
votes
6 answers

Negative NaN is not a NaN?

While writing some test cases, some of the tests check for a NaN result. I tried using std::isnan but the assert fails: Assertion `std::isnan(x)' failed. After printing the value of x, it turned out it's negative NaN (-nan) which is…
LiraNuna
  • 57,153
  • 14
  • 112
  • 136
16
votes
2 answers

Why does GCC or Clang not optimise reciprocal to 1 instruction when using fast-math

Does anyone know why GCC/Clang will not optimise function test1 in the code sample below to simply use the RCPPS instruction when using the fast-math option? Is there another compiler flag that would generate this code? typedef float float4…
Chris_F
  • 3,780
  • 4
  • 27
  • 44
16
votes
2 answers

How do I compile with "ffast-math"?

I'm trying to benchmark some Rust code, but I can't figure out how to set the "ffast-math" option. % rustc -C opt-level=3 -C llvm-args='-enable-unsafe-fp-math' unrolled.rs rustc: Unknown command line argument '-enable-unsafe-fp-math'. Try: 'rustc…
yong
  • 3,333
  • 12
  • 26
12
votes
1 answer

Strict aliasing, -ffast-math and SSE

Consider the following program: #include #include #include #include using namespace std; int main() { // 4 float32s. __m128 nans; // Set them all to 0xffffffff which should be NaN. …
Timmmm
  • 68,359
  • 51
  • 283
  • 367
11
votes
1 answer

gcc, simd intrinsics and fast-math concepts

Hi all :) I'm trying to get the hang of a few concepts regarding floating point, SIMD/math intrinsics and the fast-math flag for gcc. More specifically, I'm using MinGW with gcc v4.5.0 on a x86 cpu. I've searched around for a while now, and that's…
rocket441
  • 257
  • 3
  • 7
11
votes
2 answers

Is there a -ffast-math flag equivalent for the Visual Studio C++ compiler

I'm working with the default C++ compiler (I guess it's called the "Visual Studio C++ compiler") that comes with Visual Studio 2013 with the flag /Ox (Full Optimization). Due to floating point side effects, I must disable the -ffast-math flag when…
Matthias
  • 3,833
  • 11
  • 36
  • 74
10
votes
1 answer

Good sentinel value for double if prefer to use -ffast-math

Since the gcc option -ffast-math effectively disables NaN and -/+inf, I'm looking for maybe the next best option for representing NaN in my performance-critical math code. Ideally the sentinel value if operated on (add, mul, div, sub, etc..) would…
stgtscc
  • 812
  • 1
  • 6
  • 18
9
votes
1 answer

std::isinf does not work with -ffast-math. how to check for infinity

Sample code: #include #include #include using namespace std; static bool my_isnan(double val) { union { double f; uint64_t x; } u = { val }; return (u.x << 1) > 0x7ff0000000000000u; } int main() { cout <<…
Albert
  • 57,395
  • 54
  • 209
  • 347
8
votes
3 answers

Auto vectorization on double and ffast-math

Why is it mandatory to use -ffast-math with g++ to achieve the vectorization of loops using doubles? I don't like -ffast-math because I don't want to lose precision.
Ruggero Turra
  • 14,523
  • 14
  • 72
  • 123
7
votes
2 answers

Can I make my compiler use fast-math on a per-function basis?

Suppose I have template void foo(float* data, size_t length); and I want to compile one instantiation with -ffast-math (--use-fast-math for nvcc), and the other instantiation without it. This can be achieved by instantiating…
einpoklum
  • 86,754
  • 39
  • 223
  • 453
7
votes
1 answer

gcc -Ofast - complete list of limitations

I'm using the -Ofast gcc option in my program because of latency requirements. I wrote a simple test program: #include #include static double quiet_NaN = std::numeric_limits::quiet_NaN(); int main() { double newValue = 130000;…
Oleg Vazhnev
  • 21,122
  • 47
  • 154
  • 286
6
votes
1 answer

Do denormal flags like Denormals-Are-Zero (DAZ) affect comparisons for equality?

If I have 2 denormal floating point numbers with different bit patterns and compare them for equality, can the result be affected by the Denormals-Are-Zero flag, the Flush-to-Zero flag, or other flags on commonly used processors? Or do these flags…
Zachary Burns
  • 291
  • 1
  • 7
6
votes
2 answers

Does any floating point-intensive code produce bit-exact results in any x86-based architecture?

I would like to know if any code in C or C++ using floating point arithmetic would produce bit-exact results on any x86-based architecture, regardless of the complexity of the code. To my knowledge, any x86 architecture since the Intel 8087 uses a…
Samuel Navarro Lou
  • 1,048
  • 6
  • 15