The short and sweet:
rsqrtss is safer and, as a result, less accurate and slower.
sqrtss is faster and, as a result, less safe.
Why is rsqrtss safer?
- It doesn't use the whole XMM register.
Why is rsqrtss slower?
- Because it needs more registers to perform the same action as sqrtss.
Why does rsqrtss use a reciprocal?
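- The reciprocal of a square root can be approximated faster and with less memory than an exact square root can be computed. The catch is that rsqrtss only returns an estimate (Intel documents a relative error of at most 1.5 * 2^-12), not a correctly rounded result.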
Pico-spelenda: Lots of math.
The long and bitter:
Research
What does -ffast-math do?
-ffast-math
Enable fast-math mode. This defines the __FAST_MATH__ preprocessor
macro, and lets the compiler make aggressive, potentially-lossy
assumptions about floating-point math. These include:
- Floating-point math obeys regular algebraic rules for real numbers (e.g. + and * are associative, x/y == x * (1/y), and (a + b) * c == a * c + b * c),
- operands to floating-point operations are not equal to NaN and Inf, and
- +0 and -0 are interchangeable.
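To see why the "associative" assumption is lossy, here is a minimal sketch in C. The function names are made up for illustration, and the behaviour described assumes the file is compiled without -ffast-math, so the compiler has to keep the order of operations as written.

```c
/* Adds left to right: (a + b) + c. */
float sum_left(float a, float b, float c) {
    float t = a + b;
    return t + c;
}

/* Adds right to left: a + (b + c). */
float sum_right(float a, float b, float c) {
    float t = b + c;   /* a small c can be rounded away here if b is huge */
    return a + t;
}
```

With a = 1e20f, b = -1e20f and c = 1.0f, sum_left gives 1.0f while sum_right gives 0.0f, because 1.0f is far too small to survive being added to -1e20f. Under -ffast-math the compiler is allowed to treat the two orderings as the same thing.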
What does -fstack-protector-all do?
This answer can be found here.
Basically, it "forces the usage of stack protectors for all functions".
What is a "stack protector"?
A nice article for you.
The blissfully short, quite terribly succinct SparkNotes version is:
- A "stack protector" is used to prevent exploitation of stack overwrites.
The stack protector as implemented in gcc and clang adds an additional guard
variable to each function’s stack area.
Interesting Drawback To Note:
"Adding these checks will lead to a little runtime overhead: More stack
space is needed, but that is negligible except for really constrained
systems...Do you aim for maximum security at the cost of
performance? -fstack-protector-all is for you."
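The guard variable idea can be imitated by hand. This is only a sketch of the concept, not what gcc or clang actually generate: the struct, the GUARD_VALUE constant and the function name are all hypothetical, and a real stack protector uses a random value and aborts the program (via __stack_chk_fail) instead of returning a flag.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed guard; a real canary is random per process. */
#define GUARD_VALUE 0xC0FFEEu

/* Hypothetical stand-in for a stack frame: a buffer followed by a guard. */
struct frame {
    char buf[8];
    unsigned int guard;
};

/* Copies n bytes of src over the frame, then checks the guard.
   Returns 1 if the guard is intact, 0 if the copy overwrote it. */
int copy_and_check(const char *src, size_t n) {
    struct frame f;
    memset(&f, 0, sizeof f);
    f.guard = GUARD_VALUE;
    if (n > sizeof f) {
        n = sizeof f;              /* keep the demo itself inside the struct */
    }
    memcpy(&f, src, n);            /* n > sizeof f.buf spills into the guard */
    return f.guard == GUARD_VALUE;
}
```

Copying 4 bytes leaves the guard alone, while copying 12 bytes runs past the 8-byte buffer and clobbers it. The extra store and compare on every call is the "little runtime overhead" the quote is talking about.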
What is sqrtss?
According to @godbolt:
Computes the square root of the low single-precision floating-point value
in the second source operand and stores the single-precision floating-point
result in the destination operand. The second source operand can be an XMM
register or a 32-bit memory location. The first source and destination
operands is an XMM register.
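If you want to poke at this from C rather than assembly, the _mm_sqrt_ss intrinsic from xmmintrin.h maps to this instruction. A minimal sketch, assuming an x86 target with SSE; the wrapper function name is made up for illustration.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Takes the square root of lane 0 only and writes all four lanes to out. */
void sqrt_low_lane(const float in[4], float out[4]) {
    __m128 v = _mm_loadu_ps(in);   /* load four floats into an XMM register */
    __m128 r = _mm_sqrt_ss(v);     /* lane 0 = sqrt(lane 0); lanes 1..3 copied from v */
    _mm_storeu_ps(out, r);
}
```

Feeding it {16, 2, 3, 4} gives {4, 2, 3, 4}: only the low single-precision value is touched, which is the "low" part of the description above.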
What is a "source operand"?
A tutorial can be found here
In essence, an operand is a location of data in a computer. Imagine the simple instruction x + x = y. You need to know what 'x' is, which is the source operand, and where the result will be stored, 'y', which is the destination operand. Notice how the '+' symbol, commonly called the 'operation', can be ignored here, because it doesn't matter in this example.
What is an "XMM register"?
An explanation can be found here.
It's just a specific type of register. It's primarily used in floating-point math
(which, surprisingly enough, is the math you are trying to do).
What is rsqrtss?
Again, according to @godbolt:
Computes an approximate reciprocal of the square root of the low
single-precision floating-point value in the source operand (second operand)
stores the single-precision floating-point result in the destination operand.
The source operand can be an XMM register or a 32-bit memory location. The
destination operand is an XMM register. The three high-order doublewords of
the destination operand remain unchanged. See Figure 10-6 in the Intel® 64 and
IA-32 Architectures Software Developer’s Manual, Volume 1, for an illustration
of a scalar single-precision floating-point operation.
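The word "approximate" matters. Intel documents the relative error of the estimate as at most 1.5 * 2^-12, so code that wants more precision usually follows it with a Newton-Raphson step. Here is a minimal sketch using the _mm_rsqrt_ss intrinsic, assuming an x86 target with SSE. The function names are made up, and whether your compiler emits exactly this sequence under -ffast-math is an assumption on my part.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Raw hardware estimate of 1/sqrt(x), straight from rsqrtss. */
float rsqrt_estimate(float x) {
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
}

/* Estimate plus one Newton-Raphson step: y' = y * (1.5 - 0.5 * x * y * y). */
float rsqrt_refined(float x) {
    float y = rsqrt_estimate(x);
    return y * (1.5f - 0.5f * x * y * y);
}
```

For x = 4.0f the exact answer is 0.5. The raw estimate is only guaranteed to be close to that, while the refined value should agree with it to roughly single-precision accuracy. The extra multiplies and the subtract are also where the additional instructions and registers come from.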
What is a "doubleword"?
A simple definition.
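It is a unit of measurement of computer memory, just like 'bit' or 'byte'. However, unlike 'bit' or 'byte', it is not universal and depends on the architecture of the computer. On x86, a word is 16 bits, so a doubleword is 32 bits, and a 128-bit XMM register holds four of them.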
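On x86 a doubleword is 32 bits, which is exactly the size of one single-precision float, so "the three high-order doublewords" just means "the other three floats in the register". A small sketch to make that concrete, assuming float is a 32-bit IEEE 754 value (true on x86); the function name is made up.

```c
#include <stdint.h>
#include <string.h>

/* Returns doubleword i (0 = lowest) of four packed floats as raw bits.
   The caller must pass i in the range 0..3. */
uint32_t dword_of(const float lanes[4], int i) {
    uint32_t bits;
    memcpy(&bits, &lanes[i], sizeof bits);  /* well-defined way to reinterpret the bits */
    return bits;
}
```

For example, the doubleword holding 1.0f reads back as 0x3F800000.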
What does "Figure 10-6 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1" look like?
Disclaimer:
Most of this knowledge comes from outside sources. I literally installed clang just now to help answer your question. I'm not an expert.