44

I browsed through the recently released Doom 3 BFG source code, when I came upon something that does not appear to make any sense. Doom 3 wraps mathematical functions in the idMath class. Some of the functions just foward to the corresponding functions from math.h, but some are reimplementations (e.g. idMath::exp16()) that I assume have a higher performance than their math.h counterparts (maybe at the expense of precision).

What baffles me, however, is the way they have implemented the float idMath::Sqrt(float x) function:

ID_INLINE float idMath::InvSqrt( float x ) {
     return ( x > FLT_SMALLEST_NON_DENORMAL ) ? sqrtf( 1.0f / x ) : INFINITY;
}

ID_INLINE float idMath::Sqrt( float x ) {
     return ( x >= 0.0f ) ? x * InvSqrt( x ) : 0.0f;
}

This appears to perform two unnecessary floating point operations: First a division and then a multiplication.

It is interesting to note that the original Doom 3 source code also implemented the square root function in this way, but the inverse square root uses the fast inverse square root algorithm.

ID_INLINE float idMath::InvSqrt( float x ) {

    dword a = ((union _flint*)(&x))->i;
    union _flint seed;

    assert( initialized );

    double y = x * 0.5f;
    seed.i = (( ( (3*EXP_BIAS-1) - ( (a >> EXP_POS) & 0xFF) ) >> 1)<<EXP_POS) | iSqrt[(a >> (EXP_POS-LOOKUP_BITS)) & LOOKUP_MASK];
    double r = seed.f;
    r = r * ( 1.5f - r * r * y );
    r = r * ( 1.5f - r * r * y );
    return (float) r;
}


ID_INLINE float idMath::Sqrt( float x ) {
    return x * InvSqrt( x );
}

Do you see any advantage in calculating Sqrt(x) as x * InvSqrt(x) if InvSqrt(x) internally just calls math.h's fsqrt(1.f/x)? Am I maybe missing something important about denormalized floating point numbers here or is this just sloppiness on id software's part?

Puppy
  • 138,897
  • 33
  • 232
  • 446
Robert Rüger
  • 767
  • 8
  • 21
  • I suppose the whole advantage should have been in leveraging the famous fast inverse square root implementation. – Roman R. Nov 28 '12 at 12:30
  • 4
    Their way gives a different and less accurate result for denorms (since denorm * infinity is infinity, but the actual square root of a denorm value is a small value). Maybe they have code elsewhere that relies on this, requiring the new `Sqrt` to be backward compatible with the old, but they could still have dealt with that by a special case. – Steve Jessop Nov 28 '12 at 12:31
  • 6
    And when the fast invSqrt became obsolete, nobody bothered to update the normal square root function... – Mysticial Nov 28 '12 at 12:32
  • 1
    may be standard `sqrtf` is slow with denormalized floats? –  Nov 28 '12 at 12:42
  • @aleguna: maybe it is, but they could still check for denormalization within `Sqrt()` itself and return whatever they see fit. – Robert Rüger Nov 28 '12 at 12:57
  • 1
    I'd be curious to see what the x86 assembly is for both cases. – Nathan Ernst Nov 28 '12 at 22:42

4 Answers4

8

I can see two reasons for doing it this way: firstly, the "fast invSqrt" method (really Newton Raphson) is now the method used in a lot of hardware, so this approach leaves open the possibility of taking advantage of such hardware (and doing potentially four or more such operations at once). This article discusses it a little bit:

How slow (how many cycles) is calculating a square root?

The second reason is for compatibility. If you change the code path for calculating square roots, you may get different results (especially for zeroes, NaNs, etc.), and lose compatibility with code that depended on the old system.

Community
  • 1
  • 1
Benny Smith
  • 249
  • 1
  • 2
  • 8
  • I'm not sure I understand the first part of your answer. Do you say that todays hardware might be doing the fast inverse square root algorithm inside `InvSqrt()` even though it is not written out explicitly? `InvSqrt()` just calls `math.h`'s `sqrt()` internally, so I guess that would at least require support in the used C standard library implementation ... – Robert Rüger Dec 04 '12 at 21:54
  • 1
    I'm saying you could write an implementation of InvSqrt that uses the hardware on platforms where it's supported, and on other platforms you just use your default 1.0/sqrt. It would look something like this: #if defined(X86) // SSE2 implementation #elif defined(PSP) // Solution using their SIMD instructions #else // The original, default implementation #endif Sorry about the formatting, I can't figure out how to put code in the comments. – Benny Smith Dec 04 '12 at 22:08
  • Ah, I see! From the code they put in github it doesn't look like they did that though. – Robert Rüger Dec 04 '12 at 23:37
5

As far as I know, the InvSqrt is used to compute colors in the sense that color depends on the angle from which light bounces off a surface, which gives you some function using the inverse of the square root.

In their case, they don't need huge precision when computing these numbers, so the engineers behind Doom 3's code (originally from Quake III) came up with a very very very fast method of computing an approximation for InvSqrt using only several Newton-Raphson's iterations.

This is why they use InvSqrt in all their code, instead of using built-in (slower) functions. I guess the use of x * InvSqrt(x) is there to avoid multiplying work by two (by having two very efficient functions, one for InvSqrt and another for Sqrt).

You should read this article, it might shed some light on this issue.

alestanis
  • 20,205
  • 4
  • 45
  • 66
  • Yes! I agree this is most likely the reason why the original Doom 3 and Quake 3 codes calculate the square root like that. However, the question that remains is why the newer BFG code calculates `Sqrt(x)` as `x * InvSqrt(x)` _even though_ `InvSqrt()` is not specially optimized. – Robert Rüger Dec 04 '12 at 21:29
3

When code has been modified by multiple people, it becomes hard to answer questions about why it has its current form, especially without revision history.

However, given a third of a century of programming experience, this code fits the pattern others have mentioned: At one time, InvSqrt was fast, and it made sense to use it to compute the square root. Then InvSqrt changed, and nobody updated Sqrt.

Eric Postpischil
  • 141,624
  • 10
  • 138
  • 247
  • I agree, but I think then it's a little strange that the `Sqrt()` function was changed slightly for the BFG code. Someone must have looked at it which that made me suspicious as to whether there was still some magic going on here ... – Robert Rüger Dec 04 '12 at 21:45
2

It is also possible that they came across a relatively naive version of sqrtf which was notably slower for bigger numbers.

YvesgereY
  • 3,223
  • 1
  • 16
  • 18