I use this code:

void MainLoop() {
    for (int i = 0; i < length; i++) {
        XMVector3Rotate(rays[i], orientation);
    }
}

and I get an FPS of 1900000, but when I use this version:

void MainLoop() {
    for (int i = 0; i < length; i++) {
        calculatedRays[i] = XMVector3Rotate(rays[i], orientation);
    }
}

I get an FPS of only 200. Why?

Andrew Russell
itun
  • If you really mean FPS = 1900000 (not 190.0000), that's impossible with current hardware. You have an error in your FPS computation (that would be a frame time of roughly 0.0000005 sec). – jv42 May 10 '11 at 15:02
  • I am writing a software renderer, and this is the FPS I measure with the loop above in the main render loop. It means that 1900000 frames, each including the loop calculations, run in one second. – itun May 10 '11 at 15:07
  • OK... you must be doing nearly nothing then (we're talking sub-microsecond frame time). And so how is this related to XNA? – jv42 May 10 '11 at 15:16
  • I've edited your tags and title. That's a DirectX function, not an XNA function. – Andrew Russell May 10 '11 at 15:19
  • Ah... I see where the XNA tag comes from. `XMVector3Rotate` is from the *very confusingly named* [XNA Math Library](http://msdn.microsoft.com/en-us/library/ee415574(v=VS.85).aspx). Which is part of DirectX. Not XNA. (Nice one, Microsoft.) – Andrew Russell May 10 '11 at 15:30
  • It took Microsoft years to figure out what XNA actually was haha (at first it was a build system like MSBuild for games) – MattDavey May 10 '11 at 15:33

3 Answers

When you are doing this:

XMVector3Rotate(rays[i], orientation);

I am guessing that the compiler inlines the function, sees that its result is never assigned anywhere (so the call has no observable effect), and removes the function call completely. It's very fast because it's not actually doing anything.

But then when you add in the assignment:

calculatedRays[i] = XMVector3Rotate(rays[i], orientation);

All of a sudden you're doing a bunch of memory reads and writes and various maths operations - all of which were being skipped over before.
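To make the difference concrete, here is a minimal sketch in plain C++. It does not use DirectX: `Vec3`, `rotate_z90`, `discard_results` and `rotate_and_sum` are hypothetical names made up for this illustration, and `rotate_z90` is only a stand-in for a pure function like `XMVector3Rotate`. The first loop throws its results away, so an optimizing compiler is allowed to delete it. The second folds every result into a value the caller reads, so the work has to show up in the output.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for XMVector3Rotate: a pure function, no side effects.
struct Vec3 { float x, y, z; };

inline Vec3 rotate_z90(const Vec3& v) {
    // Rotate 90 degrees about the z axis: (x, y, z) -> (-y, x, z).
    return Vec3{-v.y, v.x, v.z};
}

// Results are discarded: with optimizations on, the compiler may remove
// the call and then the empty loop as well.
void discard_results(const std::vector<Vec3>& rays) {
    for (std::size_t i = 0; i < rays.size(); ++i) {
        rotate_z90(rays[i]);
    }
}

// Results are consumed: every rotated vector contributes to the returned
// checksum, so the rotations affect an observable value.
float rotate_and_sum(const std::vector<Vec3>& rays) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < rays.size(); ++i) {
        const Vec3 r = rotate_z90(rays[i]);
        sum += r.x + r.y + r.z;
    }
    return sum;
}
```

When benchmarking, timing something like `rotate_and_sum` and printing or otherwise using the returned value gives a number that reflects real work. Timing something like `discard_results` in an optimized build may measure little more than an empty function.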

(You had tagged this XNA -- but this is a C++ function. Most C++ compilers can and will inline functions like this. The standard C# compiler cannot.)

Andrew Russell
  • +1 beat me to it - the compiler is optimizing the line away in the first scenario. And seeing as it's the only content in the for loop, the compiler probably drops that too... – MattDavey May 10 '11 at 15:26
  • If the compiler skips this line, why is the FPS so low? When I use my own code, without asm or SIMD, I get FPS = 400. – itun May 10 '11 at 15:28
  • If the OP's test program is as simple as what he posted, I suppose the compiler optimized away the entire loop, not just the loop body. – Robᵩ May 10 '11 at 15:35
  • @itun In the first case (no assignment), the compiler outputs nothing for that line. In the second case, the compiler outputs *something*. If you have your own alternative for `XMVector3Rotate` that is twice as fast, then good for you! If you're sure it is correct, then feel free to use it instead. (Benefit of using Microsoft's library: they've tested the heck out of it.) – Andrew Russell May 10 '11 at 15:37

In the first example the result of the function is being discarded right away (not being assigned). The compiler is smart enough to sense that, and omits the method call...

MattDavey

Assuming XMVector3Rotate returns an XNA Vector3 type, the assignment is a struct copy operation, which is relatively costly in performance.

When optimizing my own XNA game for the Xbox 360, I replaced many such operations with the overloads that take ref/out parameters, with a very noticeable gain in heavy loops.

EDIT: example (from memory)

Vector3 vec1 = something, vec2 = something, result;
Vector3.Add(ref vec1, ref vec2, out result);
jv42