
I'm trying to do vertically synced rendering so that exactly one frame is rendered per vertical sync, without skipping or repeating any frames. I need this to work on Windows 7 and (in the future) Windows 8.

It would basically consist of drawing a sequence of quads that fit the screen, so that each pixel of the original images maps 1:1 to a pixel on the screen. The rendering part is not a problem, either with OpenGL or DirectX. The problem is syncing correctly.

I previously tried using OpenGL, with the WGL_EXT_swap_control extension, by drawing and then calling

SwapBuffers(g_hDC);
glFinish();

I tried all combinations and permutations of these two calls, along with glFlush(), and none was reliable.

I then tried with Direct3D 10, by drawing and then calling

g_pSwapChain->Present(1, 0);
pOutput->WaitForVBlank();

where g_pSwapChain is an IDXGISwapChain* and pOutput is the IDXGIOutput* associated with that swap chain.

Both versions, OpenGL and Direct3D, give the same result: the first sequence of, say, 60 frames doesn't last as long as it should (instead of about 1000 ms at 60 Hz, it lasts something like 1030 or 1050 ms). The following sequences seem to work fine (about 1000.40 ms), but every now and then one seems to skip a frame. I do the measuring with QueryPerformanceCounter.

On Direct3D, trying a loop of just the WaitForVBlank, the duration of 1000 iterations is consistently 1000.40 with little variation.

So the trouble here is not knowing exactly when each of the called functions returns, and whether the swap is done during the vertical sync (not earlier, to avoid tearing).

Ideally (if I'm not mistaken), to achieve what I want I would perform one render, wait until the sync starts, swap during the sync, then wait until the sync is done. How can I do that with OpenGL or DirectX?

Edit: A test loop of just WaitForVBlank 60x consistently takes from 1000.30 ms to 1000.50 ms. The same loop with Present(1,0) before WaitForVBlank, with nothing else and no rendering, takes the same time, but sometimes it fails and takes 1017 ms, as if a frame had been repeated. There's no rendering, so something is wrong here.
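The timing check described above can be made mechanical. The following is a minimal sketch, not the asker's code: it assumes the intervals between consecutive swap returns have already been measured in milliseconds (for example with QueryPerformanceCounter), and the helper names `refreshesSpanned` and `countRepeatedFrames` are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: how many refresh periods an interval spans, rounded
// to the nearest whole period and never less than 1.
inline int refreshesSpanned(double intervalMs, double refreshPeriodMs) {
    long n = std::lround(intervalMs / refreshPeriodMs);
    return n < 1 ? 1 : static_cast<int>(n);
}

// Counts the refreshes on which no new frame was shown, given the measured
// time between consecutive swap returns (assumed to be in milliseconds).
inline int countRepeatedFrames(const std::vector<double>& intervalsMs,
                               double refreshPeriodMs) {
    int repeated = 0;
    for (double ms : intervalsMs) {
        // An interval spanning k periods means k - 1 refreshes were repeats.
        repeated += refreshesSpanned(ms, refreshPeriodMs) - 1;
    }
    return repeated;
}
```

With a 60 Hz period of 1000/60 ms, an interval of about 33.3 ms counts as one repeated frame, and a 60-frame total of 1017 ms spans 61 periods, which matches the "as if a frame had been repeated" observation.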

slazaro

6 Answers

3

I have the same problem in DX11. I want to guarantee that my frame rendering code takes an exact multiple of the monitor's refresh interval, to avoid multi-buffering latency.

Just calling pSwapChain->Present(1, 0) is not sufficient. That prevents tearing in fullscreen mode, but it does not wait for the vblank. The Present call is asynchronous and returns right away if there are frame buffers remaining to be filled. So if your render code produces a new frame very quickly (say 10 ms to render everything) and the user has set the driver's "Maximum pre-rendered frames" to 4, you will be rendering four frames ahead of what the user sees. That means 4 × 16.7 ms ≈ 67 ms of latency between mouse action and screen response, which is unacceptable. Note that the driver's setting wins: even if your app asked for a maximum frame latency of 1 (IDXGIDevice1::SetMaximumFrameLatency(1)), you'll get 4 frames regardless. So the only way to guarantee no mouse lag regardless of the driver setting is for your render loop to voluntarily wait until the next vertical refresh, so that you never use those extra frame buffers.

IDXGIOutput::WaitForVBlank() is intended for this purpose. But it does not work! When I call the following

// render something in ~10 ms
pSwapChain->Present(1, 0);
pOutput->WaitForVBlank();

and measure the time it takes for the WaitForVBlank() call to return, I see it alternate between roughly 6 ms and 22 ms.

How can that happen? How could WaitForVBlank() ever take longer than 16.7 ms to complete? In DX9 we solved this problem using GetRasterStatus() to implement our own, much more accurate version of WaitForVBlank. But that call is gone in DX11.

Is there any other way to guarantee that my frame is exactly aligned with the monitor's refresh rate? Is there another way to read the current scanline like GetRasterStatus used to do?

t-skin
1

I previously tried using OpenGL, with the WGL_EXT_swap_control extension, by drawing and then calling

SwapBuffers(g_hDC);
glFinish();

That glFinish() or glFlush() is superfluous. SwapBuffers implies a glFinish.

Could it be that you set "force V-Blank / V-Sync off" in your graphics driver settings?

datenwolf
  • It is set to whatever each application chooses. My renders do more-or-less sync to the vertical blank, but the problem is that it's not consistent, even with (mostly) empty renders. – slazaro May 08 '12 at 14:45
  • @slazaro: How do you measure this? I mean with an empty render there's no indicator you could use. If you're using the system timer, then you should be aware that this will always be jittering around the V-Sync interval, never exactly hitting it – datenwolf May 08 '12 at 14:57
  • With DirectX, when I have just a loop of `WaitForVBlank`, it consistently marks the same time, with a small variation, less than 1ms. On the other hand, with a render and swapbuffers with vsync activated, each render takes approximately 16.6ms, except some that take two frames (~33.3ms). Given that I don't spend that long rendering (a 60x loop without vsync takes like 30ms), there's some unreliability, and I think it's more than can be blamed on the precision of timers. – slazaro May 08 '12 at 15:19
  • Well, 16.6ms is almost exactly 60Hz. And doubling that means that your render did not quite make the deadline, so you're blocking until the next V-Sync. So it actually does V-Sync, but your rendering takes too long (or something else eats up your time). – datenwolf May 08 '12 at 15:26
  • I added some more info as an edit. My only explanation would be that windows makes the loop skip a vsync by scheduling... – slazaro May 08 '12 at 15:33
1

We currently use DX9 and want to switch to DX11. We use GetRasterStatus() to manually sync to the screen. That goes away in DX11, but I've found that creating a DirectDraw7 object doesn't seem to disrupt DX11. So just add this to your code and you should be able to get the scanline position.

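#include <ddraw.h>  // link with ddraw.lib and dxguid.lib
IDirectDraw7* ddraw = nullptr;
// Create a DirectDraw7 object for the primary display, only to query the raster position.
HRESULT hr = DirectDrawCreateEx( NULL, reinterpret_cast<LPVOID*>(&ddraw), IID_IDirectDraw7, NULL );
DWORD scanline = 0;
// GetScanLine fails with DDERR_VERTICALBLANKINPROGRESS while the display is in vertical blank.
if( SUCCEEDED(hr) ) hr = ddraw->GetScanLine( &scanline );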
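Once a scanline query is available, a manual wait for the start of vertical blank could look roughly like the sketch below. It is deliberately decoupled from DirectDraw so it can be exercised without a display: `ScanlineReader`, `waitForVBlankStart`, and the `maxPolls` guard are hypothetical names, and the reader is assumed to return a value greater than or equal to `visibleLines` while in vertical blank (a wrapper around GetScanLine would have to map DDERR_VERTICALBLANKINPROGRESS to such a value).

```cpp
#include <cstdint>
#include <functional>

// Hypothetical scanline source. Assumption: it returns the current raster
// line, or any value >= visibleLines while the display is in vertical blank.
using ScanlineReader = std::function<std::uint32_t()>;

// Busy-waits until the raster crosses from the visible area into vertical
// blank. Returns false if maxPolls reads happen without seeing that edge.
inline bool waitForVBlankStart(const ScanlineReader& read,
                               std::uint32_t visibleLines,
                               std::uint64_t maxPolls) {
    std::uint64_t polls = 0;
    // Phase 1: if we are already inside vblank, let it finish so that we
    // catch the beginning of the next one rather than its tail.
    while (read() >= visibleLines) {
        if (++polls >= maxPolls) return false;
    }
    // Phase 2: spin while the raster is still in the visible area.
    while (read() < visibleLines) {
        if (++polls >= maxPolls) return false;
    }
    return true;
}
```

This spins on the CPU, so it trades power for precision; a real loop might sleep for most of the frame and only spin for the last few hundred lines.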
cowtung
1

On Windows 8.1 and Windows 10, you can make use of the DXGI 1.3 flag DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT. See MSDN. The sample here is for Windows 8 Store apps, but it should be adaptable to swap chains for classic Win32 windows as well.

You may find this video useful as well.

Chuck Walbourn
0

When creating a Direct3D device, set the PresentationInterval member of the D3DPRESENT_PARAMETERS structure to D3DPRESENT_INTERVAL_DEFAULT.

miloszmaki
  • Is that also available in DX10? I am creating the device with `D3D10CreateDeviceAndSwapChain`, which as far as I can tell doesn't have any Present Parameters. – slazaro May 08 '12 at 14:47
  • I'm afraid DX10 doesn't have this option. Sorry, I'm not much into DX10. – miloszmaki May 08 '12 at 19:22
  • Actually you want to set it to D3DPRESENT_INTERVAL_ONE, and D3D10 does vsync through its Present method: 'swapChain->Present(vSync, 0);' – zezba9000 May 18 '12 at 09:52
-1

If you run in kernel mode (ring 0), you can attempt to read bit 3 of the VGA Input Status Register 1 (port 03BAh or 03DAh). The information is quite old, and although it was hinted here that the bit might have changed location or been obsoleted in Windows 2000 and later, I actually doubt this. The second link has some very old source code that attempts to expose the vblank signal on old Windows versions. It no longer runs, but in theory rebuilding it with the latest Windows SDK should fix this.

The difficult part is building and registering a device driver that exposes this information reliably and then fetching it from your application.
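For reference, the bit test itself is trivial; only the privileged port read is hard. The sketch below covers just the decoding step, assuming a status byte has already been obtained from port 3DAh (or 3BAh in monochrome mode) by some driver-level mechanism that is not shown. The names `kVerticalRetraceMask` and `inVerticalRetrace` are hypothetical.

```cpp
#include <cstdint>

// Bit 3 (mask 0x08) of VGA Input Status Register 1 is the vertical retrace flag.
constexpr std::uint8_t kVerticalRetraceMask = 0x08;

// Decodes a status byte that was already read from the register.
// The port read itself requires ring-0 access and is outside this sketch.
constexpr bool inVerticalRetrace(std::uint8_t statusByte) {
    return (statusByte & kVerticalRetraceMask) != 0;
}
```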

StarShine