I just found the following OpenGL specification for ARB_map_buffer_range.

I'm wondering if it is possible to do non-blocking map calls using this extension?

Currently, in my application I'm rendering to an FBO, which I then read back through a PBO that I map on the host:

glMapBuffer(target_, GL_READ_ONLY);  

However, the problem with this is that it blocks the rendering thread while transferring the data.

I could reduce this issue by pipelining the rendering, but latency is a big issue in my application.

My question is whether I can use glMapBufferRange with GL_MAP_UNSYNCHRONIZED_BIT and either wait for the map operation to finish on another thread, or defer the map operation on the same thread, while the rendering thread renders the next frame.

e.g.

thread 1:

map();
render_next_frame();

thread 2:

wait_for_map();

or

thread 1:

map();
while(!is_map_ready())
   do_some_rendering_for_next_frame();

What I'm unsure of is how I know when the map operation is ready; the specification only mentions "other synchronization techniques to ensure correct operation".

Any ideas?

genpfault
ronag

2 Answers


If you map a buffer with GL_MAP_UNSYNCHRONIZED_BIT, the driver will not wait until OpenGL is done with that memory before mapping it for you. So you will get more or less immediate access to it.

The problem is that this does not mean that you can just read/write that memory willy-nilly. If OpenGL is reading from or writing to that buffer and you change it... welcome to undefined behavior. Which can include crashing.

Therefore, in order to actually use unsynchronized mapping, you must synchronize your behavior to OpenGL's access of that buffer. This will involve the use of ARB_sync objects (or NV_fence if you're only on NVIDIA and haven't updated your drivers recently).

That being said, if you're using a fence object to synchronize access to the buffer, then you really don't need GL_MAP_UNSYNCHRONIZED_BIT at all. Once you finish the fence, or detect that it has completed, you can map the buffer normally and it should complete immediately (unless some other operation is reading/writing too).

In general, unsynchronized access is best used when you need fine-grained write access to the buffer. In this case, good use of sync objects will get you what you really need (the ability to tell when the map operation is finished).
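To make the fence approach concrete, here is a minimal sketch of "fence after the readback, poll it, then map normally". It is only an illustration under some assumptions: `Gl` is a hypothetical stand-in for whatever loader object your application uses (any object exposing the GL entry points as members), `begin_readback` and `try_map_readback` are made-up helper names, the BGRA/unsigned-byte pixel format is arbitrary, and the enum values are copied from the Khronos headers purely so the snippet compiles on its own. It needs C++14 for the deduced return type.

```cpp
#include <cassert>
#include <cstddef>

// Enum values copied from the Khronos headers so the sketch compiles on its
// own; drop this block when a real OpenGL header is included.
#ifndef GL_PIXEL_PACK_BUFFER
#define GL_PIXEL_PACK_BUFFER           0x88EB
#define GL_BGRA                        0x80E1
#define GL_UNSIGNED_BYTE               0x1401
#define GL_MAP_READ_BIT                0x0001
#define GL_SYNC_GPU_COMMANDS_COMPLETE  0x9117
#define GL_SYNC_FLUSH_COMMANDS_BIT     0x00000001
#define GL_ALREADY_SIGNALED            0x911A
#define GL_CONDITION_SATISFIED         0x911C
#endif

// Hypothetical helper: queue the FBO -> PBO transfer and place a fence right
// behind it. `Gl` is an assumed loader object, not part of OpenGL itself.
template <class Gl>
auto begin_readback(Gl& gl, unsigned pbo, int width, int height) {
    gl.glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    // With a pack buffer bound, the last argument is a byte offset into it.
    gl.glReadPixels(0, 0, width, height, GL_BGRA, GL_UNSIGNED_BYTE, nullptr);
    auto fence = gl.glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
    gl.glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    return fence;
}

// Hypothetical helper: poll the fence once. Returns nullptr while the
// transfer is still pending; otherwise deletes the fence, resets it to
// nullptr, and returns an ordinary (synchronized) read mapping of the PBO.
template <class Gl, class Sync>
const void* try_map_readback(Gl& gl, unsigned pbo, Sync& fence,
                             std::ptrdiff_t size) {
    assert(fence != nullptr);
    // Timeout 0 turns the wait into a poll; the flush bit makes sure the
    // fence command has actually been submitted to the GPU.
    const unsigned status =
        gl.glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, 0);
    if (status != GL_ALREADY_SIGNALED && status != GL_CONDITION_SATISFIED)
        return nullptr;  // not finished (or the wait failed): keep rendering
    gl.glDeleteSync(fence);
    fence = nullptr;
    gl.glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    return gl.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, size, GL_MAP_READ_BIT);
}
```

The render loop would call `begin_readback` once per frame and then call `try_map_readback` between chunks of work for the next frame, which is roughly the `while(!is_map_ready())` loop from the question. Once a non-null pointer comes back, read the pixels and call glUnmapBuffer(GL_PIXEL_PACK_BUFFER) before reusing the PBO.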


Addendum: The above is now outdated (depending on your hardware). Thanks to OpenGL 4.4/ARB_buffer_storage, you can now not only map unsynchronized, you can keep a buffer mapped indefinitely. Yes, you can have a buffer mapped while it is in use.

This is done by creating immutable storage and providing that storage with (among other things) the GL_MAP_PERSISTENT_BIT. Then you call glMapBufferRange, also providing the same bit.

Now technically, that changes pretty much nothing. You still need to synchronize your actions with OpenGL. If you write stuff to a region of the buffer, you'll need to either issue a barrier or flush that region of the buffer explicitly. And if you're reading, you still need to use a fence sync object to make sure that the data is actually there before reading it (and unless you use GL_MAP_COHERENT_BIT too, you'll need to issue a barrier before reading).
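A rough sketch of the persistent-mapping setup for a readback buffer might look like the following. The same caveats apply: `Gl` is a hypothetical loader object, `create_persistent_readback` and `readback_ready` are made-up names, and the enum values are copied from the Khronos headers only so the snippet compiles standalone. This sketch assumes GL_MAP_COHERENT_BIT is requested, so only the fence is needed before reading; without that bit, ARB_buffer_storage asks for glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT) to be issued before the fence.

```cpp
#include <cassert>
#include <cstddef>

// Enum values copied from the Khronos headers so the sketch compiles on its
// own; drop this block when a real OpenGL header is included.
#ifndef GL_MAP_PERSISTENT_BIT
#define GL_PIXEL_PACK_BUFFER        0x88EB
#define GL_MAP_READ_BIT             0x0001
#define GL_MAP_PERSISTENT_BIT       0x0040
#define GL_MAP_COHERENT_BIT         0x0080
#define GL_SYNC_FLUSH_COMMANDS_BIT  0x00000001
#define GL_ALREADY_SIGNALED         0x911A
#define GL_CONDITION_SATISFIED      0x911C
#endif

// Hypothetical helper: create immutable storage and map it once. The pointer
// stays valid while OpenGL keeps using the buffer. `Gl` is an assumed loader
// object, not part of OpenGL itself.
template <class Gl>
const void* create_persistent_readback(Gl& gl, unsigned& pbo,
                                       std::ptrdiff_t size) {
    assert(size > 0);
    // The persistent (and coherent) bits must be given both to the storage
    // and to the mapping call.
    const unsigned flags =
        GL_MAP_READ_BIT | GL_MAP_PERSISTENT_BIT | GL_MAP_COHERENT_BIT;
    gl.glGenBuffers(1, &pbo);
    gl.glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    gl.glBufferStorage(GL_PIXEL_PACK_BUFFER, size, nullptr, flags);
    const void* ptr =
        gl.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, size, flags);
    gl.glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    return ptr;
}

// Hypothetical helper: poll a fence that was placed after the command that
// writes into the buffer. Returns true (and deletes the fence) once the data
// can be read through the persistent pointer.
template <class Gl, class Sync>
bool readback_ready(Gl& gl, Sync& fence) {
    const unsigned status =
        gl.glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT, 0);
    if (status != GL_ALREADY_SIGNALED && status != GL_CONDITION_SATISFIED)
        return false;
    gl.glDeleteSync(fence);
    fence = nullptr;
    return true;
}
```

With this setup there is no per-frame map/unmap: each frame issues glReadPixels into the buffer followed by glFenceSync, and the CPU reads through the stored pointer once `readback_ready` returns true. Several buffers (or regions of one buffer) are still needed if the next readback may start before the previous one has been consumed.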

Nicol Bolas
  • GL_MAP_UNSYNCHRONIZED_BIT has now gotten a particular bad reputation as it was mentioned that it is causing a client / server threading stall (https://www.slideshare.net/CassEveritt/approaching-zero-driver-overhead Slide 22). Can you comment? – Christopher Oezbek Mar 25 '18 at 21:19

In general, it is not possible to do a "nonblocking map", but you can map without blocking.

The reason why there can be no "nonblocking map" is that the moment the function call returns, you could access the data, so the driver must make sure it is there, positively. If the data has not been transferred, what else can the driver do but block?
Threads don't make this any better, and possibly make it worse (adding synchronisation and context sharing issues). Threads cannot magically remove the need to transfer data.

And this leads to how to not block on mapping: only map when you are sure that the transfer is finished. One safe way to do this is to map the buffer after flipping buffers, after glFinish, or after waiting on a query/fence object. Using a fence is the preferable way if you can't wait until buffers have been swapped. A fence won't stall the pipeline, but will tell you whether or not your transfer is done (glFinish may or may not stall, but probably will). Reading after swapping buffers is also 100% safe, but may not be acceptable if you need the data within the same frame (works perfectly for screenshots or for calculating a histogram for tonemapping, though).

A less safe way is to insert "some other stuff" and hope that in the meantime the transfer has completed.


Regarding the comment below:
This answer is not incorrect. It isn't possible to do any better than access data after it's available (this should be obvious). Which means that you must sync/block, one way or the other, there is no choice.
Although, from a very pedantic point of view, you can of course use GL_MAP_UNSYNCHRONIZED_BIT to get a non-blocking map operation, this is entirely irrelevant, as it does not work unless you explicitly reproduce the implicit sync as described above. A mapping that you can't safely access is good for nothing.

Mapping and accessing a buffer that OpenGL is transferring data to without synchronizing/blocking (implicitly or explicitly) means "undefined behavior", which is only a nicer wording for "probably garbage results, maybe crash".
If, on the other hand, you explicitly synchronize (say, with a fence as described above), then it's irrelevant whether or not you use the unsynchronized flag, since no more implicit sync needs to happen anyway.

Damon
  • Could you explain a bit further regarding this "query object"? Sounds like I could use that to defer the call to map if it's not ready. – ronag Aug 06 '11 at 15:11
  • That refers to [ARB_sync](http://www.opengl.org/registry/specs/ARB/sync.txt) which is in core 3.2 too. This lets you insert synchronisation queries ("fences") into the command stream, which you can either wait on (client or client+server) or query the completion status. If the sync query returns that it's done, it means that all commands that happened before you inserted the fence have completed. Thus, in your case, you know that the data is where you want it, and mapping the buffer will not block. – Damon Aug 06 '11 at 15:18
  • Excellent, so if I understand this correctly I should put such a sync query directly after my call to glReadPixels (device->host transfer). Once I know that the query is ready, then the call to map would not block (or just very shortly)? – ronag Aug 06 '11 at 15:41
  • Yes, unless the driver is broken (let's hope not!) that's it. – Damon Aug 06 '11 at 15:41
  • Weird, ARB_sync isn't included with GLee. – ronag Aug 06 '11 at 16:27
  • GL_NV_FENCE seems to be included. – ronag Aug 06 '11 at 16:33
  • -1: This answer is incorrect. You can use the GL_MAP_UNSYNCHRONIZED_BIT to have non-blocking map operations. However, you immediately become responsible for the consequences of potentially writing to memory that the GPU is reading/writing, and so forth. So you still need to use fences and such to do synchronization. – Nicol Bolas Aug 06 '11 at 22:10
  • @ronag: You may want to abandon GLee; it doesn't seem to be supported/updated anymore, and it lacks many recent features. – Nicol Bolas Aug 06 '11 at 22:11
  • @ronag: Make sure you're using the svn version, not the latest release. I did a bunch of work a couple months ago to get GLee up to date, but the author hasn't released another version yet. – Ben Voigt Aug 06 '11 at 22:22