
I need both a stencil and a depth buffer for my OpenGL app. At minimum, the depth buffer needs to be rendered to a texture via a framebuffer object so that I can do deferred shading and other post-processing effects. I already have this framebuffer set up (using GL_DEPTH24_STENCIL8), but I have some concerns and questions.
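For context, here is roughly what that setup looks like (color attachments and error checking omitted; `width` and `height` are assumed):

```cpp
// Packed 24-bit depth + 8-bit stencil texture attached to an FBO.
GLuint fbo = 0, depthStencilTex = 0;

glGenTextures(1, &depthStencilTex);
glBindTexture(GL_TEXTURE_2D, depthStencilTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8, width, height, 0,
             GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, nullptr);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
                       GL_TEXTURE_2D, depthStencilTex, 0);
```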

First, I would like to use a 32-bit floating-point depth buffer. GL_DEPTH32F_STENCIL8 seems the most obvious option. What I would like to know is: what is the actual memory footprint of this format? Logically it would be 40 bits, but knowing what I do about alignment, it wouldn't surprise me if it were padded to 64, and many sources say that's exactly what happens. I would like to know for sure.
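I'm aware that newer GL (4.3, or ARB_internalformat_query2) lets me query the format's component sizes, but as far as I can tell that only reports the payload bits, not any padding. Something like:

```cpp
// Reports the component sizes of GL_DEPTH32F_STENCIL8 (32 and 8), i.e. the
// payload bits only -- it says nothing about padding added by the driver.
GLint depthBits = 0, stencilBits = 0;
glGetInternalformativ(GL_TEXTURE_2D, GL_DEPTH32F_STENCIL8,
                      GL_INTERNALFORMAT_DEPTH_SIZE, 1, &depthBits);
glGetInternalformativ(GL_TEXTURE_2D, GL_DEPTH32F_STENCIL8,
                      GL_INTERNALFORMAT_STENCIL_SIZE, 1, &stencilBits);
```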

Perhaps keeping the depth and stencil buffers separate would be better for me? Do I have to worry about this not being supported? How about cache efficiency, as stencil and depth tests are often performed together?

PS. I'm not using multisampling.


1 Answer


GL_DEPTH32F_STENCIL8 is a 64-bit format: 32 bits for depth, 8 bits for stencil, and 24 bits of padding for alignment.
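Swapping it in for GL_DEPTH24_STENCIL8 therefore doubles the per-pixel cost. A minimal sketch of the allocation (assuming a texture bound as in your current setup):

```cpp
// Each texel occupies 64 bits in memory, though only 40 of them carry data.
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH32F_STENCIL8, width, height, 0,
             GL_DEPTH_STENCIL, GL_FLOAT_32_UNSIGNED_INT_24_8_REV, nullptr);
```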

Sometimes knowing both of the desktop graphics APIs comes in handy, as this is the same format that was added to D3D10. D3D makes the size of its formats much easier to grasp just by looking at their names.

In D3D, the format is known as DXGI_FORMAT_D32_FLOAT_S8X24_UINT:

  • D32_FLOAT indicates that it stores 32-bit floating-point depth
  • S8X24_UINT indicates that it stores 8-bit stencil + 24 unused bits (unsigned integer)

D3D is nice in that its formats explicitly announce when there are unused bits for padding (this is what Xn in a format name indicates). There is no color-renderable 24-bit RGB format (8 bits per channel, no alpha) in D3D because that would break alignment; they are all RGBX or some permutation of the four components, padding each pixel to 32 bits.
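For comparison, here is a sketch of requesting that format through the D3D11 API, which shares it with D3D10 (the non-format fields are just typical values, and `device` is assumed):

```cpp
// The D3D format name spells out the 64-bit layout: D32 + S8 + X24.
D3D11_TEXTURE2D_DESC desc = {};
desc.Width            = width;
desc.Height           = height;
desc.MipLevels        = 1;
desc.ArraySize        = 1;
desc.Format           = DXGI_FORMAT_D32_FLOAT_S8X24_UINT;
desc.SampleDesc.Count = 1;
desc.Usage            = D3D11_USAGE_DEFAULT;
desc.BindFlags        = D3D11_BIND_DEPTH_STENCIL;

ID3D11Texture2D* depthStencil = nullptr;
device->CreateTexture2D(&desc, nullptr, &depthStencil);
```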


To give you an authoritative answer, I will refer you to the extension that first added this format to GL:

GL_ARB_depth_buffer_float

Overview

[...]

Additionally, this extension provides new packed depth/stencil pixel formats (see EXT_packed_depth_stencil) that have 64-bit pixels consisting of a 32-bit floating-point depth value, 8 bits of stencil, and 24 unused bits. A packed depth/stencil texture internal format is also provided.
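That 64-bit pixel is directly visible in the matching client-side transfer type, GL_FLOAT_32_UNSIGNED_INT_24_8_REV. A sketch of reading such a buffer back (`width`/`height` assumed):

```cpp
#include <cstdint>
#include <vector>

// Each transferred pixel is exactly 8 bytes: a 32-bit float depth word,
// then a word with stencil in bits 0-7 and the upper 24 bits unused.
struct DepthStencilTexel {
    float    depth;
    uint32_t stencilAndPadding;
};
static_assert(sizeof(DepthStencilTexel) == 8, "64 bits per pixel");

std::vector<DepthStencilTexel> texels(width * height);
glReadPixels(0, 0, width, height, GL_DEPTH_STENCIL,
             GL_FLOAT_32_UNSIGNED_INT_24_8_REV, texels.data());
```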

  • Thank you! Can I get around this 24-bit waste by using separate depth and stencil buffers? – Haydn V. Harach Mar 14 '14 at 19:22
  • Unfortunately, no. If you want both a stencil buffer and a depth buffer in GL/D3D, you have to pack them into a single buffer :-\ You could have a depth buffer by itself or a stencil buffer by itself, but the second you need both, you have to use a packed depth+stencil format. – Andon M. Coleman Mar 14 '14 at 21:55
  • So my only options are to waste 32 bits on an 8-bit stencil buffer, or give up on using a floating-point depth buffer? – Haydn V. Harach Mar 14 '14 at 23:54
  • You waste 24 bits, actually. The alternative is to give up the stencil buffer. It really boils down to which hardware limitation you are working against (I assume this is for deferred shading). If you are fill-rate limited, then you might consider skipping the stencil light-volume coverage processing (you can use a simple proxy volume for the lights instead of a stencil test). If you are compute limited, then it might be worthwhile to waste a little memory bandwidth on the awkward depth+stencil buffer configuration, because you will have to compute fewer lit fragments. – Andon M. Coleman Mar 15 '14 at 00:16
  • I don't plan to use the stencil buffer for light volume coverage, but I do need it for portal rendering (not traditional culling portals, but Portal-style portals). An alternative to using a floating-point depth buffer is to use a logarithmic depth buffer, but that suffers from its own issues (writing to `gl_FragDepth` = no-go, since it disables early depth testing). – Haydn V. Harach Mar 15 '14 at 00:59
  • @HaydnV.Harach: As I recall this is an attempt to improve the precision of position reconstruction from the depth buffer, correct? Since D32_S8X24 wastes 24 bits, you could experiment with a normal D24_S8 format plus an extra 16-bit half-float buffer to store logarithmic or linear depth (and avoid messing with `gl_FragDepth` at the same time). You save 16 bits, but I have a feeling the format you are already trying will actually give better performance. You probably would not need a floating-point depth value if you did this, though; fixed-point might work fine. – Andon M. Coleman Mar 15 '14 at 15:52
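A rough sketch of that last suggestion (the GL_R16F choice and all names here are illustrative, not from the original discussion):

```cpp
// Standard 32-bit D24S8 depth+stencil attachment, plus a 16-bit half-float
// color attachment that the G-buffer pass writes linear (or logarithmic)
// eye-space depth into -- 48 bits per pixel instead of 64.
GLuint depthStencilTex = 0, linearDepthTex = 0;

glGenTextures(1, &depthStencilTex);
glBindTexture(GL_TEXTURE_2D, depthStencilTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8, width, height, 0,
             GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, nullptr);

glGenTextures(1, &linearDepthTex);
glBindTexture(GL_TEXTURE_2D, linearDepthTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_R16F, width, height, 0,
             GL_RED, GL_HALF_FLOAT, nullptr);

glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
                       GL_TEXTURE_2D, depthStencilTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, linearDepthTex, 0);
// The fragment shader writes depth to this attachment as an ordinary color
// output, leaving gl_FragDepth (and early depth testing) untouched.
```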