Combining multiple pixel shaders efficiently

Question

So I'm making a thing with XNA 3.1, and I have a lot of separate effects that are applied via pixel shaders. These come from all sorts of sources, such as special attacks, environment, and so forth. The issue I'm having is that I'm noticing a significant reduction in frame rate.

At the moment, I'm drawing the entire scene to a RenderTarget2D, which I'm then applying all the effects to. I store a SortedDictionary containing the effects and their IDs (the IDs are used to change parameters at runtime), and I'm iterating over it and applying each effect one after the other:

foreach (KeyValuePair<Ref<int>, Effect> p in renderEffects)
{
    Effect r = p.Value;

    // Switch to another render target so the existing level image can be grabbed as a texture.
    g.SetRenderTarget(0, MainGame.MainRenderTarget);
    levelDraw = MainGame.LevelRenderTarget.GetTexture();

    // Switch back so the result is drawn onto the level target again, allowing render effects to be layered.
    g.SetRenderTarget(0, MainGame.LevelRenderTarget);

    // Start the sprite batch and set some parameters.
    MainGame.StartDraw(MainGame.GameBatch);
    r.Begin();
    r.CurrentTechnique.Passes[0].Begin();
    MainGame.GameBatch.Draw(levelDraw, new Rectangle(0, 0, levelDraw.Width, levelDraw.Height), Color.White);
    r.CurrentTechnique.Passes[0].End();
    r.End();
    MainGame.GameBatch.End();
}

Now, this produces noticeable frame drops when layering just 3 effects, and with 10 effects the frame rate drops from 60 FPS to 16 FPS, which is of course unacceptable. I'm wondering if there is a more efficient way to do this. Since I only have one texture, I thought I might be able to combine the effects into one file and execute multiple passes without grabbing the texture back each time. I'm not sure if this is possible, however.

I'm not sure what the best way to do this is, though I imagine there must be a better way than the one I'm using.

I have edited your title. Please see, "[Should questions include “tags” in their titles?](http://meta.stackexchange.com/questions/19190/)", where the consensus is "no, they should not". — John Saunders, Dec 11 '12 at 15:00
Thanks for that. I tend to include them to allow people to know if they can help or not at a glance, rather than having to look through the tags, but I can see the logic in not using them there. — Hoeloe, Dec 11 '12 at 17:13
The tags include a mechanism better than looking at a garbled title. You can set "favorite tags" which will cause messages with those tags to be highlighted; and "ignored tags" which will cause messages with those tags to be hidden. — John Saunders, Dec 11 '12 at 17:15
Oh, I know it's a better system. It's more habit from various sites that don't have such a system. — Hoeloe, Dec 11 '12 at 18:25
It would be great if you could find out which stages are the slowest — applying each effect, switching effects, or something in between those actions. Or maybe it's just you're using full hd resolution, so it's naturally slow. — user1306322, Dec 14 '12 at 15:29
I did a few tests. Applying the shaders themselves drops the rate from 60 to about 50 or 40. Swapping out the RenderTarget, grabbing the texture from it, and then swapping it back does the rest of it. — Hoeloe, Dec 16 '12 at 13:21

score 2 · Answer 1 · answered Dec 30 '12 at 08:40

The method in the snippet is likely to be very slow, because you're doing a texture grab and a full screen draw for every effect, which stresses the memory bandwidth between the CPU and GPU on top of whatever is going on inside the shaders. You probably need, as you suggested in your post, to create a set of shaders which each contain multiple operations rather than running the read-write loop over and over again: one expensive shader will usually still be faster than many read-write-repeats of simple shaders.

You might want to look at Shawn Hargreaves's article on shader fragments in HLSL and Tim Jones's code for doing this in XNA.
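
As a rough illustration of the "multiple operations per shader" idea, here is a minimal sketch of the planning step only: it groups neighbouring effects of the same kind so that each group could be drawn in one full-screen pass instead of one pass per effect. EffectRequest, EffectBatcher and maxPerPass are hypothetical names, not part of XNA, and the sketch assumes you have written a shader variant that accepts several parameter sets at once. Only adjacent effects are merged, so the overall ordering is preserved.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical description of one queued effect: Kind selects the shader,
// Parameters holds whatever that shader needs (colour, centre point, ...).
public sealed class EffectRequest
{
    public readonly string Kind;
    public readonly float[] Parameters;

    public EffectRequest(string kind, params float[] parameters)
    {
        Kind = kind;
        Parameters = parameters;
    }
}

public static class EffectBatcher
{
    // Groups neighbouring requests of the same kind, up to maxPerPass per group.
    // The input order is kept; requests are never moved past a different kind.
    public static List<List<EffectRequest>> Batch(IEnumerable<EffectRequest> ordered, int maxPerPass)
    {
        if (maxPerPass < 1) throw new ArgumentOutOfRangeException("maxPerPass");

        var batches = new List<List<EffectRequest>>();
        List<EffectRequest> current = null;

        foreach (EffectRequest request in ordered)
        {
            bool startNew = current == null
                || current.Count >= maxPerPass
                || current[0].Kind != request.Kind;

            if (startNew)
            {
                current = new List<EffectRequest>();
                batches.Add(current);
            }
            current.Add(request);
        }
        return batches;
    }
}
```

With the sequence glow, glow, ripple, glow and maxPerPass set to 4, this yields three groups rather than four passes. Whether a merged pass looks identical to the separate passes depends on the effect: it is an assumption that holds for effects that only modify each pixel's own colour, and it needs checking for anything that samples neighbouring pixels.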

theodox

Tim Jones' code is useless to me, as it is in XNA 4, while I'm using 3.1. Thanks for linking these articles, though. – Hoeloe Dec 30 '12 at 12:18
For your problem size you may be better off doing the combinatorial thing manually anyway. You'll probably find that stacking the effects makes things harder to visually decipher, which means the complex multi-effect versions could be optimized to make them cheaper without detracting from the look. – theodox Dec 30 '12 at 20:50
This has got me thinking, and I'm considering writing a parser and compiler to take a list of HLSL shaders, each with specific parameters, and push them all into one file. It would read the code as a string, then convert the code from string form into a byte array, which can be written into a new Effect file. I'm not sure if this will work, or if it will be any good, but it might be worth a try. I'm just not sure how to get the original code as a string from the file... – Hoeloe Dec 30 '12 at 21:03
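
The string-based idea in the last comment could be sketched roughly as follows. This only shows the text-generation step and makes several assumptions: every effect has been rewritten as an HLSL function with the hypothetical signature float4 Name(float4 color, float2 uv), the scene texture is bound to sampler register s0, and ps_2_0 is an acceptable target. ShaderFragment, ShaderCombiner, SceneSampler and CombinedPS are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical fragment: HLSL source that defines
//   float4 <FunctionName>(float4 color, float2 uv)
public sealed class ShaderFragment
{
    public readonly string FunctionName;
    public readonly string Source;

    public ShaderFragment(string functionName, string source)
    {
        FunctionName = functionName;
        Source = source;
    }
}

public static class ShaderCombiner
{
    // Builds one effect source whose pixel shader calls each fragment in list order.
    public static string Combine(IList<ShaderFragment> fragments)
    {
        var hlsl = new StringBuilder();
        hlsl.AppendLine("sampler SceneSampler : register(s0);");
        hlsl.AppendLine();

        // Emit each function definition once, even if it is applied several times.
        var emitted = new HashSet<string>();
        foreach (ShaderFragment fragment in fragments)
        {
            if (!emitted.Add(fragment.FunctionName)) continue;
            hlsl.AppendLine(fragment.Source.Trim());
            hlsl.AppendLine();
        }

        hlsl.AppendLine("float4 CombinedPS(float2 uv : TEXCOORD0) : COLOR0");
        hlsl.AppendLine("{");
        hlsl.AppendLine("    float4 color = tex2D(SceneSampler, uv);");
        foreach (ShaderFragment fragment in fragments)
        {
            hlsl.AppendLine("    color = " + fragment.FunctionName + "(color, uv);");
        }
        hlsl.AppendLine("    return color;");
        hlsl.AppendLine("}");
        hlsl.AppendLine();
        hlsl.AppendLine("technique Combined");
        hlsl.AppendLine("{");
        hlsl.AppendLine("    pass P0 { PixelShader = compile ps_2_0 CombinedPS(); }");
        hlsl.AppendLine("}");
        return hlsl.ToString();
    }
}
```

If the original .fx sources are still on disk, File.ReadAllText will return them as strings. As far as I recall, XNA 3.1 on Windows can compile effect source at runtime through Effect.CompileEffectFromSource, but check that against the documentation before relying on it. Two limits of this sketch: per-instance parameters (two glows with different colours) would need uniquely named uniforms, which it does not generate, and chaining only works for effects that operate on the current pixel's colour. An effect such as a ripple, which reads the previous result at other coordinates, still needs its own pass.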

doug65536 · Answer 2 · 2012-12-23T23:17:33.843

1

Are you fully shading everything as you draw? If your shaders are computation-heavy, you should do a "depth pass" first, only testing against and writing the Z buffer (with color buffer writes off). Also, use a trivially simple shader to "depth fill" the screen.

In other words, render all opaque objects updating only the depth buffer on the first pass.

On the second pass, you turn on the shaders and turn off depth writes, since there is no need to use bandwidth writing back the same value that's already there. This will mask off all the unnecessary work of any overdrawn pixels, since they will fail the depth test immediately.

EDIT: now I realize OP is doing fullscreen effects, not the scene render.

edited Dec 23 '12 at 23:17

answered Dec 22 '12 at 03:03

I'm only applying the shaders to a single texture, so there should be no issue with the depth buffer, since each pixel is only ever drawn once for each pass, and no pixels will ever be overdrawn in the depth buffer. – Hoeloe Dec 23 '12 at 20:40
I see what you mean, you just "billboard" a giant quad to apply a series of fullscreen effects and that's what is bottlenecking performance? – doug65536 Dec 23 '12 at 23:15
Is it possible to reorder the passes in your effects loop to minimize state changes? The changes are generally, in decreasing order of cost: shader program, texture binding, buffer binding, other. Textures and vertex buffers can also be packed into larger textures and bigger vertex buffers so that some of that state never changes at all. – doug65536 Dec 23 '12 at 23:59
My effects are all pixel shaders, and they have to be applied in a specific order to achieve the effect I need (for example, a glow effect applied before a ripple effect will produce a different result to a ripple applied before a glow). – Hoeloe Dec 24 '12 at 10:53
I would instrument it. Use a Stopwatch object to measure the nanoseconds taken by each of the begin/draw/end lines, and log the timings to the debug output to see which effect and/or operation is your actual bottleneck. – doug65536 Dec 26 '12 at 04:34
You might need to flush the batch or slightly reorganize the loop (make a different function for performance profiling purposes) to force the work to complete to make the timings meaningful. – doug65536 Dec 26 '12 at 04:42
It could be very difficult to tell, since the list I'm iterating through is completely dynamic (i.e. I'm adding, removing and altering effects pretty much constantly). Besides that, I'm running several almost identical effects (glow effects with the same intensity and radius, but a different colour and centre point), so it's not any specific effect that's causing it, but the combination of many. Regardless, when I can (I'm away from home at the moment), I'll test each command in turn and see what happens. – Hoeloe Dec 26 '12 at 18:15
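
The Stopwatch suggestion above might look something like this minimal sketch. StepTimer is a hypothetical helper, not an XNA type. It accumulates time per label, so the labels can name operations ("SetRenderTarget", "GetTexture", "Draw") rather than individual effects, which sidesteps the problem of the effect list changing constantly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper that accumulates wall-clock time per labelled step.
public sealed class StepTimer
{
    private readonly Dictionary<string, long> ticks = new Dictionary<string, long>();
    private readonly Dictionary<string, int> calls = new Dictionary<string, int>();

    // Runs the step and adds its elapsed time to the running total for the label.
    public void Measure(string label, Action step)
    {
        Stopwatch watch = Stopwatch.StartNew();
        step();
        watch.Stop();

        long total;
        ticks.TryGetValue(label, out total);   // total stays 0 for a new label
        ticks[label] = total + watch.ElapsedTicks;

        int count;
        calls.TryGetValue(label, out count);   // count stays 0 for a new label
        calls[label] = count + 1;
    }

    public int Calls(string label)
    {
        int count;
        calls.TryGetValue(label, out count);
        return count;
    }

    // Average time per call in milliseconds, or 0 if the label was never measured.
    public double AverageMilliseconds(string label)
    {
        int count = Calls(label);
        if (count == 0) return 0.0;
        return ticks[label] * 1000.0 / Stopwatch.Frequency / count;
    }

    // Writes one line per label to the debug output.
    public void Report()
    {
        foreach (string label in ticks.Keys)
        {
            Debug.WriteLine(string.Format("{0}: {1:F3} ms avg over {2} calls",
                label, AverageMilliseconds(label), Calls(label)));
        }
    }
}
```

Inside the loop from the question you would wrap each statement, for example timer.Measure("SetRenderTarget", () => g.SetRenderTarget(0, MainGame.MainRenderTarget)), and call Report once per second or so. Keep in mind the caveat from the comment above: these are CPU-side timings, and because the GPU works asynchronously, a cheap-looking call may simply be queuing work that is paid for later.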
