
To start off, I couldn't really find an appropriate community to post this question in, so I picked this one. I was wondering how the audio shaders of Shadertoy, the popular WebGL-based shader tool, work. I had obviously heard of 'normal' GLSL shaders, but when I first heard of shaders for procedurally generating audio I was amazed. Any clues?


1 Answer


They are basically a function that, given a time, returns 2 values for an audio signal (the left and right channels). The values go from -1 to 1.

Paste in this shader and maybe you'll get it:

vec2 mainSound( float time )
{
    // sin takes radians, so this is 1000 rad/s (~159Hz) in both channels
    return vec2( sin(time * 1000.0), sin(time * 1000.0) );
}

You can see a more interactive example of a similar style of making sounds here.

You can imagine it like this

function generateAudioSignal(time) {
   return Math.sin(time * 4000 * Math.PI * 2); // generate a 4kHz sine wave (the 2 * PI converts Hz to radians)
}

var audioData = new Float32Array(44100 * 4); // 4 seconds of audio at 44.1khz
for (var sample = 0; sample < audioData.length; ++sample) {
  var time = sample / 44100;
  audioData[sample] = generateAudioSignal(time);
}

Now pass audioData to the Web Audio API
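
For example, something like this (just a sketch, not Shadertoy's actual code; note that modern browsers may require a user gesture before an AudioContext will produce sound):

// Minimal sketch: play the generated samples with the Web Audio API.
var context = new AudioContext();
var buffer = context.createBuffer(1, audioData.length, 44100); // 1 channel (mono)
buffer.copyToChannel(audioData, 0);  // copy our samples into channel 0
var source = context.createBufferSource();
source.buffer = buffer;
source.connect(context.destination);
source.start();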

For stereo it might be

function generateStereoAudioSignal(time) {
   return [Math.sin(time * 4000 * Math.PI * 2), Math.sin(time * 4000 * Math.PI * 2)]; // generate a 4kHz stereo sine wave
}

var audioData = new Float32Array(44100 * 4 * 2); // 4 seconds of stereo audio at 44.1khz
for (var sample = 0; sample < audioData.length; sample += 2) {
  var time = sample / 44100 / 2;
  var stereoData = generateStereoAudioSignal(time);
  audioData[sample + 0] = stereoData[0]; // left channel
  audioData[sample + 1] = stereoData[1]; // right channel
}
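
Passing that to the Web Audio API is the same idea, except the interleaved samples get split back into 2 separate channels. Again, just a sketch:

// Sketch: de-interleave into a 2 channel AudioBuffer and play it.
var context = new AudioContext();
var buffer = context.createBuffer(2, audioData.length / 2, 44100);
var left = buffer.getChannelData(0);
var right = buffer.getChannelData(1);
for (var i = 0; i < left.length; ++i) {
  left[i] = audioData[i * 2];      // even samples = left channel
  right[i] = audioData[i * 2 + 1]; // odd samples = right channel
}
var source = context.createBufferSource();
source.buffer = buffer;
source.connect(context.destination);
source.start();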

There's really no good reason for them to be in WebGL (assuming they are). In WebGL you'd use them to generate data into a texture attached to a framebuffer. The data would then have to be copied back from the GPU into main memory using gl.readPixels and passed to the Web Audio API. That's slow, and in WebGL it blocks processing, since there's no way to asynchronously read data back. On top of that you can't easily read back float data in WebGL, so if Shadertoy really is using WebGL it would have to re-write the audio shader to encode the data into 8-bit RGBA textures and then convert it back to floats in JavaScript. Even more reason NOT to use WebGL for this. The main reason to use WebGL is that it keeps things symmetrical: all shaders use the same language.

The bytebeat example linked above runs entirely in JavaScript. It defaults to bytebeat, meaning the function is expected to return a value from 0 to 255 (unsigned int), but there's a setting for floatbeat, in which case it expects a value from -1 to 1, just like Shadertoy's audio shaders.
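
As an illustration (my own examples, not code from that site), the per-sample function has the same shape in both modes; only the expected output range differs:

// bytebeat: t is an integer sample index, the result is 0 to 255
function bytebeat(t) {
  return (t * (t >> 5 | t >> 8)) & 255; // a classic bytebeat formula
}

// floatbeat: same idea but the result is -1 to 1, like mainSound
function floatbeat(t, sampleRate) {
  return Math.sin(t / sampleRate * Math.PI * 2 * 440); // a 440Hz tone
}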


Update

So I checked Shadertoy and it is using WebGL shaders, and it is encoding the values into 8-bit textures.

Here's an actual shader (I used the Chrome shader editor to easily look at it).

precision highp float;

uniform float     iChannelTime[4];
uniform float     iBlockOffset; 
uniform vec4      iDate;
uniform float     iSampleRate;
uniform vec3      iChannelResolution[4];
uniform sampler2D iChannel0;
uniform sampler2D iChannel1;
uniform sampler2D iChannel2;
uniform sampler2D iChannel3;

vec2 mainSound( float time )
{
    return vec2( sin(time * 1000.0), sin(time * 1000.0) );
}

void main() {
   // compute time `t` based on the pixel we're about to write
   // the 512.0 means the texture is 512 pixels across so it's
   // using a 2 dimensional texture, 512 samples per row
   float t = iBlockOffset + ((gl_FragCoord.x-0.5) + (gl_FragCoord.y-0.5)*512.0)/iSampleRate;

   // Get the 2 values for left and right channels
   vec2 y = mainSound( t );

   // convert them from -1 to 1 to 0 to 65536
   vec2 v  = floor((0.5+0.5*y)*65536.0);

   // separate them into low and high bytes
   vec2 vl = mod(v,256.0)/255.0;
   vec2 vh = floor(v/256.0)/255.0;

   // write them out where
   // RED   = channel 0 low byte
   // GREEN = channel 0 high byte
   // BLUE  = channel 1 low byte
   // ALPHA = channel 1 high byte
   gl_FragColor = vec4(vl.x,vh.x,vl.y,vh.y);
}

This points out one advantage of using WebGL in this particular case: the audio shader gets all the same inputs as the fragment shaders (since it is a fragment shader). That means, for example, the audio shader could reference up to 4 textures.

In JavaScript you'd then read the texture with gl.readPixels and convert the samples back into floats with something like

   var pixels = new Uint8Array(width * height * 4);
   gl.readPixels(0, 0, width, height, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
   for (var sample = 0; sample < numSamples; ++sample) {
     var offset = sample * 4;  // RGBA
     audioData[sample * 2    ] = backToFloat(pixels[offset + 0], pixels[offset + 1]);
     audioData[sample * 2 + 1] = backToFloat(pixels[offset + 2], pixels[offset + 3]);
   }

   function backToFloat(low, high) {
     // convert the 2 bytes back to a value from 0 to 65535
     var value = low + high * 256;

     // convert from 0 to 65535 to -1 to 1
     return value / 32768 - 1;
   }
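
You can sanity-check that round trip in plain JavaScript (my own mirror of the shader's math, not Shadertoy code):

// encode mirrors the shader; decode mirrors backToFloat above.
function encode(y) { // y from -1 to 1 (y = 1.0 would overflow the high byte,
                     // the same edge case the shader has)
  var v = Math.floor((0.5 + 0.5 * y) * 65536);
  return [v % 256, Math.floor(v / 256)]; // [low byte, high byte], each 0 to 255
}
function decode(low, high) {
  return (low + high * 256) / 32768 - 1;
}
console.log(decode.apply(null, encode(0.25))); // 0.25 (accurate to within 1/32768)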

Also, while I said above I didn't think it was a good idea, I assumed Shadertoy was constantly calling the audio shader, in which case the blocking issue I brought up would apply. But apparently Shadertoy just pre-generates N seconds of audio using the shader when you press play, where N is apparently 60 seconds. So there's no blocking, but then again the sound only lasts 60 seconds.
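
If you're curious about the numbers (my own back-of-the-envelope, assuming each draw fills the full 512x512 texture implied by the shader above, at a 44.1kHz sample rate):

var sampleRate = 44100;
var samplesPerBlock = 512 * 512;                    // 262144 samples per draw
var secondsPerBlock = samplesPerBlock / sampleRate; // ~5.94 seconds
var draws = Math.ceil(60 / secondsPerBlock);        // 11 draws for 60 seconds
console.log(secondsPerBlock, draws);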

  • This is really interesting :D. Thanks a lot for sharing your knowledge and disabusing me of thinking that it was some sort of shader (Probably because the syntax resembles GLSL/HLSL). – tbvanderwoude Jan 19 '16 at 14:27
  • It is GLSL. My bad for originally suggesting it wasn't. I updated the answer – gman Jan 19 '16 at 15:00
  • @gman "in WebGL it would block processing as there's no way to asynchronously read data back in WebGL" is this the case in WebGL2? There is no PBO or other workaround for asynchronous read back? – Startec Dec 10 '18 at 02:39
  • and @gman isn't it possible that they are two different shader programs that are run in parallel? – Startec Dec 10 '18 at 04:08
  • @Startec, There is no "parallel" as WebGL only runs one thing at a time and so does the GPU with regards to a single WebGL draw call. It is possible since I wrote this answer nearly 3 years ago that Shadertoy changed to doing something else including calling the audio shader every frame to compute a little at a time or calling it once every N seconds to generate N more seconds in advance. 3 years ago it didn't do that. As for async readback, even in WebGL2 that doesn't exist. There is talk about adding an extension. – gman Dec 10 '18 at 07:28