Journal Entry 21 – Instanced drawing in OpenGL

In this article I’m going to talk about instanced drawing in OpenGL. Instanced drawing is useful when you want to draw many copies of the same thing. You can use buffer objects to store per-instance attributes such as color, position, etc., so that each copy is rendered with different attributes. For example, you could have a soldier on a battlefield that you would like to render 1000 times, each time with a different orientation. The old way would be to use a ‘for’ loop and call Draw() on the soldier VAO 1000 times. With instanced drawing we set up a new buffer object containing the 1000 different model matrices, and then issue a single draw call.

Instead of soldiers, we’re rendering voxels. Specifically, we’re rendering some falling voxels. When updating the falling voxels I end up with a list of model matrices (Matrix4 objects) and colors (Vector3 objects). Now we have to copy the matrices and colors to their own respective buffers. (As an exercise for the reader, you could interleave them into a single buffer.) Since a Matrix4 is not a primitive type, we must first copy its 16 values into a float[].

float[] mm = new float[voxels.Count * 16];
for (int i = 0; i < voxels.Count; i++) matrices[i].ToFloat().CopyTo(mm, i * 16);
 
// the data is rewritten every frame and only used for drawing, so hint StreamDraw rather than StaticRead
if (matrixBuffer != 0) Gl.DeleteBuffers(1, new uint[] { matrixBuffer });
matrixBuffer = Gl.CreateVBO<float>(BufferTarget.ArrayBuffer, mm, BufferUsageHint.StreamDraw);
 
if (colorBuffer != 0) Gl.DeleteBuffers(1, new uint[] { colorBuffer });
colorBuffer = Gl.CreateVBO<Vector3>(BufferTarget.ArrayBuffer, colors, BufferUsageHint.StreamDraw);

You’ll notice that since we are updating the matrices and colors every frame, I make sure to delete the old buffer objects before creating the new ones.
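For anyone who wants to try the interleaving exercise mentioned earlier, here is a minimal sketch of the packing step only. It assumes each matrix has already been flattened to 16 floats and each color to 3 floats. The class name InstanceInterleave and the layout (matrix first, then color) are hypothetical choices for illustration, not something the voxel code actually uses.

```csharp
using System;

// Hypothetical helper: packs one matrix (16 floats) and one color (3 floats)
// per instance into a single array, in that order.
public static class InstanceInterleave
{
    public const int MatrixFloats = 16;
    public const int ColorFloats = 3;
    public const int FloatsPerInstance = MatrixFloats + ColorFloats;   // 19 floats
    public const int StrideBytes = FloatsPerInstance * sizeof(float);  // 76 bytes
    public const int ColorOffsetBytes = MatrixFloats * sizeof(float);  // 64 bytes

    // matrices[i] is assumed to hold 16 floats, colors[i] three floats (r, g, b).
    public static float[] Pack(float[][] matrices, float[][] colors)
    {
        if (matrices.Length != colors.Length)
            throw new ArgumentException("Need exactly one color per matrix.");

        float[] packed = new float[matrices.Length * FloatsPerInstance];
        for (int i = 0; i < matrices.Length; i++)
        {
            int start = i * FloatsPerInstance;
            Array.Copy(matrices[i], 0, packed, start, MatrixFloats);
            Array.Copy(colors[i], 0, packed, start + MatrixFloats, ColorFloats);
        }
        return packed;
    }
}
```

With this layout every attribute pointer into the buffer would use a stride of 76 bytes, with the four matrix columns at byte offsets 0, 16, 32 and 48 and the color at byte offset 64.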

Next we set up the matrix vertex attribute array. This is kind of strange, but OpenGL cannot load an entire mat4 with a single VertexAttribPointer call, because one call describes at most 4 components. Instead, a mat4 attribute occupies 4 consecutive attribute locations, one vec4 column each, so we need to set up 4 vertex attribute arrays to deal with the mat4.

// find the location of our 'model_matrix' mat4 shader attribute
uint loc = (uint)Gl.GetAttribLocation(program.ProgramID, "model_matrix");
 
// bind our matrix buffer to all four attribute arrays (note: mat4 is actually 4 vec4 arrays)
Gl.EnableVertexAttribArray(loc);
Gl.EnableVertexAttribArray(loc + 1);
Gl.EnableVertexAttribArray(loc + 2);
Gl.EnableVertexAttribArray(loc + 3);
Gl.BindBuffer(BufferTarget.ArrayBuffer, matrixBuffer);
Gl.VertexAttribPointer(loc, 4, VertexAttribPointerType.Float, false, 16 * 4, (IntPtr)0);
Gl.VertexAttribPointer(loc + 1, 4, VertexAttribPointerType.Float, false, 16 * 4, (IntPtr)16);
Gl.VertexAttribPointer(loc + 2, 4, VertexAttribPointerType.Float, false, 16 * 4, (IntPtr)32);
Gl.VertexAttribPointer(loc + 3, 4, VertexAttribPointerType.Float, false, 16 * 4, (IntPtr)48);
Gl.VertexAttribDivisor(loc, 1);
Gl.VertexAttribDivisor(loc + 1, 1);
Gl.VertexAttribDivisor(loc + 2, 1);
Gl.VertexAttribDivisor(loc + 3, 1);

This code has something new! We use VertexAttribDivisor to let OpenGL know how often to advance the attribute pointer. 0 (the default) advances the attribute once per vertex, just like ordinary vertex data; 1 advances it once per instance; 2 advances it once every 2 instances; and so on.
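To make the divisor rule concrete, here is a small sketch of which array element an attribute reads for a given vertex and instance. It follows the rule in the OpenGL documentation for glVertexAttribDivisor and ignores any base-instance offset. The class and method names are made up for this illustration, and nothing here calls into OpenGL.

```csharp
using System;

// Hypothetical helper that mirrors the divisor rule on the CPU.
public static class DivisorDemo
{
    // Returns the index of the element fetched from the attribute array.
    public static int ElementIndex(int divisor, int vertexIndex, int instanceIndex)
    {
        if (divisor < 0) throw new ArgumentOutOfRangeException(nameof(divisor));

        return divisor == 0
            ? vertexIndex               // divisor 0: ordinary per-vertex attribute
            : instanceIndex / divisor;  // integer division: advance once every 'divisor' instances
    }
}
```

With a divisor of 1, as used for the model matrices above, instance 3 reads element 3 no matter which vertex is being processed; with a divisor of 2, instances 2 and 3 would both read element 1.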

Next we set up our color vertex attribute array, which is a bit more straightforward.

// find the location of our 'voxel_color' vec3 shader attribute
loc = (uint)Gl.GetAttribLocation(program.ProgramID, "voxel_color");
 
// bind our color buffer to the attribute array
Gl.EnableVertexAttribArray(loc);
Gl.BindBuffer(BufferTarget.ArrayBuffer, colorBuffer);
// 3 components per color, to match the vec3 attribute and the 12 byte (3 float) stride
Gl.VertexAttribPointer(loc, 3, VertexAttribPointerType.Float, false, 12, (IntPtr)0);
Gl.VertexAttribDivisor(loc, 1);

Finally, we bind the attributes that the voxel requires (such as vertices, normals, etc.). This can be done by simply calling BindAttributes on the Vertex Array Object (VAO). Note: I only just made this method public in the latest version of the OpenGL for C# code. Make sure you’re using the most up-to-date version! Check out the GitHub repo for the latest.

// bind all the VAO buffers
voxel.BindAttributes(program);
 
// issue an instanced draw call, which will draw our VAO voxels.Count times
Gl.DrawElementsInstanced(BeginMode.Triangles, voxel.VertexCount, DrawElementsType.UnsignedInt, IntPtr.Zero, voxels.Count);

We don’t need to make any modifications to the shader, aside from changing the ‘voxel_color’ and ‘model_matrix’ to attributes instead of uniforms. For posterity, here’s the GLSL code:

uniform mat4 projection_matrix;
uniform mat4 view_matrix;
 
attribute vec3 in_position;
attribute vec3 in_normal;
attribute mat4 model_matrix;
attribute vec3 voxel_color;
 
varying vec3 vertex_light_position;
varying vec3 vertex_normal;
varying vec3 vertex_color;
 
void main(void)
{
  vertex_normal = normalize((model_matrix * vec4(in_normal, 0)).xyz);
  vertex_light_position = normalize(vec3(0.5, 0.3, 0.2));
  vertex_color = voxel_color;
 
  gl_Position = projection_matrix * view_matrix * model_matrix * vec4(in_position, 1);
}

This gives a pretty decent speed bump of about 2.5x. The major slow-downs are now in actually calculating the new matrix values, as well as copying all those matrices over to a float[]. We’re sitting pretty at more than 1200fps though, so I don’t think there’s anything to worry about!
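One possible way to trim the float[] copying cost is to stop allocating a fresh array every frame. Below is a minimal sketch of that idea, assuming 16 floats per matrix. The MatrixScratch class is hypothetical and is not part of the voxel code or of OpenGL for C#, and I have not measured whether it helps here.

```csharp
using System;

// Hypothetical scratch buffer that is reused across frames and only
// reallocates when it has to grow.
public sealed class MatrixScratch
{
    private float[] data = new float[0];

    // Returns an array with room for at least instanceCount matrices.
    public float[] Rent(int instanceCount)
    {
        int needed = instanceCount * 16;
        if (data.Length < needed)
        {
            // grow to at least double the old size to avoid frequent reallocations
            data = new float[Math.Max(needed, data.Length * 2)];
        }
        return data;
    }
}
```

The returned array can be longer than the current frame needs, so only the first instanceCount * 16 floats should be uploaded. The per-matrix arrays returned by ToFloat() would still be allocated each frame, so this only removes the one large allocation.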

I promise this is the last picture of exploding voxels for some time! This is using the new instanced draw methods, which give more than a 2.5x increase in speed.

I’ve been spending a lot of time working on optimizations, which is the wrong thing to do. I need to work on gameplay, so I’m switching over to that from here on. I just thought it would be neat to try out instanced drawing, since I’ve never actually done it before.

Hope everyone is as excited as I am for the gameplay code to come together! Cheers,

Giawa

PS: Bonus picture of 32768 voxels in wireframe mode, just because I wanted to see how it would look. Still ran at a decent speed (~50fps).

I did this just to stress test the instanced draw calls. This is a full voxel chunk (32768) worth of falling voxels while in wireframe mode.
