Rendering
Rendering into texture - the most important TA skill
Aug 28, 2025
10 minutes
In this article, I explain my higher-level thinking about rendering and the most important skill every technical artist or graphics programmer should master: rendering custom content into a custom texture. I assume you already have basic knowledge of rendering and of vertex/fragment shaders.
Concept of the GPU
Let me touch on the GPU and the problem it was designed to solve.
The idea is that you have a well-defined collection of triangles and want to turn them into a 2D texture as fast as possible. GPUs were designed to solve exactly this problem. Using massive parallelization and dedicated hardware, they are incredibly efficient at firing triangles onto the screen.
:center-100:

:image-description:
The GPU is designed to efficiently rasterize triangles into a texture.
GPU hardware is designed around rasterization, and a big part of this process is implemented directly in hardware. There are pieces of silicon dedicated to solving most of the smaller problems: fetching vertices from VRAM, frustum culling, depth testing, figuring out which pixels each triangle covers, interpolating vertex attributes, color blending, stencil tests, selecting texture mip levels, decompressing/compressing textures on the fly... The list goes on and on.
Thanks to that massive parallelization, the GPU turns triangles into textures very quickly. My RTX 3060 was able to process 10,000,000 triangles and render a Full HD image in 5.67 milliseconds.
That's 1,763,668,430 triangles processed each second!
:image-description:
Visualization of how the GPU fires triangles during a single draw call.
Since most of the rendering is implemented in hardware, we, graphics programmers, have limited control over the whole process. In the classic vertex-fragment pipeline, only two stages are programmable - the vertex shader and the fragment shader - and each has a very well-defined purpose.
Vertex shader - Executed for each vertex separately. Responsible for calculating the position of each vertex on the screen (the clip-space position). Additionally, it can output parameters that will be interpolated for every pixel shaded by the fragment shader. In a vertex shader, we can access only a single vertex at a time.
Fragment shader - Also known as a pixel shader. Responsible for calculating the color of a pixel using the interpolated parameters returned by the vertex shader. Each fragment shader invocation colors only one pixel at a time, with no access to neighboring pixels.
:center-100:

:image-description:
The vertex shader calculates the position of each vertex on the screen. Then the fragment shader calculates the color of each pixel covered by the rasterized triangles.
:center-100:

:image-description:
Shaders utilize various resources, such as textures or buffers (e.g., with lighting data), to calculate the pixel color. Vertex shaders can also utilize these resources to calculate position on the screen and various other parameters.
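To make the two programmable stages concrete, here is a minimal HLSL vertex/fragment pair. It is only a sketch: the resource names (_ObjectToClip, _MainTex) are illustrative and not tied to any particular engine.

```hlsl
// Assumed resources - names are illustrative, not from any specific engine.
cbuffer PerDraw
{
    float4x4 _ObjectToClip; // combined model-view-projection matrix
};

Texture2D    _MainTex;
SamplerState sampler_MainTex;

struct Attributes
{
    float3 positionOS : POSITION;  // object-space vertex position
    float2 uv         : TEXCOORD0;
};

struct Varyings
{
    float4 positionCS : SV_POSITION; // clip-space position, required output
    float2 uv         : TEXCOORD0;   // interpolated for every covered pixel
};

// Vertex shader: runs once per vertex, outputs the clip-space position.
Varyings Vert(Attributes input)
{
    Varyings output;
    output.positionCS = mul(_ObjectToClip, float4(input.positionOS, 1.0));
    output.uv = input.uv;
    return output;
}

// Fragment shader: runs once per covered pixel, outputs its color.
float4 Frag(Varyings input) : SV_Target
{
    return _MainTex.Sample(sampler_MainTex, input.uv);
}
```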
Our main tools
Let's examine all the tools available when using the vertex-fragment pipeline.
Thinking high-level, we are armed with these powerful tools:
Defining the input triangles.
Creating VRAM resources that will be used in shaders, like textures, buffers and global variables.
Calculating the position of those triangles on the screen and deciding what properties to use during pixel shading - with the option to access textures and buffers.
Calculating the color of each pixel inside the rendered triangle - with the option to access textures and buffers.
Rendering into a custom texture.
Now, look at those tools again and focus on three of them: creating VRAM resources like textures and buffers, accessing those textures and buffers from the vertex and fragment shaders, and rendering into a custom texture.
We can create custom textures, render into them, and reuse them in other shaders. Isn't that interesting?
This is a core idea of a render pipeline.
___
The concept of a render pipeline
The concept of rendering into textures and reusing them is a fundamental principle behind every render pipeline.
:center-100:

:image-description:
Simplified forward render pipeline in Unity.
The role of a render pipeline is to create the final texture that will be presented to the player. As a graphics programmer or technical artist, you can render and use as many textures as you want, as long as the result ends up in that final texture.
The image above illustrates how Unity renders the image in the simplest forward setup with shadows and post-processing. You can see that Unity creates a few intermediate textures to achieve the final effect.
___
Render-texture-based techniques
Since I claimed this is one of the most important skills, I feel obligated to show at least a few applications. Below you will find a list of interesting techniques that creatively utilize render textures.
Interactions with the ground
You can render the trails of objects from a top-down perspective and then use them to modulate the ground or other objects. In this case, I rendered the trails of characters and used them to cut out fog and to mess up the sand.
:center-100:

:center-100:

:image-description:
In this example, I rendered trails of characters into a separate top-down texture. Then I used the texture in fog and sand shaders.
Saving procedural noise performance
Some shaders use procedural noise to improve visuals. Procedural noise can morph in interesting ways in real time, but computing it for each pixel of every rendered object can be heavy on the SM (streaming multiprocessor) units.
In this case, you can render the noise into a lower-resolution texture at the beginning of the frame and reuse that texture in all shaders.
:center-100:

:image-description:
Water caustics are a good example. They are usually implemented using distorted, layered cell noise, which is extremely expensive for the SM units, so rendering it once into a lower-resolution texture and reusing it is much more efficient. By the way, this is a noise I implemented in ShaderToy: https://www.shadertoy.com/view/ttSGz3
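A rough sketch of this caching approach in Unity, assuming you already have a noiseMaterial whose shader generates the noise procedurally (the texture size, format, and global name are just example choices):

```csharp
using UnityEngine;

// Sketch: renders procedural noise into a small texture once per frame
// and makes it available to every shader as _CachedNoiseTex.
// "noiseMaterial" is assumed to be a material whose shader outputs the noise.
public class NoiseCache : MonoBehaviour
{
    [SerializeField] Material noiseMaterial;
    RenderTexture noiseRT;

    void OnEnable()
    {
        noiseRT = new RenderTexture(256, 256, 0, RenderTextureFormat.R8);
        noiseRT.wrapMode = TextureWrapMode.Repeat;
        noiseRT.Create();
        Shader.SetGlobalTexture("_CachedNoiseTex", noiseRT);
    }

    void Update()
    {
        // Evaluate the noise once for the whole frame instead of per pixel
        // of every object that needs it.
        Graphics.Blit(null, noiseRT, noiseMaterial);
    }

    void OnDisable()
    {
        if (noiseRT != null) noiseRT.Release();
    }
}
```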
Directional light shadows
Directional lights usually render the scene depth from the light's perspective using an orthographic camera. This depth is then compared with the depth seen from the game camera to estimate shadows.
:center-100:

:image-description:
The shadowmap is used to estimate whether the rendered surface was visible from the light's perspective.
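A heavily simplified sketch of that comparison in HLSL. The resource names are assumptions, the depth convention follows the [0, 1] D3D-style range, and a real implementation adds a proper bias and filtering (PCF):

```hlsl
// Assumed resources: a depth texture rendered from the light's orthographic
// camera, and the view-projection matrix that was used to render it.
Texture2D    _ShadowMap;
SamplerState sampler_ShadowMap;
float4x4     _LightViewProjection;

// Returns 1 when the surface is lit, 0 when it is in shadow.
float SampleShadow(float3 positionWS)
{
    // Project the world-space position into the light's clip space.
    float4 positionLS = mul(_LightViewProjection, float4(positionWS, 1.0));
    positionLS.xyz /= positionLS.w;

    // Clip-space XY [-1, 1] -> shadowmap UV [0, 1] (flipped Y for D3D-style UVs).
    float2 shadowUV = positionLS.xy * float2(0.5, -0.5) + 0.5;

    // Depth the light "saw" vs. depth of the surface we are shading now.
    float shadowMapDepth = _ShadowMap.Sample(sampler_ShadowMap, shadowUV).r;
    float surfaceDepth   = positionLS.z;

    // If the surface is further from the light than the stored depth,
    // something occludes it. The small constant is a crude depth bias.
    return surfaceDepth <= shadowMapDepth + 0.001 ? 1.0 : 0.0;
}
```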
Screen space ambient occlusion
Screen space ambient occlusion works by rendering the depth and normals of the objects on the screen. The depth+normals texture is then used while rendering into the ambient occlusion texture: the algorithm estimates ambient occlusion based on nearby samples.
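As a very rough illustration of the "texture in, texture out" structure, here is a crude depth-only occlusion sketch in HLSL. A real SSAO implementation also uses the normals and a proper hemisphere sample kernel; the resource names, sample pattern, and thresholds here are all assumptions:

```hlsl
// Greatly simplified screen-space occlusion: compares the depth of a few
// nearby pixels with the depth of the current pixel.
Texture2D    _CameraDepthTexture;   // assumed to contain linear 0..1 depth
SamplerState sampler_CameraDepthTexture;
float2       _TexelSize;            // 1 / render target resolution

float4 FragAO(float2 uv : TEXCOORD0) : SV_Target
{
    const int SAMPLE_COUNT = 8;
    float centerDepth = _CameraDepthTexture.SampleLevel(sampler_CameraDepthTexture, uv, 0).r;

    float occlusion = 0.0;
    for (int i = 0; i < SAMPLE_COUNT; i++)
    {
        // Poor man's sample pattern: points on a circle around the pixel.
        float angle = (i / (float)SAMPLE_COUNT) * 6.2831853;
        float2 offset = float2(cos(angle), sin(angle)) * _TexelSize * 8.0;

        float sampleDepth = _CameraDepthTexture.SampleLevel(sampler_CameraDepthTexture, uv + offset, 0).r;

        // A neighbor that is closer to the camera occludes this pixel,
        // but ignore huge depth differences (a crude range check).
        float diff = centerDepth - sampleDepth;
        if (diff > 0.001 && diff < 0.02)
            occlusion += 1.0;
    }

    float ao = 1.0 - occlusion / SAMPLE_COUNT;
    return float4(ao, ao, ao, 1.0);
}
```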
:center-100:

Minimap rendering
Minimap rendering often works by rendering simplified objects into a top-down texture and then displaying the texture on the UI.
:center-100:

:image-description:
Example of a minimap rendering in action. Source: https://theknightsofu.com/implementing-minimap-unity-2/
Texture-based instancing
The idea is similar to terrain splatmap rendering, but you use the splatmap data inside a compute shader with an AppendStructuredBuffer to create a buffer of tree instances.
You launch one compute shader thread per splatmap pixel; if the pixel contains some color, there is a chance to spawn a tree, and you append the tree's position and size to the instance buffer.
Then you copy the instance buffer counter into an args buffer and use it with indirect instanced rendering.
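A minimal sketch of such a compute pass in HLSL; the instance layout, channel choice, terrain parameters, and hash function are illustrative assumptions:

```hlsl
// Sketch of splatmap-driven instance generation. All names are illustrative.
struct TreeInstance
{
    float3 position;
    float  scale;
};

Texture2D<float4>                    _Splatmap;
AppendStructuredBuffer<TreeInstance> _TreeInstances;
float3 _TerrainOrigin;
float2 _TerrainSize;     // world-space size covered by the splatmap
float2 _SplatmapSize;    // splatmap resolution in pixels

// Cheap hash for a pseudo-random value per pixel.
float Hash(uint2 p)
{
    uint h = p.x * 374761393u + p.y * 668265263u;
    h = (h ^ (h >> 13u)) * 1274126177u;
    return (h & 0x00FFFFFFu) / 16777215.0;
}

[numthreads(8, 8, 1)]
void SpawnTrees(uint3 id : SV_DispatchThreadID)
{
    if (id.x >= (uint)_SplatmapSize.x || id.y >= (uint)_SplatmapSize.y)
        return;

    // One thread per splatmap pixel; the green channel marks "forest" here.
    float forestMask = _Splatmap[id.xy].g;
    float random = Hash(id.xy);

    // The stronger the mask, the higher the chance to spawn a tree.
    if (random < forestMask * 0.2)
    {
        TreeInstance tree;
        float2 uv = (id.xy + 0.5) / _SplatmapSize;
        tree.position = _TerrainOrigin + float3(uv.x * _TerrainSize.x, 0.0, uv.y * _TerrainSize.y);
        tree.scale = lerp(0.8, 1.2, Hash(id.xy + 17u));
        _TreeInstances.Append(tree);
    }
}
```

On the CPU side, you would then copy the append buffer's counter into the args buffer (in Unity, for example, with ComputeBuffer.CopyCount) before issuing the indirect draw.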
:center-100:

___
Core knowledge to master
Now you should have some understanding of how custom-rendered textures can be used. Let's talk about what is needed to master this. The goal of this article is to increase your awareness of the topic, not to explain each technique in detail.
In the simplest form, you could use a separate camera and render its content into a texture. Unity makes this very easy. However, having another camera in the scene comes with a high performance overhead, because the camera needs to do many things to render its content.
A more complicated but much more performant approach is to understand what the camera does to render the scene into a texture and implement only what you need. It's like implementing a bare-bones camera with a limited feature set.
In this section, I will go through the camera-related tasks.
1. Creating a texture
First of all, you need to allocate a texture. Usually you want a color buffer. You need to pick the resolution as well as the format of the texture. Do you want to render only the R channel with 8-bit precision? Or RGBA with float precision? All of that affects the texture size and the sampling performance when the texture is used in shaders.
In some cases, you will also need to allocate a depth buffer - for example, when you want to render overlapping opaque 3D objects.
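A minimal Unity sketch of this step; the 512x512 resolution and R8 format are just example choices for a single-channel trail texture, and the names are my own:

```csharp
using UnityEngine;

// Allocates the render targets for custom rendering. Example choices:
// a 512x512 single-channel (R8) color texture and a 16-bit depth buffer.
public static class TrailTextureAllocator
{
    public static RenderTexture CreateColorTarget()
    {
        var rt = new RenderTexture(512, 512, 0, RenderTextureFormat.R8);
        rt.name = "CustomTrailTexture";
        rt.filterMode = FilterMode.Bilinear;
        rt.Create();
        return rt;
    }

    // Only needed when rendering overlapping opaque 3D objects.
    public static RenderTexture CreateDepthTarget()
    {
        var rt = new RenderTexture(512, 512, 16, RenderTextureFormat.Depth);
        rt.name = "CustomTrailDepth";
        rt.Create();
        return rt;
    }
}
```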
:center-100:

:image-description:
In Unity, if you implement rendering into a texture, you will see it in the Frame Debugger.
2. Create view-projection matrices
The GPU requires vertex positions to be converted into clip space in the vertex shader.
Clip space is the space where everything with XY coordinates in the range [-1, 1] and Z (depth) in [0, 1] is rendered onto the screen. You need a way to convert a world-space position into clip space.
This is typically accomplished by transforming the world-space position into camera space (the object space of the camera transform) using the view matrix. Then the projection matrix is applied to map the camera-space position into the [-1, 1] range, according to the frustum shape.
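A sketch of building such matrices in Unity for a virtual top-down orthographic "camera" (the function name, parameters, and the trail use case are my own illustration; GL.GetGPUProjectionMatrix adjusts the projection for the current graphics API):

```csharp
using UnityEngine;

// Builds view and projection matrices for a virtual top-down orthographic
// camera, without having an actual Camera component in the scene.
public static class TopDownMatrices
{
    public static Matrix4x4 GetViewProjection(Vector3 center, float halfSize, float height)
    {
        // "Camera" placed above the center, looking straight down.
        Vector3 eye = center + Vector3.up * height;
        Matrix4x4 world = Matrix4x4.TRS(eye,
            Quaternion.LookRotation(Vector3.down, Vector3.forward), Vector3.one);

        // The view matrix is the inverse of the camera's world matrix,
        // with Z negated to match Unity's camera convention.
        Matrix4x4 view = Matrix4x4.Scale(new Vector3(1, 1, -1)) * world.inverse;

        // Orthographic projection covering the requested area.
        Matrix4x4 proj = Matrix4x4.Ortho(-halfSize, halfSize, -halfSize, halfSize, 0.1f, height * 2f);
        proj = GL.GetGPUProjectionMatrix(proj, true); // true = rendering into a texture

        return proj * view;
    }
}
```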
:center-100:

:image-description:
Source: https://nbertoa.wordpress.com/2017/01/21/how-can-i-find-the-pixel-space-coordinates-of-a-3d-point-part-3-the-depth-buffer/, author: Nicolás Bertoa
3. Filtering / tracking objects
You want to render some objects into a texture, right? So you need to keep track of those objects manually. Out of all the objects in the scene, you need a fast way of finding yours.
:center-100:

:image-description:
When implementing custom rendering, it's your responsibility to track each of these objects yourself. I like to keep a global collection of tracked objects: I register an object when it's enabled and unregister it when it's disabled.
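A minimal sketch of that registration pattern, using a hypothetical TrailEmitter component:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical component for objects that should be rendered into the
// custom texture. Keeps a global registry of all currently enabled instances.
public class TrailEmitter : MonoBehaviour
{
    public static readonly List<TrailEmitter> Active = new List<TrailEmitter>();

    public float Radius = 1f;   // used later for bounding-volume culling

    void OnEnable()  => Active.Add(this);
    void OnDisable() => Active.Remove(this);
}
```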
4. Frustum culling
Now that you track your objects, each frame you need to find which of them should be rendered. Frustum culling is used to determine which objects are inside the camera frustum.
:center-100:

:image-description:
Visualization of frustum culling.
In a simple scenario, frustum culling is not worth implementing. If your rendering is fully instanced, such as rendering quads or simple meshes, you can just render everything and let the GPU's primitive clipping handle it. I recommend skipping frustum culling at first and measuring performance on the target device; if there is a performance issue, then implement it.
Frustum culling is usually required when each individual object is costly to render and there are a lot of them.
So, how to implement it? Preferably, you should track the position and size of each object - I often use a bounding sphere or an axis-aligned bounding box (AABB). Then you can use sphere-frustum or AABB-frustum tests to estimate whether an object is inside or outside the frustum. To do that, you iterate through each tracked object every frame and test it against each frustum plane - near, far, left, right, top, and bottom.
Sometimes it is too costly to iterate through every object, so there are some alternatives:
1. Store all tracked objects in a spatial-partitioning structure, like a grid, quadtree/octree, or BVH (tbh, BVH is often overkill). Good material here: https://gameprogrammingpatterns.com/spatial-partition.html
2. Implement GPU culling in a compute shader, using an append buffer.
3. Multithreading / background thread.
:image-description:
Example function that tests a single AABB object against frustum planes.
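The original snippet is shown as an image; below is a sketch of such a test in Unity C#, assuming the six planes come from GeometryUtility.CalculateFrustumPlanes (or are extracted from your view-projection matrix) with normals pointing inward:

```csharp
using UnityEngine;

public static class FrustumCulling
{
    // Returns true when the AABB is at least partially inside the frustum.
    // "planes" are the six frustum planes, e.g. from
    // GeometryUtility.CalculateFrustumPlanes(viewProjectionMatrix).
    public static bool IsVisible(Bounds aabb, Plane[] planes)
    {
        for (int i = 0; i < planes.Length; i++)
        {
            // Pick the AABB corner that lies furthest along the plane normal.
            Vector3 normal = planes[i].normal;
            Vector3 positiveCorner = aabb.center + new Vector3(
                normal.x >= 0f ? aabb.extents.x : -aabb.extents.x,
                normal.y >= 0f ? aabb.extents.y : -aabb.extents.y,
                normal.z >= 0f ? aabb.extents.z : -aabb.extents.z);

            // If even that corner is behind the plane, the whole box is outside.
            if (planes[i].GetDistanceToPoint(positiveCorner) < 0f)
                return false;
        }
        return true;
    }
}
```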
5. Execute draw calls
Now you have a collection of objects you want to render into a texture. The next step is to use a rendering API to execute a bunch of draw calls that render those objects into the texture (a minimal sketch follows the list below).
:center-100:

:image-description:
DirectX 12 events executed by my custom texture rendering.
The most important skills to learn are:
How to inject custom code into the render pipeline of the engine you're using (e.g., Unity's Render Graph API in URP).
How to set your texture as a render target.
How to execute draw calls (in Unity: DrawMesh, DrawMeshInstancedIndirect, DrawProcedural, DrawRenderer, etc.).
How to later use the rendered texture in other shaders.
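To tie these together, here is a minimal sketch using Unity's CommandBuffer API. The material, quad mesh, matrix, and the hypothetical TrailEmitter registry from step 3 are assumptions of this example; in URP you would record the same commands inside a custom render pass instead.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Rendering;

// Sketch: sets a custom texture as the render target, clears it, and draws
// every visible tracked object with a dedicated "trail" material.
public static class TrailTexturePass
{
    static readonly int ViewProjId = Shader.PropertyToID("_TrailViewProjection");

    public static void Render(CommandBuffer cmd, RenderTexture target,
                              Matrix4x4 viewProjection, Material trailMaterial,
                              Mesh quadMesh, List<TrailEmitter> visibleEmitters)
    {
        cmd.SetRenderTarget(target);
        cmd.ClearRenderTarget(false, true, Color.clear); // clear color only
        cmd.SetGlobalMatrix(ViewProjId, viewProjection);

        foreach (var emitter in visibleEmitters)
        {
            // One quad per emitter, scaled by its radius.
            Matrix4x4 objectToWorld = Matrix4x4.TRS(
                emitter.transform.position, Quaternion.identity,
                Vector3.one * emitter.Radius);
            cmd.DrawMesh(quadMesh, objectToWorld, trailMaterial);
        }

        Graphics.ExecuteCommandBuffer(cmd);
        cmd.Clear();
    }
}
```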
6. Prepare shaders
Then, when all the CPU-side rendering code is ready, you need to write the shaders that will actually render the content into the texture. You should understand how to write a shader for each type of draw call you used.
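For the trail example, the shader itself can be very simple. Here is a sketch that stamps a soft circular splat into the red channel; the names are illustrative, and the additive blend state would be configured outside the HLSL (e.g., in ShaderLab):

```hlsl
// Sketch of a trail-stamp shader: each tracked object is drawn as a quad
// and writes a soft circular splat into the R channel of the trail texture.
float4x4 _TrailViewProjection;  // set from the CPU side (see step 5)
float4x4 _ObjectToWorld;        // per-object matrix; in Unity this is unity_ObjectToWorld

struct Attributes
{
    float3 positionOS : POSITION;
    float2 uv         : TEXCOORD0;
};

struct Varyings
{
    float4 positionCS : SV_POSITION;
    float2 uv         : TEXCOORD0;
};

Varyings Vert(Attributes input)
{
    Varyings output;
    float4 positionWS = mul(_ObjectToWorld, float4(input.positionOS, 1.0));
    output.positionCS = mul(_TrailViewProjection, positionWS);
    output.uv = input.uv;
    return output;
}

float4 Frag(Varyings input) : SV_Target
{
    // Soft radial falloff: 1 in the center of the quad, 0 at its edges.
    float dist = length(input.uv - 0.5) * 2.0;
    float splat = saturate(1.0 - dist);
    return float4(splat, 0.0, 0.0, 1.0);
}
```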
:center-100:

:image-description:
This is my rendered top-down texture of enemy and player trails.
7. Use the rendered texture!
In the final step, bind the rendered texture to other draw calls, remember to use the view-projection matrix to calculate the correct UV to sample, and you're good to go.
:image-description:
This is an example HLSL code that converts world space position into a texture UV using view and projection matrices.
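The original code is shown as an image; here is a sketch of the same idea, assuming the matrix used to render the trail texture is available to the shader as _TrailViewProjection (depending on the graphics API, you may also need to flip the Y coordinate):

```hlsl
// Converts a world-space position into the UV of the custom-rendered texture,
// using the same view-projection matrix that was used to render it.
float4x4     _TrailViewProjection;
Texture2D    _TrailTex;
SamplerState sampler_TrailTex;

float SampleTrail(float3 positionWS)
{
    // World space -> clip space of the trail "camera".
    float4 positionCS = mul(_TrailViewProjection, float4(positionWS, 1.0));
    positionCS.xyz /= positionCS.w;   // no-op for orthographic, needed for perspective

    // Clip-space XY [-1, 1] -> UV [0, 1].
    float2 uv = positionCS.xy * 0.5 + 0.5;

    // Outside the rendered area there is no trail data.
    if (any(uv < 0.0) || any(uv > 1.0))
        return 0.0;

    return _TrailTex.Sample(sampler_TrailTex, uv).r;
}
```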
It takes some effort to implement rendering into a texture, but once you master the technique, a new implementation becomes a matter of a single day of work.
It usually takes longer to tweak the other shaders to look as anticipated.
:center-100:

:image-description:
I used the texture to cut out the volumetric fog (red channel) and mess up the sand (green channel).
___
Summary
Rendering into a custom texture is the core technique every graphics programmer or technical artist needs to master. GPUs are built to slam triangles into textures fast, but we only get a few programmable stages. The real power lies in creating our own textures, rendering into them, and reusing them throughout the pipeline.
That’s the backbone of every render pipeline. Shadows, SSAO, deferred lighting, minimaps, trails, noise caching, instancing - they all boil down to the same principle: render something once, then feed it back into shaders.
To pull it off, you need to understand how to allocate textures, build view–projection matrices, track objects, cull them, fire draw calls, and write shaders that output to your custom target.
It’s a lot of moving parts at first, but this is core graphics knowledge: internalize it once, and you'll recognize the same principle popping up everywhere - and you'll build faster.
___
I think about GPU programming as a data processing pipeline. Shaders are responsible for moving data from one buffer to another.
:center-100:

Did you like the article? Join the discussion under my LinkedIn post!
Link to my LinkedIn post about this topic.