In the previous blog post we discussed a few potential neural network (MLP) applications in rendering. One of the conclusions was that, although an MLP is easy to implement, its inference cost can be quite high, especially for larger networks, which makes a compute shader implementation impractical in many cases. For that reason, specialised hardware has…


Interplay of Light

A hands-on exploration of using DirectX 12 Cooperative Vectors (preview) to accelerate MLP inference via Nvidia Tensor cores in HLSL shaders. Covers setup requirements (Agility SDK 1.717.1-preview, DXC with SM6.9, Nvidia 590.26 driver), enabling experimental features, defining long vectors and MatrixRef/VectorRef types, weight conversion from float32 to float16, buffer alignment requirements (128-byte for weights, 64-byte for biases), and a complete 2-hidden-layer MLP implementation. Performance benchmarks on an RTX 3080 mobile show modest 2x speedups for small networks (3-3-3-3) but dramatic gains for larger ones: a 6-32-32-32-1 MLP achieves 41.7x speedup and a 6-64-64-64-1 MLP achieves 173x speedup over an unoptimized compute shader baseline. The feature in its current form will be superseded by the Linear Algebra Matrix spec targeting SM6.10.
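The alignment requirements mentioned above (128-byte for weights, 64-byte for biases) can be illustrated with a small host-side sketch that computes byte offsets for packing every layer of an MLP into a single buffer. This is not the post's code: `AlignUp`, `LayerOffsets`, `PackedLayout` and `ComputeLayout` are hypothetical names, and the sketch assumes tightly packed float16 matrices and one bias per output neuron. A driver-specific optimal matrix layout may need different sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round `offset` up to the next multiple of `alignment` (must be a power of two).
inline std::size_t AlignUp(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Byte offsets of one layer's weights and biases inside the shared buffer.
struct LayerOffsets {
    std::size_t weightOffset;
    std::size_t biasOffset;
};

struct PackedLayout {
    std::vector<LayerOffsets> layers;
    std::size_t totalBytes;
};

// layerSizes lists the neuron count of each layer, e.g. {6, 32, 32, 32, 1}.
// Assumption: weights are tightly packed float16 (2 bytes per element).
inline PackedLayout ComputeLayout(const std::vector<std::size_t>& layerSizes) {
    constexpr std::size_t kWeightAlignment = 128; // from the post's summary
    constexpr std::size_t kBiasAlignment = 64;    // from the post's summary
    constexpr std::size_t kElementSize = 2;       // sizeof(float16)

    PackedLayout layout{};
    std::size_t cursor = 0;
    for (std::size_t i = 0; i + 1 < layerSizes.size(); ++i) {
        const std::size_t inputs = layerSizes[i];
        const std::size_t outputs = layerSizes[i + 1];

        // Weight matrix of this layer starts on a 128-byte boundary.
        cursor = AlignUp(cursor, kWeightAlignment);
        const std::size_t weightOffset = cursor;
        cursor += inputs * outputs * kElementSize;

        // Bias vector of this layer starts on a 64-byte boundary.
        cursor = AlignUp(cursor, kBiasAlignment);
        const std::size_t biasOffset = cursor;
        cursor += outputs * kElementSize;

        layout.layers.push_back({weightOffset, biasOffset});
    }
    layout.totalBytes = cursor;
    return layout;
}
```

Calling `ComputeLayout({6, 32, 32, 32, 1})` yields one offset pair per layer transition. Those offsets would be the ones handed to the shader when it reads each layer's weights and biases, and `totalBytes` gives the minimum buffer size under the stated packing assumption.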

Adventures in Neural Rendering part 2: Cooperative vectors