Scaling VC-6 (SMPTE ST 2117-1) decoding for batched vision AI workloads requires architectural changes beyond single-image kernel tuning. Using NVIDIA Nsight Systems and Nsight Compute, the VC-6 CUDA implementation was redesigned from N separate decoder instances to a single batch decoder, shifting more work to the GPU.

From developer.nvidia.com (9m read time)
Table of contents
Introducing the VC-6 batch mode implementation
Performance scaling and updated results
Get started with VC-6 decoding
