CUDA 13.1 introduces a new single-call API for CUB, part of the CUDA Core Compute Libraries (CCCL), that simplifies the use of GPU primitive algorithms by eliminating the traditional two-phase pattern of first estimating the required temporary memory and then allocating it before the algorithm runs. The new API manages memory allocation automatically under the hood with zero performance overhead, while introducing
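The two-phase pattern described above can be sketched on the host alone. The sketch below is plain C++, not CUB: `mock_reduce_sum` and `sum_two_phase` are hypothetical names made up for illustration, and the "temporary storage" is a single accumulator. It only mirrors the calling convention, where a first call with a null scratch pointer reports the required size and a second call does the work, and then shows how a single-call overload might hide both phases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a two-phase primitive (NOT a CUB function).
// Phase 1: temp == nullptr -> only write the required scratch size.
// Phase 2: temp != nullptr -> use the scratch space and compute the sum.
// Returns 0 on success, non-zero on error.
inline int mock_reduce_sum(void* temp, std::size_t& temp_bytes,
                           const int* in, int* out, int n) {
    if (temp == nullptr) {
        temp_bytes = sizeof(long long);  // pretend we need one accumulator
        return 0;
    }
    if (temp_bytes < sizeof(long long)) return 1;  // scratch too small
    long long* acc = static_cast<long long*>(temp);
    *acc = 0;
    for (int i = 0; i < n; ++i) *acc += in[i];
    *out = static_cast<int>(*acc);
    return 0;
}

// Caller-side two-phase usage: query, allocate, run, free.
inline int sum_two_phase(const int* in, int n) {
    std::size_t temp_bytes = 0;
    int result = 0;
    mock_reduce_sum(nullptr, temp_bytes, in, &result, n);  // phase 1: size query
    void* temp = std::malloc(temp_bytes);                  // caller allocates
    assert(temp != nullptr);
    mock_reduce_sum(temp, temp_bytes, in, &result, n);     // phase 2: real work
    std::free(temp);
    return result;
}

// Hypothetical single-call overload: same work, but the size query and the
// allocation happen inside, so the caller never sees the scratch buffer.
inline int mock_reduce_sum(const int* in, int* out, int n) {
    std::size_t temp_bytes = 0;
    int status = mock_reduce_sum(nullptr, temp_bytes, in, out, n);
    if (status != 0) return status;
    void* temp = std::malloc(temp_bytes);
    if (temp == nullptr) return 2;  // allocation failed
    status = mock_reduce_sum(temp, temp_bytes, in, out, n);
    std::free(temp);
    return status;
}
```

With this sketch, `sum_two_phase(data, n)` and the three-argument `mock_reduce_sum(data, &out, n)` produce the same sum; the only difference is who owns the scratch allocation. The real CUB signatures and allocation strategy may differ from this mock.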

From developer.nvidia.com
Table of contents
What is CUB?
The existing CUB two-phase API
The new single-call CUB API
Get started with CUB
