CUDA 13.1 introduces a new single-call API for CUB, part of the CUDA Core Compute Libraries (CCCL), that simplifies the use of GPU primitive algorithms by eliminating the traditional two-phase pattern: first querying how much temporary storage an algorithm needs, then allocating that storage and calling the algorithm again. The new API manages memory allocation automatically with zero performance overhead, while introducing
From developer.nvidia.com
Table of contents
What is CUB?
The existing CUB two-phase API
The new single-call CUB API
Get started with CUB
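To make the two-phase pattern from the summary concrete, here is a minimal host-only C++ sketch. It does not call CUB and runs on the CPU. `mock_exclusive_sum` and `exclusive_sum_single_call` are hypothetical names invented for illustration. The only thing borrowed from CUB is its calling convention: when the temporary-storage pointer is null, the function just reports the number of bytes it needs and does no work. The wrapper shows one way a single call could hide both phases. It is an assumption about the general idea, not a description of how the CUDA 13.1 API is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only stand-in for a two-phase primitive. NOT a CUB function.
// Convention mirrored from CUB: a null temp_storage means "size query only".
inline void mock_exclusive_sum(void* temp_storage, std::size_t& temp_storage_bytes,
                               const int* in, int* out, std::size_t num_items) {
    if (temp_storage == nullptr) {
        // Phase 1: report the scratch size (at least 1 byte) and do no work.
        const std::size_t needed = num_items * sizeof(long long);
        temp_storage_bytes = needed > 0 ? needed : 1;
        return;
    }
    // Phase 2: do the work using the caller-provided scratch buffer.
    long long* prefix = static_cast<long long*>(temp_storage);
    long long running = 0;
    for (std::size_t i = 0; i < num_items; ++i) {
        prefix[i] = running;   // sum of all elements before index i
        running += in[i];
    }
    for (std::size_t i = 0; i < num_items; ++i) {
        out[i] = static_cast<int>(prefix[i]);
    }
}

// Hypothetical single-call wrapper: hides the query, the allocation, and the
// second call behind one function, so the caller never sees scratch memory.
inline std::vector<int> exclusive_sum_single_call(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::size_t temp_bytes = 0;

    // Phase 1: ask how much scratch memory is required.
    mock_exclusive_sum(nullptr, temp_bytes, in.data(), out.data(), in.size());
    assert(temp_bytes > 0);

    // Allocate the scratch buffer, then Phase 2: run the algorithm.
    std::vector<unsigned char> scratch(temp_bytes);
    mock_exclusive_sum(scratch.data(), temp_bytes, in.data(), out.data(), in.size());
    return out;
}
```

With this sketch, calling `exclusive_sum_single_call({1, 2, 3, 4})` returns `{0, 1, 3, 6}`. A caller using the two-phase form directly has to write both `mock_exclusive_sum` calls and manage the scratch buffer itself, which is the boilerplate the summary says the new API removes.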