NVIDIA cuTENSOR 2.0 is a CUDA math library that provides optimized implementations of tensor operations for dense, multi-dimensional arrays. It offers improved functionality and performance, including just-in-time compilation capabilities. Developers can benefit from cuTENSOR in CUDA, Fortran, Python, and Julia. The library

13m read time | From developer.nvidia.com
Table of contents
cuTENSOR 2.0
API introduction
Einsum notation
Contraction
Elementwise operations
Reduction
Just-in-time compilation
Plan cache and incremental autotuning
Multi-GPU support
Summary
Get Started with cuTENSOR 2.0
