NVIDIA cuTENSOR 2.0 is a CUDA math library that provides optimized implementations of tensor operations for dense, multi-dimensional arrays. It offers improved functionality and performance, including just-in-time compilation capabilities. Developers can use cuTENSOR from CUDA, Fortran, Python, and Julia. The library
Table of contents
cuTENSOR 2.0
API introduction
Einsum notation
Contraction
Elementwise operations
Reduction
Just-in-time compilation
Plan cache and incremental autotuning
Multi-GPU support
Summary
Get Started with cuTENSOR 2.0