DeepSeek AI has introduced DeepGEMM, a library designed to accelerate FP8 General Matrix Multiplication (GEMM) operations. The library supports both dense and Mixture-of-Experts (MoE) GEMMs and is optimized for NVIDIA Hopper tensor cores. DeepGEMM employs a Just-In-Time (JIT) compilation strategy to streamline integration and maximize

5m read time. From marktechpost.com
Table of contents
Technical Details and Benefits
Performance Insights and Considerations
Conclusion
