DeepSpeed trained the world's most powerful language models (MT-530B, BLOOM) Azure and DeepSpeed empower easy-to-use and high-performance model training. DeepSpeed is an important part of Microsoft’s new AI at Scale initiative to enable next-generation AI capabilities at scale. It is not tied to specific PyTorch or CUDA versions.

7m read timeFrom github.com
Post cover image
Table of contents
Extreme Speed and Scale for DL Training and InferenceDeepSpeed's three innovation pillarsDeepSpeed Software SuiteDeepSpeed AdoptionBuild Pipeline StatusInstallationFeaturesFurther ReadingContributingPublicationsVideos

Sort: