NVIDIA Megatron-Core, the successor to Megatron-LM, is a PyTorch-based library designed for efficient large-scale training of transformer models. It offers GPU-optimized techniques, modular APIs, and support for multimodal training. Key features include activation recomputation, distributed checkpointing, and expert parallelism for mixture-of-experts models.
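To make the activation-recomputation idea concrete, here is a minimal sketch using plain PyTorch's `torch.utils.checkpoint` API. Megatron-Core ships its own optimized recomputation strategies (this is not its API), but the underlying trade-off is the same: intermediate activations are not stored during the forward pass and are recomputed during backward, trading extra compute for lower memory. The `Block` module and tensor shapes below are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """A hypothetical feed-forward residual block used only for illustration."""
    def __init__(self, dim):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        return x + self.ff(x)

dim = 64
blocks = nn.ModuleList(Block(dim) for _ in range(4))
x = torch.randn(8, dim, requires_grad=True)

h = x
for blk in blocks:
    # Activations inside blk are not kept; they are recomputed in backward.
    h = checkpoint(blk, h, use_reentrant=False)
h.sum().backward()
```

After `backward()`, `x.grad` is populated exactly as it would be without checkpointing; only the memory/compute trade-off differs.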

From developer.nvidia.com
Table of contents

- NVIDIA Megatron-Core
- Multimodal training is now supported in Megatron-Core
- Training throughput optimization for mixture of experts
- Fast distributed checkpointing for better training resiliency
- Improved scalability
- Get started
