NVIDIA announced the Nemotron 3 model family at GTC 2026, designed as a unified agentic AI stack. The lineup includes Nemotron 3 Super (120B hybrid MoE model with 1M-token context for reasoning and planning), Nemotron 3 Content Safety (4B multimodal safety classifier supporting 12 languages), Nemotron 3 VoiceChat (12B end-to-end full-duplex speech model targeting sub-300ms latency), Nemotron 3 Nano Omni (multimodal video/audio/document understanding, coming soon), and Nemotron RAG models (Embed VL and Rerank VL for visual document retrieval). All models are released under permissive open licenses. The NeMo toolkit provides evaluation and optimization tooling for production agentic pipelines.
Table of contents
Power multi-agent systems with NVIDIA Nemotron 3 Super
Keep agents safe with Nemotron 3 Content Safety
Natural conversations with Nemotron 3 VoiceChat
Understand the world with NVIDIA Nemotron 3 Omni
Improve multimodal search with Llama Nemotron Embed VL and Rerank VL
Evaluate and optimize with NVIDIA NeMo
Start building with Nemotron