AI applications are moving beyond text generation to multimodal systems that can perceive, search, and reason across images, documents, video…

NVIDIA DevTalk serves as a vibrant community hub where developers can engage in discussions, seek assistance, and collaborate on projects involving NVIDIA hardware and software. Developers can tap into the collective expertise of the NVIDIA developer community, sharing insights, troubleshooting issues, and exploring best practices for GPU programming and AI development. Additionally, DevTalk provides a platform for developers to showcase their projects, receive feedback, and network with peers, fostering collaboration and knowledge exchange within the NVIDIA ecosystem.

NVIDIA Developer

Step 3.7 Flash, a 198B-parameter Mixture-of-Experts vision-language model from StepFun, is now available on NVIDIA-accelerated infrastructure. With ~11B active parameters per forward pass, native image/video input, three reasoning levels, and a 256k context window, it targets enterprise use cases like financial analysis and concurrent coding agents. Developers can deploy it via SGLang, TensorRT-LLM, or vLLM, or use NVIDIA NIM containerized microservices for production. An NVFP4-quantized checkpoint is available on Hugging Face. Fine-tuning is supported through NVIDIA NeMo Automodel (SFT and LoRA at 600 tokens/sec on Hopper GPUs) and NeMo Megatron-Bridge for large-scale training. NVIDIA DGX Station is highlighted for local development with 748 GB coherent memory.

Run Step 3.7 Flash on NVIDIA GPUs with Enterprise-Ready Multimodal AI

Production-ready deployment with NVIDIA NIM

Day 0 fine-tuning with NVIDIA NeMo Framework