Kimi K2.5 is a new open-source vision language model with 1T total parameters (32.86B active) that supports text, image, and video inputs with a 262K context length. The model uses a mixture-of-experts architecture with 384 experts and achieves 3.2% parameter activation per token. Developers can access GPU-accelerated endpoints for free prototyping through build.nvidia.com, deploy using vLLM, or fine-tune with NVIDIA NeMo Framework and AutoModel for domain-specific tasks.
Table of contents
Build with NVIDIA GPU-accelerated endpointsDeploying with vLLMFine-tuning with NVIDIA NeMo FrameworkGet started with Kimi K2.5Sort: