NVIDIA GB200/GB300 NVL72 rack-scale supercomputers present a topology mismatch with traditional schedulers that treat GPUs as a flat pool. The post explains how cluster UUIDs and clique IDs encode NVLink domain membership, and how NVIDIA Mission Control bridges hardware topology to schedulers. For Slurm, the topology/block plugin maps NVLink partitions to scheduling blocks for locality-aware placement. For Kubernetes, the NVIDIA DRA GPU driver introduces ComputeDomains—first-class objects that group nodes sharing an NVLink/MNNVL fabric and manage IMEX lifecycle per workload. NVIDIA Run:ai automates topology detection, ComputeDomain creation, and topology-aware pod placement on top of Kubernetes. Topograph, an open-source tool, auto-discovers cluster topology and feeds it to schedulers, eliminating manual topology modeling.

9m read time. From developer.nvidia.com
Table of contents
The core challenge: Rack-scale topology meets AI workload scheduling
Scheduling Multi-Node NVLink workloads with Slurm
IMEX management with Slurm: From rack-level service to per-job isolation
Extending multi-node NVLink support to Kubernetes and NVIDIA Run:ai
How NVIDIA Run:ai simplifies distributed workloads on NVLink domains
Automatic topology detection with Topograph
Learn more about advanced AI operations
