Nvidia's new Vera Rubin architecture, announced at CES 2025, promises 10x lower inference costs and 4x fewer GPUs for training compared to Blackwell. While the Rubin GPU delivers 50 petaflops of 4-bit (FP4) compute, the real innovation lies in six new chips working together through extreme co-design. The NVLink6 switch doubles GPU-to-GPU bandwidth to 3,600 GB/s, while expanded in-network computing offloads operations from GPUs onto the switches themselves, reducing latency and power consumption. The scale-out network uses ConnectX-9 NICs, BlueField-4 DPUs, and Spectrum-6 Ethernet switches with co-packaged optics to minimize jitter when connecting racks. Nvidia sees connecting multiple data centers as the next frontier for distributed AI workloads exceeding 100,000 GPUs.
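To see why offloading collectives to the switch helps, consider an all-reduce across N GPUs. A minimal sketch (purely illustrative, not Nvidia's implementation) comparing the number of sequential communication steps in a classic GPU-to-GPU ring all-reduce against a switch-offloaded reduction, where the switch hardware sums contributions and multicasts the result back:

```python
# Illustrative model only: compares sequential communication steps for an
# all-reduce done peer-to-peer (ring) versus offloaded to a reducing switch
# (in-network compute, in the style of SHARP). Step counts are the textbook
# values, not measurements of any real system.

def ring_allreduce_steps(num_gpus: int) -> int:
    """Ring all-reduce: (N-1) reduce-scatter steps + (N-1) all-gather steps."""
    return 2 * (num_gpus - 1)

def in_network_allreduce_steps(num_gpus: int) -> int:
    """Switch-offloaded: each GPU sends its buffer once; the switch reduces
    and multicasts the result back -- two steps regardless of GPU count."""
    return 2

for n in (8, 72, 1024):
    print(f"{n:>5} GPUs: ring={ring_allreduce_steps(n):>4} steps, "
          f"in-network={in_network_allreduce_steps(n)} steps")
```

The ring's step count grows linearly with GPU count, while the in-network version stays constant, which is why moving reductions into the switch cuts latency at scale, and the GPUs spend fewer cycles shuffling partial sums.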

From spectrum.ieee.org (5-minute read)