Nvidia's new Vera Rubin architecture, announced at CES 2025, promises 10x lower inference costs and 4x fewer GPUs for training compared with Blackwell. While the Rubin GPU delivers 50 petaflops of 4-bit (FP4) compute, the real innovation lies in six new chips working together through extreme co-design.

The NVLink 6 switch doubles GPU-to-GPU bandwidth to 3,600 GB/s, while expanded in-network computing offloads collective operations such as all-reduce from the GPUs to the switches themselves, cutting both latency and power consumption. The scale-out network pairs ConnectX-9 NICs and BlueField-4 DPUs with Spectrum-6 Ethernet switches, which use co-packaged optics to minimize jitter when connecting racks. Nvidia sees connecting multiple data centers as the next frontier for distributed AI workloads that exceed 100,000 GPUs.
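To put the bandwidth doubling in perspective, here is a back-of-envelope sketch. The 3,600 GB/s figure is from the article; the 1,800 GB/s baseline is Nvidia's published per-GPU NVLink bandwidth for Blackwell. The 70 GB payload is an arbitrary illustrative figure, roughly the FP8 weights of a 70B-parameter model.

```python
# Illustrative only: compares GPU-to-GPU transfer times at the two
# NVLink generations' per-GPU bandwidths (ignores protocol overhead).

NVLINK5_GBPS = 1_800  # GB/s per GPU (Blackwell, published figure)
NVLINK6_GBPS = 3_600  # GB/s per GPU (Rubin, per the article)

def transfer_time_ms(payload_gb: float, bandwidth_gbps: float) -> float:
    """Time in milliseconds to move payload_gb gigabytes at bandwidth_gbps GB/s."""
    return payload_gb / bandwidth_gbps * 1_000

payload_gb = 70  # hypothetical payload, e.g. one replica of large model weights
t5 = transfer_time_ms(payload_gb, NVLINK5_GBPS)
t6 = transfer_time_ms(payload_gb, NVLINK6_GBPS)
print(f"NVLink 5: {t5:.1f} ms, NVLink 6: {t6:.1f} ms")
```

Doubling link bandwidth halves the raw transfer time, which matters most for the frequent, latency-sensitive collectives (all-reduce, all-to-all) that dominate communication in large-scale training.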