Google Cloud announced a major expansion of its AI Hypercomputer infrastructure at Google Cloud Next. Key highlights include eighth-generation TPUs: TPU 8t for training (121 exaflops, 9,600 chips per superpod, ~3x performance over prior gen) and TPU 8i for inference/RL (384 MB on-chip SRAM, 80% better perf/dollar). Other announcements include A5X bare metal instances with NVIDIA Vera Rubin NVL72, Axion N4A CPU VMs, the Virgo Network fabric (4x bandwidth, supports 134K TPUs or 80K GPUs in a single data center), Google Cloud Managed Lustre at 10 TB/s bandwidth, GKE improvements with 4x faster node startup and 80% faster pod startup, native PyTorch support for TPUs (TorchTPU), and the llm-d distributed LLM inference framework now a CNCF Sandbox project.
Table of contents
Fueling agentic logic and reinforcement learning with Axion, Intel, and AMDVirgo Network for data center scale-out fabricStorage: Minimizing data bottlenecksGKE: Orchestration for agent-native workloadsSort: