Google announced its eighth-generation TPUs at Google Cloud Next, introducing two purpose-built chips: TPU 8t for training and TPU 8i for inference. TPU 8t scales to 9,600 chips per superpod, delivering 121 ExaFlops of compute, and offers near-linear scaling to one million chips via the new Virgo Network. TPU 8i targets low-latency agentic inference with 288 GB of HBM, 384 MB of on-chip SRAM (3x the previous generation), doubled ICI bandwidth at 19.2 Tb/s, and a new Boardfly topology that cuts network diameter by 50%. Both chips deliver up to 2x better performance per watt than the previous Ironwood generation, run on Google's Arm-based Axion CPUs, and support JAX, PyTorch, SGLang, and vLLM. General availability is expected later in 2025 as part of Google's AI Hypercomputer platform.

8m read time
From blog.google
Table of contents
Two chips to meet the moment
TPU 8t: The training powerhouse
