Google announced its eighth generation TPUs at Google Cloud Next, introducing two purpose-built chips: TPU 8t for training and TPU 8i for inference. TPU 8t scales to 9,600 chips per superpod with 121 ExaFlops of compute and near-linear scaling to one million chips via the new Virgo Network. TPU 8i targets low-latency agentic inference with 288 GB HBM, 384 MB on-chip SRAM (3x previous gen), doubled ICI bandwidth at 19.2 Tb/s, and a new Boardfly topology that cuts network diameter by 50%. Both chips deliver up to 2x better performance-per-watt over the previous Ironwood generation, run on Google's Axion ARM-based CPUs, and support JAX, PyTorch, SGLang, and vLLM. General availability is expected later in 2025 as part of Google's AI Hypercomputer platform.
Sort: