Google made its seventh-generation TPU, Ironwood, generally available at Cloud Next 2026. Each chip delivers 4.6 petaFLOPS of FP8 compute and carries 192GB of HBM3e memory, and a 9,216-chip superpod reaches 42.5 exaFLOPS, over 24x the capacity of the world's most powerful supercomputer. Ironwood is purpose-built for inference workloads, including LLM serving, MoE architectures, and diffusion models. Google also previewed an eighth generation split into two chips: TPU 8t (Sunfish), a Broadcom-designed training chip, and TPU 8i (Zebrafish), a MediaTek-designed inference chip. Both are built on TSMC 2nm and target late 2027. This is the first time Google has purpose-built separate chips for training and inference. Anthropic, whose compute deal has expanded to 3.5 gigawatts in 2027, is the anchor customer for both generations. Google projects 35 million TPU shipments by 2028, backed by up to $185 billion in annual infrastructure spending.
Table of contents
Why inference, why now
Two chips for the eighth generation
The supply chain behind the chips
The customer that matters most
The Nvidia question