An overview of Google’s eighth generation TPUs, built for the agentic era.

Google announced its eighth-generation TPUs at Google Cloud Next, introducing two purpose-built chips: TPU 8t for training and TPU 8i for inference. TPU 8t scales to 9,600 chips per superpod, delivering 121 ExaFlops of compute, with near-linear scaling to one million chips over the new Virgo Network. TPU 8i targets low-latency agentic inference with 288 GB of HBM, 384 MB of on-chip SRAM (3x the previous generation), ICI bandwidth doubled to 19.2 Tb/s, and a new Boardfly topology that cuts network diameter by 50%. Both chips deliver up to 2x better performance per watt than the previous Ironwood generation, use Google's Axion Arm-based CPUs as hosts, and support JAX, PyTorch, SGLang, and vLLM. General availability is expected later in 2025 as part of Google's AI Hypercomputer platform.
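The headline numbers above imply a few figures that the summary does not state directly. The following is a rough back-of-the-envelope sketch in Python, not an official specification: it assumes the 121 ExaFlops figure divides evenly across the 9,600 chips of a superpod (the numeric precision behind that figure is not given here), and it reads "3x" and "doubled" as exact multipliers. The function name and dictionary keys are made up for illustration.

```python
import math

# Figures quoted in the summary.
SUPERPOD_CHIPS = 9_600        # TPU 8t chips per superpod
SUPERPOD_EXAFLOPS = 121       # TPU 8t compute per superpod
TARGET_CHIPS = 1_000_000      # scaling target over the Virgo Network
SRAM_MB = 384                 # TPU 8i on-chip SRAM
SRAM_GAIN = 3                 # "3x the previous generation"
ICI_TBPS = 19.2               # TPU 8i ICI bandwidth, terabits per second
ICI_GAIN = 2                  # "doubled"


def derived_figures() -> dict:
    """Derive implied figures; every value is an estimate, not a published spec."""
    return {
        # Assumes compute is split evenly across chips (1 ExaFlop = 1,000 PetaFlops).
        "per_chip_petaflops": SUPERPOD_EXAFLOPS * 1_000 / SUPERPOD_CHIPS,
        # Whole superpods needed to reach the one-million-chip target.
        "superpods_for_target": math.ceil(TARGET_CHIPS / SUPERPOD_CHIPS),
        # Implied previous-generation values, treating the multipliers as exact.
        "implied_prev_sram_mb": SRAM_MB / SRAM_GAIN,
        "implied_prev_ici_tbps": ICI_TBPS / ICI_GAIN,
        # Same ICI bandwidth expressed in terabytes per second (8 bits per byte).
        "ici_terabytes_per_s": ICI_TBPS / 8,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, a superpod works out to roughly 12.6 PetaFlops per chip, and reaching one million chips would take about 105 superpods.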

Our eighth generation TPUs: two chips for the agentic era