Google's New TPU Quietly Ends the GPU Era?


Google announced its 8th-generation TPU (TPU v8, codenamed Trillium) as two distinct chips: TPU v8 for training and TPU v8i for inference. The post explains from first principles why CPUs, GPUs, and TPUs differ architecturally, how systolic arrays make TPUs efficient at matrix multiplication, and why training and inference stress hardware in fundamentally different ways. A TPU v8 training pod connects 9,600 chips delivering 121 exaflops, while the inference chip carries 384 MB of on-chip SRAM to cut KV-cache latency. By splitting one generation into two specialized chips, Google signals a broader industry shift away from Nvidia's one-chip-for-everything strategy.
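To make the systolic-array idea concrete, here is a minimal sketch (not Google's actual hardware schedule, just an illustrative model): in an output-stationary systolic array, each processing element (i, j) owns one cell of the result, and the operands are skewed in time so that the right pair of values meets at each PE on each cycle. The `systolic_matmul` function below is a hypothetical toy simulation of that dataflow.

```python
# Toy simulation of an output-stationary systolic array computing C = A @ B.
# PE (i, j) accumulates output cell C[i][j]; operands are skewed so that on
# cycle t the PE sees A[i][s] and B[s][j] where s = t - i - j. Each k-index
# reaches each PE exactly once, so the accumulation equals a plain matmul.

def systolic_matmul(A, B):
    n, k = len(A), len(A[0])
    m = len(B[0])
    C = [[0] * m for _ in range(n)]
    # Total cycles needed: the last PE (n-1, m-1) finishes its last
    # multiply-accumulate at t = (n-1) + (m-1) + (k-1).
    for t in range(n + m + k - 2):
        for i in range(n):
            for j in range(m):
                s = t - i - j  # skewed arrival time of the k-th operand pair
                if 0 <= s < k:
                    C[i][j] += A[i][s] * B[s][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))  # [[19, 22], [43, 50]]
```

The point of the skewed schedule is that no PE ever fetches an operand from memory more than once per cycle: values stream in from neighbors, which is why systolic arrays achieve high arithmetic intensity on dense matrix multiplication.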

