Google's seventh-generation TPU, Ironwood, introduces several hardware and software co-design features for training trillion-parameter models. Key optimization strategies include: native FP8 support in the MXUs, used via the Qwix library, for up to 2x the throughput of BF16; Tokamax high-performance JAX kernels featuring Splash Attention
Table of contents
Key optimization strategies for Ironwood
The Ironwood advantage: System-level performance