Meta details the rapid evolution of its in-house MTIA (Meta Training and Inference Accelerator) chip family, developed with Broadcom. Four generations (MTIA 300, 400, 450, and 500) shipped in under two years, with HBM bandwidth increasing 4.5x and compute FLOPS increasing 25x from MTIA 300 to MTIA 500. The strategy rests on three pillars: high velocity, through a modular chiplet architecture that enables a roughly 6-month release cadence; inference-first optimization targeting GenAI workloads; and frictionless adoption, via a PyTorch-native software stack with vLLM, Triton, and OCP standards. The software stack includes custom compilers built on Torch FX IR, TorchInductor, MLIR, and LLVM, a dedicated collective communications library (HCCL), and a Rust-based user-space driver and firmware. MTIA chips are deployed at scale to power AI experiences for billions of users across Meta's platforms.