MK1's new inference engine, MK1 Flywheel, unlocks the full potential of AMD Instinct for LLM inference. It achieves performance comparable to compute-matched NVIDIA GPUs, with benchmarks showing up to 3.7x higher throughput than vLLM. AMD Instinct enters the AI market with its advanced CDNA 3 architecture, challenging NVIDIA's dominance. MK1 Flywheel has unparalleled throughput-versus-latency characteristics and integrates seamlessly into enterprise stacks. Clients considering AMD Instinct for LLM inference workloads at scale are encouraged to reach out.
Table of contents
Our Journey: Building out the Hardware
Our Journey: Building out the Software
Results: MK1 Flywheel on AMD Instinct MI210 and MI100