With the release of our new inference engine, MK1 Flywheel, we are excited
to report that AMD Instinct series GPUs can achieve performance comparable
to compute-matched NVIDIA GPUs. We designed MK1 Flywheel for maximum
performance on both AMD and NVIDIA chips: our benchmarks demonstrate up to
3.7x higher throughput than vLLM.

MK1 Flywheel, MK1's new inference engine, unlocks the full potential of AMD Instinct for LLM inference. It achieves performance comparable to compute-matched NVIDIA GPUs, with benchmarks showing up to 3.7x higher throughput than vLLM. AMD Instinct enters the AI market with its advanced CDNA 3 architecture, challenging NVIDIA's dominance. MK1 Flywheel has unparalleled throughput-versus-latency characteristics and integrates seamlessly into enterprise stacks. Clients considering AMD Instinct for LLM inference workloads at scale are encouraged to reach out.

MK1 Flywheel Unlocks the Full Potential of AMD Instinct for LLM Inference — MK 1

Results: MK1 Flywheel on AMD Instinct MI210 and MI100