The Register is a technology news website covering IT industry news, hardware reviews, and software updates. Its articles, analysis, and opinion pieces report on cybersecurity threats, technology trends, and developments in the IT business.

The Register

Perplexity has developed software optimizations that let trillion-parameter mixture-of-experts (MoE) models run efficiently on older, cheaper GPU hardware using AWS Elastic Fabric Adapter (EFA). The new kernels address memory and network latency challenges by optimizing communication between GPUs across multiple nodes, and they achieve lower latency than existing solutions such as DeepSeek's DeepEP. In testing on AWS H200 instances with models including DeepSeek V3 and Kimi K2, the kernels showed substantial performance gains at medium batch sizes, making distributed MoE inference viable without expensive GB200/GB300 systems. The optimizations are open source and available on GitHub.
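To illustrate why communication between GPUs across nodes matters for MoE inference, here is a minimal sketch that counts where routed token copies end up when experts are sharded over several nodes. It is not Perplexity's kernel code: the function name `dispatch_plan`, the random stand-in for the learned router, the round-robin token placement, and the example sizes are all assumptions made for illustration.

```python
import random
from collections import Counter


def dispatch_plan(num_tokens, num_experts, top_k, gpus_per_node, num_nodes, seed=0):
    """Count routed token copies that stay local vs. cross the network.

    Hypothetical helper for illustration only. Experts are sharded evenly
    across GPUs, and each token is sent to top_k distinct experts.
    """
    num_gpus = gpus_per_node * num_nodes
    if num_experts % num_gpus != 0:
        raise ValueError("num_experts must be divisible by the GPU count")
    experts_per_gpu = num_experts // num_gpus

    rng = random.Random(seed)
    traffic = Counter()
    for token in range(num_tokens):
        # Assumption: tokens are spread round-robin over the GPUs.
        src_gpu = token % num_gpus
        # Assumption: a uniform random choice stands in for the learned router.
        for expert in rng.sample(range(num_experts), top_k):
            dst_gpu = expert // experts_per_gpu
            if dst_gpu == src_gpu:
                traffic["same_gpu"] += 1
            elif dst_gpu // gpus_per_node == src_gpu // gpus_per_node:
                traffic["same_node"] += 1
            else:
                # These copies must travel over the inter-node fabric.
                traffic["cross_node"] += 1
    return traffic


if __name__ == "__main__":
    # Illustrative sizes only: 128 tokens, 256 experts, top-8 routing,
    # two nodes with 8 GPUs each.
    plan = dispatch_plan(num_tokens=128, num_experts=256, top_k=8,
                         gpus_per_node=8, num_nodes=2)
    total = sum(plan.values())
    for kind in ("same_gpu", "same_node", "cross_node"):
        print(f"{kind}: {plan[kind]} of {total} token copies")
```

With more than one node, a sizeable share of token copies lands in the `cross_node` bucket under this uniform-routing assumption, and each such copy must be sent out and its result gathered back over the network for every MoE layer. That is the traffic whose latency the summary describes as the target of the new kernels.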

How Perplexity optimized 1T parameter AI models for AWS EFA