Meta unleashes Llama API running 18x faster than OpenAI: Cerebras partnership delivers 2,600 tokens per second
Meta has partnered with Cerebras Systems to launch the Llama API, offering AI inference speeds up to 18 times faster than traditional GPU solutions and positioning Meta as a competitor to OpenAI and Google in the AI services market. The partnership uses Cerebras' specialized AI chips to run Meta's Llama models at up to 2,600 tokens per second.
Table of contents

- Breaking the speed barrier: How Cerebras supercharges Llama models
- From open source to revenue stream: Meta's AI business transformation
- Inside Cerebras' North American data center network powering Meta's AI ambitions
- Disrupting the AI ecosystem: How Meta's 20x performance leap changes the game
- How developers can access Meta's ultra-fast Llama models today