We're building AI Gateway into a unified inference layer, letting developers call models from 14+ providers. New features include Workers AI binding integration and an expanded catalog with multimodal models.

Cloudflare

Cloudflare is evolving AI Gateway into a unified inference layer that lets developers access 70+ models from 12+ providers through a single API and one set of credits. Key updates include calling third-party models (OpenAI, Anthropic, Google, and others) through the same AI.run() Workers binding with a one-line switch, centralized cost monitoring with breakdowns by custom metadata, automatic failover routing when a provider goes down, and streaming response buffering for resilient long-running agents. Cloudflare is also enabling developers to bring their own fine-tuned models to Workers AI via Replicate's Cog containerization technology. The platform now includes multimodal models (image, video, speech) and large agent-optimized models such as Kimi K2.5, all served from Cloudflare's 330-city global network to minimize latency.
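To make the failover idea concrete, here is a minimal sketch of the general pattern: try an ordered list of model routes and return the first response that succeeds. This is not Cloudflare's implementation, and AI Gateway is described as handling this routing for you. The names `Route`, `ModelCall`, and `runWithFailover` are hypothetical, and each `call` function stands in for whatever client you actually use to reach a provider.

```typescript
// Hypothetical shape of a model call: takes a prompt, resolves to text.
type ModelCall = (prompt: string) => Promise<string>;

// One candidate route. `model` is an illustrative label, not a real catalog ID.
interface Route {
  model: string;
  call: ModelCall;
}

interface FailoverResult {
  model: string; // the route that finally answered
  output: string;
  failures: { model: string; error: string }[]; // routes that failed first
}

// Try each route in order; return the first success, recording failures.
async function runWithFailover(
  routes: Route[],
  prompt: string
): Promise<FailoverResult> {
  const failures: { model: string; error: string }[] = [];
  for (const route of routes) {
    try {
      const output = await route.call(prompt);
      return { model: route.model, output, failures };
    } catch (err) {
      // Record the error and fall through to the next route.
      const error = err instanceof Error ? err.message : String(err);
      failures.push({ model: route.model, error });
    }
  }
  // Every route failed: surface all collected errors to the caller.
  const detail = failures.map((f) => `${f.model} (${f.error})`).join(", ");
  throw new Error(`All ${routes.length} routes failed: ${detail}`);
}
```

In this sketch the caller decides the order of routes, and the `failures` list keeps a record of which providers were skipped, which is the kind of information a centralized monitoring view would want to log.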

Cloudflare’s AI Platform: an inference layer designed for agents

Built for reliability with automatic failover