The Azure Kubernetes Service (AKS) team at Microsoft has shared guidance for running Anyscale's managed Ray service at scale. They focus on three key issues: GPU capacity limits, scattered ML storage, and credential expiry.

InfoQ

Microsoft's AKS team has published guidance for running Anyscale's managed Ray service at scale on Azure Kubernetes Service, addressing three core operational challenges. For GPU scarcity, they recommend a multi-cluster, multi-region setup using Azure Arc to aggregate quota and reroute workloads. For scattered ML storage, they propose Azure BlobFuse2 to mount Blob Storage as a POSIX filesystem into Ray worker pods, enabling shared data access with local caching. For credential expiry, the new approach uses Microsoft Entra service principals with AKS workload identity to issue short-lived tokens automatically, eliminating manual rotation. The integration is currently in private preview. Notably, AWS and Google Cloud have made similar Anyscale partnerships, suggesting the industry is converging on Kubernetes-plus-Ray for AI workloads, with competition shifting to which cloud best streamlines the surrounding infrastructure.
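
The rerouting idea behind the multi-cluster, multi-region recommendation can be illustrated with a small sketch. This is not Azure Arc's or Anyscale's API: the `Cluster` class, the `place_job` function, the cluster names, and the quota numbers are all hypothetical, chosen only to show how free GPU quota might be aggregated across regions and how a job could fall back to another region when the preferred one is full.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Cluster:
    """Hypothetical view of one AKS cluster and its regional GPU quota."""
    name: str
    region: str
    gpu_quota: int    # GPUs the subscription may use in this region (assumed)
    gpus_in_use: int  # GPUs already allocated to running workloads

    @property
    def free_gpus(self) -> int:
        return max(self.gpu_quota - self.gpus_in_use, 0)


def total_free_gpus(clusters: list[Cluster]) -> int:
    """Aggregate the free GPU capacity across every registered cluster."""
    return sum(c.free_gpus for c in clusters)


def place_job(
    clusters: list[Cluster], gpus_needed: int, preferred_region: str
) -> Optional[Cluster]:
    """Place a job in the preferred region if it fits, else reroute it.

    Returns the chosen cluster, or None when no single cluster has
    enough free GPUs for the job.
    """
    candidates = [c for c in clusters if c.free_gpus >= gpus_needed]
    if not candidates:
        return None
    # Preferred region sorts first; ties break toward the most free GPUs.
    candidates.sort(key=lambda c: (c.region != preferred_region, -c.free_gpus))
    chosen = candidates[0]
    chosen.gpus_in_use += gpus_needed  # record the allocation
    return chosen


# Illustrative fleet; names and numbers are made up.
fleet = [
    Cluster("aks-east", "eastus", gpu_quota=8, gpus_in_use=6),
    Cluster("aks-west", "westus2", gpu_quota=16, gpus_in_use=4),
    Cluster("aks-eu", "westeurope", gpu_quota=8, gpus_in_use=0),
]
```

With this fleet, a 4-GPU job that prefers `eastus` does not fit there (only 2 GPUs are free), so `place_job(fleet, 4, "eastus")` would reroute it to the cluster with the most headroom. A real deployment would take quota and usage from the cloud provider and the cluster scheduler rather than from hard-coded values.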