Cast AI now supports automated provisioning and scale-to-zero for Google Cloud TPU v5e and v5p slices on GKE. Teams can treat TPUs as standard compute resources managed through a single control plane alongside CPUs, GPUs, and AWS Trainium/Neuron. The integration requires no additional device plugins: pod specs with standard node selectors and tolerations trigger automatic TPU slice provisioning. Current limitations include single-host slices only and GKE-only availability at launch, with broader topology support planned over the course of the year.
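To illustrate the manifest-only workflow described above, a minimal single-host TPU v5e pod spec might look like the sketch below. The selector and toleration keys assume GKE's standard TPU node labels and taint (`cloud.google.com/gke-tpu-accelerator`, `cloud.google.com/gke-tpu-topology`, `google.com/tpu`); the pod name and container image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tpu-train            # hypothetical name
spec:
  nodeSelector:
    # Standard GKE TPU node selectors; values shown are for a
    # single-host v5e slice with a 2x2 topology.
    cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
    cloud.google.com/gke-tpu-topology: 2x2
  tolerations:
    # GKE taints TPU node pools with google.com/tpu.
    - key: google.com/tpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: train
      image: us-docker.pkg.dev/my-project/train:latest  # hypothetical image
      resources:
        requests:
          google.com/tpu: 4   # TPU chips requested from the slice
        limits:
          google.com/tpu: 4
```

Under the integration described, a pending pod like this would trigger provisioning of a matching TPU slice, with no extra device plugin or custom scheduler required.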
Table of contents

- Maximize utilization through automated hardware lifecycle management
- Unified Automation for Specialized AI Hardware
- How It Works: One Manifest, No New Plugins
- Technical Scope and Availability
- Take the Toil Out of AI Infrastructure