Cast AI now supports automated provisioning and scale-to-zero for Google Cloud TPU v5e and v5p slices on GKE. Teams can treat TPUs as standard compute resources managed through a single control plane alongside CPUs, GPUs, and AWS Trainium/Neuron. The integration requires no additional device plugins: pod specs with standard node selectors and tolerations trigger automatic TPU slice provisioning. Current limitations include single-host slices only and GKE-only availability at launch, with broader topology support planned over the course of the year.
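To illustrate the manifest-only workflow described above, a minimal single-host TPU v5e pod spec might look like the sketch below. The selector and toleration keys assume GKE's standard TPU node labels and taint (`cloud.google.com/gke-tpu-accelerator`, `cloud.google.com/gke-tpu-topology`, `google.com/tpu`); the pod name and container image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tpu-train            # hypothetical name
spec:
  nodeSelector:
    # Standard GKE TPU node selectors; values shown are for a
    # single-host v5e slice with a 2x2 topology.
    cloud.google.com/gke-tpu-accelerator: tpu-v5-lite-podslice
    cloud.google.com/gke-tpu-topology: 2x2
  tolerations:
    # GKE taints TPU node pools with google.com/tpu.
    - key: google.com/tpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: train
      image: us-docker.pkg.dev/my-project/train:latest  # hypothetical image
      resources:
        requests:
          google.com/tpu: 4   # TPU chips requested from the slice
        limits:
          google.com/tpu: 4
```

Under the integration described, a pending pod like this would trigger provisioning of a matching TPU slice, with no extra device plugin or custom scheduler required.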
Table of contents

- Maximize utilization through automated hardware lifecycle management
- Unified Automation for Specialized AI Hardware
- How It Works: One Manifest, No New Plugins
- Technical Scope and Availability
- Take the Toil Out of AI Infrastructure