Google Cloud and NVIDIA announced a series of joint AI infrastructure expansions at GTC 2026. Key highlights include strong adoption of G4 VMs powered by NVIDIA RTX Pro 6000 Server Edition, a preview of fractional G4 VMs using NVIDIA vGPU technology (a first for this GPU line), and planned support for the upcoming NVIDIA Vera Rubin NVL72 platform in H2 2026. On the software side, NVIDIA Dynamo is being integrated with GKE Inference Gateway for open-source inference orchestration, and Vertex AI Training now supports A4X VM domains with GB200 NVL72 rack-scale systems plus proactive fault detection. Vertex AI Model Garden is expanding with NVIDIA Nemotron 3 models including a 120B reasoning model. A joint public sector AI startup accelerator program was also announced.

9m read time. From cloud.google.com
Table of contents
Accelerating AI workloads with G4 VMs
Introducing fractional G4 VMs
Scaling AI Hypercomputer with NVIDIA Vera Rubin NVL72
Delivering efficiency across the AI infrastructure stack
Advancing Vertex AI training and Model Garden
Empowering public sector AI startups
Co-engineering collaboration powers every layer of the AI stack
