Physical Intelligence (Pi) runs real-time remote inference for robotic control on Modal's cloud GPU infrastructure. Its Vision-Language-Action model requires continuous inference cycles across a fleet of robots. Rather than using standard TCP-based Modal Tunnels, Pi and Modal co-developed a QUIC-based transport over UDP with automatic NAT traversal, adding only 10-15 ms of network overhead. This setup lets Pi use larger data-center-class GPUs, load model checkpoints in under 30 seconds via Modal Volumes, and expand globally by pinning inference to regions close to its robots, all without shipping hardware or building local clusters.
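As a rough illustration, a region-pinned Modal function backed by a checkpoint Volume might be declared as follows. The app, Volume, and function names here are hypothetical, and the custom QUIC/UDP transport is a lower-level Modal feature that does not appear in application code; this is a configuration sketch, not Pi's actual deployment:

```python
import modal

# Hypothetical app name for illustration only.
app = modal.App("pi-vla-inference")

# A Modal Volume holding model checkpoints; mounting it lets a fresh
# container load weights from a shared filesystem rather than pulling
# them from object storage on every cold start.
checkpoints = modal.Volume.from_name("vla-checkpoints", create_if_missing=True)

@app.function(
    gpu="H100",                         # data-center-class GPU
    region="us-east",                   # pin inference near the robot fleet
    volumes={"/checkpoints": checkpoints},
)
def infer(observation: bytes) -> bytes:
    # Placeholder: load the checkpoint from /checkpoints and run one
    # action-prediction step of the Vision-Language-Action model.
    ...
```

The design point is that region pinning and Volume mounts are declared per-function, so the same app can serve robots in multiple regions by deploying region-specific variants of the function.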