Physical Intelligence presents its vision-language-action (VLA) models, which enable robots to perform complex, dexterous tasks in real-world environments. Unlike traditional robotics, which is largely limited to structured factory settings, the approach combines multimodal AI with robotics to produce models that generalize across different robots and environments. The company has built a data collection pipeline based on human teleoperation, scaling its training data from 3,800 hours to over 10,000 hours of successful robot demonstrations. Its latest model, PI-0.5, can perform long-horizon tasks of up to 10 minutes in homes not seen during training, which demonstrates generalization to new environments. The company emphasizes that software intelligence, rather than hardware, is the main bottleneck to scaling robotics.