Explores advanced task and motion planning (TAMP) systems that integrate perception with robot manipulation for long-horizon tasks. Covers the OWL-TAMP, VLM-TAMP, and NOD-TAMP frameworks, which use vision-language models to translate natural language instructions into robot actions. Introduces cuTAMP, a GPU-accelerated planner that solves complex manipulation tasks in seconds, and the Fail2Progress framework, which lets robots learn from their failures using Stein variational inference. Together, these approaches address the limitations of traditional TAMP by enabling robots to adapt to dynamic environments and handle multi-step tasks of 30–50 sequential actions.

6 min read · From developer.nvidia.com
Table of contents

- How task and motion planning transforms vision and language into robot action
- How cuTAMP accelerates robot planning with GPU parallelization
- How robots learn from failures using Stein variational inference
- Getting started
- Acknowledgments
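The summary says cuTAMP solves complex manipulation tasks in seconds by exploiting GPU parallelism. A minimal sketch of that idea, assuming a toy 2D placement problem: sample thousands of candidate plan parameters, then refine them all at once with batched gradient descent on a cost that combines goal distance and a collision penalty. The cost function, obstacle, and particle count below are illustrative assumptions, not cuTAMP's actual API.

```python
# Hypothetical sketch of GPU-parallel plan-parameter refinement (cuTAMP-style).
# All names here (placement_cost, refine, GOAL, OBSTACLE) are illustrative.
import jax
import jax.numpy as jnp

GOAL = jnp.array([0.8, 0.2])      # desired object position (toy problem)
OBSTACLE = jnp.array([0.5, 0.5])  # center of a circular obstacle (toy problem)
RADIUS = 0.15                     # obstacle radius

def placement_cost(xy):
    """Cost of one candidate placement: squared distance to the goal
    plus a smooth hinge penalty for penetrating the obstacle."""
    goal_term = jnp.sum((xy - GOAL) ** 2)
    penetration = jnp.maximum(0.0, RADIUS ** 2 - jnp.sum((xy - OBSTACLE) ** 2))
    return goal_term + 10.0 * penetration

# Vectorize cost and gradient over a whole batch of candidates; on a GPU
# backend, every particle is evaluated in parallel.
batched_cost = jax.vmap(placement_cost)
batched_grad = jax.vmap(jax.grad(placement_cost))

@jax.jit
def refine(particles):
    """Gradient-descend all particles toward low-cost placements at once."""
    def step(p, _):
        return p - 0.01 * batched_grad(p), None
    particles, _ = jax.lax.scan(step, particles, None, length=500)
    return particles

key = jax.random.PRNGKey(0)
particles = jax.random.uniform(key, (4096, 2))  # thousands of candidates
particles = refine(particles)
best = particles[jnp.argmin(batched_cost(particles))]
print("best placement found:", best)
```

The design point is that cost evaluation and refinement are embarrassingly parallel across candidates, so widening the batch costs little wall-clock time on a GPU, which is one plausible reading of how planning time drops to seconds.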
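The article also says Fail2Progress learns from failures using Stein variational inference. The sketch below shows the core mechanic, Stein variational gradient descent (SVGD, Liu & Wang 2016), on a stand-in target density: each particle follows a kernel-weighted score term toward high-probability regions, plus a repulsive kernel-gradient term that keeps the particle set diverse. The Gaussian target, bandwidth heuristic, and step size are assumptions for illustration, not the framework's actual model.

```python
# Minimal SVGD sketch on a toy target density; not Fail2Progress's actual code.
import jax
import jax.numpy as jnp

def log_density(x):
    """Toy target: standard 2D Gaussian, up to an additive constant."""
    return -0.5 * jnp.sum(x ** 2)

@jax.jit
def svgd_step(particles, lr=0.5):
    """One SVGD update with an RBF kernel and the median-distance bandwidth."""
    n = particles.shape[0]
    score = jax.vmap(jax.grad(log_density))(particles)  # (n, d) score vectors
    sq_dists = jnp.sum((particles[:, None] - particles[None, :]) ** 2, axis=-1)
    h = jnp.median(sq_dists) / jnp.log(n + 1.0)         # bandwidth heuristic
    k = jnp.exp(-sq_dists / h)                          # (n, n) kernel matrix
    # Attractive term: kernel-weighted scores pull particles toward mass.
    attract = k @ score
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) spreads the particles out.
    repel = 2.0 / h * (k.sum(axis=1, keepdims=True) * particles - k @ particles)
    return particles + lr * (attract + repel) / n

key = jax.random.PRNGKey(1)
particles = jax.random.normal(key, (256, 2)) + 4.0  # initialized far from target
for _ in range(500):
    particles = svgd_step(particles)
# The particle cloud should now approximate the target (mean near zero).
print("mean:", particles.mean(axis=0), "std:", particles.std(axis=0))
```

Unlike a single gradient-descent run, the repulsive term makes the particles approximate a whole distribution rather than collapse to one mode, which is what makes the technique useful for capturing uncertainty about how and why an action failed.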
