OSWorld is a platform that provides a realistic, scalable testing environment for autonomous computer agents. It supports task setup, evaluation, and interactive learning, and lets agents interact freely with any application installed on the system. The platform includes a curated benchmark of real-world computer tasks, which exposes the deficiencies of current language models and vision-language models. OSWorld opens up research on improving GUI interaction and agent architectures, addressing safety challenges, and expanding the data and environments available for agent development.