Researchers from CMU and Writer.com introduce OmniACT, a dataset and benchmark for agents that automate computer tasks. OmniACT combines visual and textual data as the input from which an agent must generate precise action scripts. Evaluation shows that current autonomous agents still fall well short of human performance. Further advances in multimodal models are needed to close that gap and improve human-computer interaction.
4m read time • From marktechpost.com