Embodied Question Answering (EQA) is the task of understanding an environment well enough to answer questions about it in natural language. OpenEQA is a benchmark dataset for EQA that contains over 1,600 high-quality, human-generated questions. State-of-the-art foundation models such as GPT-4V lag behind human-level performance on this dataset.