MATHVISTA is a benchmark for assessing the mathematical reasoning abilities of Large Language Models (LLMs) and Large Multimodal Models (LMMs) in visual contexts. Results on the benchmark reveal that these models still fall short of human performance, which underscores the need for further advancement. The benchmark combines diverse mathematical and graphical challenges and aims to improve mathematical reasoning capabilities for real-world applications.