Recent advancements in AI have led researchers to develop MLGym, a new framework and benchmark for evaluating and developing LLM agents on AI research tasks. It supports a wide range of tasks across computer vision, natural language processing (NLP), reinforcement learning (RL), and game theory, with the aim of standardizing how AI research capabilities are assessed. The MLGym-Bench includes 13