Are AI agents ready for the workplace? A new benchmark raises doubts.
New research from Mercor introduces Apex-Agents, a benchmark testing AI models on real white-collar tasks from consulting, investment banking, and law. Leading models achieved only 24% accuracy at best, with Gemini 3 Flash and GPT-5.2 performing strongest. The main challenge is multi-domain reasoning across tools like Slack and