A blog post by IBM Research on Hugging Face

IBM Research has launched the Open Agent Leaderboard, an open benchmark that evaluates full AI agent systems rather than just the underlying models. It combines six established benchmarks (SWE-Bench Verified, BrowseComp+, AppWorld, and tau2-Bench variants) under a unified protocol and reports both quality and cost per task. Key findings: agent architecture already has a meaningful effect on results, over and above model choice; general-purpose agents are competitive with specialized ones; and failed runs cost 20–54% more than successful ones. The accompanying Exgentic framework lets anyone reproduce or submit evaluations, and everything has been open source from day one.

The Open Agent Leaderboard