Hugging Face

Hugging Face introduces Smol2Operator, an approach for training lightweight vision-language models to perform GUI automation tasks. The method turns a base model with no GUI grounding capability into an agentic GUI coder through a two-phase training process: Phase 1 establishes GUI grounding using action-instruction pairs, and Phase 2 develops agentic reasoning. The approach reaches 61% accuracy on the ScreenSpot-v2 benchmark, and the release includes the open-source training recipes, datasets, preprocessing tools, and the resulting model so that the results can be reproduced.

Smol2Operator: Post-Training GUI Agents for Computer Use

1. Data Transformation and Unified Action Space