Best of Hugging Face — 2025

1
Article
Hugging Face·1y
Tiny Agents in Python: a MCP-powered agent in ~70 lines of code
The post introduces a method to create MCP-powered agents in Python, highlighting a simplified setup for integrating external tools with large language models (LLMs). By using the Model Context Protocol (MCP), these agents can interact with various tools without custom integration. The guide details the setup and execution of such agents using the huggingface_hub library, showcasing potential use cases and possible configurations. It emphasizes the role of the MCPClient in facilitating asynchronous connections to MCP servers, tool discovery, and execution.
212
1
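For the Tiny Agents in Python entry above, the tool discovery and execution steps can be sketched at the message level. MCP is built on JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for execution. This is a minimal sketch, not the MCPClient from huggingface_hub: the helper names and the `get_weather` tool are hypothetical, and transport (stdio or HTTP) is left out.

```python
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request as a plain dict."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def list_tools_request():
    """Discovery: ask the MCP server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name, arguments):
    """Execution: ask the MCP server to run one tool with JSON arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


def to_llm_tool_spec(mcp_tool):
    """Map an MCP tool description to the function-tool shape that
    chat-completion style LLM APIs commonly accept."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            "parameters": mcp_tool["inputSchema"],
        },
    }


# Hypothetical tool, as a server might describe it in a tools/list result.
weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
```

An agent loop would send `list_tools_request()` once, pass the converted specs to the LLM, and issue a `call_tool_request(...)` whenever the model asks for a tool.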
2
Article
Hugging Face·1y
Tiny Agents: a MCP-powered agent in 50 lines of code
Discover how to implement a small and powerful AI agent using Model Context Protocol (MCP) in just 50 lines of code. The post covers the integration of MCP with large language models (LLMs) to create agentic AI, featuring JavaScript and TypeScript components with Hugging Face's SDKs and tools. It also demonstrates the use of MCP servers and shows how tools can be utilized within an LLM inference client.
182
2
3
Article
Hacker News·1y
Run DeepSeek-R1 Dynamic 1.58-bit
The post explains how to install and run the DeepSeek-R1 model, highlighting the importance of adding BOS and EOS tokens in interactions. It provides detailed setup instructions using commands like `apt-get update` for dependencies, downloading the model via `huggingface_hub`, and outlines how to configure GPU offloading based on available memory. Additionally, there's guidance on quantizing the model's K cache to 4-bit and running the model using those configurations.
47
3
4
Article
Hugging Face·40w
MCP for Research: How to Connect AI to Research Tools
Model Context Protocol (MCP) enables AI systems to automate academic research discovery by connecting to tools that search across platforms like arXiv, GitHub, and Hugging Face. The approach progresses through three abstraction layers: manual research, scripted automation, and AI-orchestrated natural language workflows. MCP allows researchers to use natural language requests to gather comprehensive information about papers, implementations, and related resources, though it requires human oversight for quality control.
44
5
Article
Hugging Face·51w
ScreenSuite - The most comprehensive evaluation suite for GUI Agents!
ScreenSuite is a comprehensive evaluation framework for GUI agents that unifies 13 benchmarks across perception, grounding, single-step actions, and multi-step agent capabilities. The suite evaluates vision language models on their ability to interact with graphical interfaces using only visual input, without accessibility trees or DOM metadata. It includes Dockerized environments for Ubuntu and Android testing, supports both local and remote sandbox execution, and provides standardized evaluation of leading VLMs like Qwen-2.5-VL series, UI-TARS, and GPT-4o on GUI automation tasks.
41
6
Article
Hugging Face·43w
Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face
Hugging Face introduces Trackio, a lightweight open-source Python library for machine learning experiment tracking. It offers a wandb-compatible API, a local-first approach with optional Hugging Face Spaces hosting, easy sharing via URLs and iframes, and built-in GPU energy usage tracking. The library integrates seamlessly with Transformers and Accelerate, stores data in SQLite with Parquet backups, and provides free hosting on Hugging Face Spaces with both public and private options.
32
7
Article
Hugging Face·30w
huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning
The huggingface_hub Python library has reached v1.0 after five years of development, now powering 200,000 dependent libraries and providing access to over 2 million models, 500,000 datasets, and 1 million Spaces. Major changes include migration from requests to httpx for modern HTTP infrastructure, a redesigned CLI replacing huggingface-cli with expanded features, and full adoption of hf_xet for file transfers with chunk-level deduplication. The release removes legacy patterns like the Git-based Repository class while maintaining backward compatibility for most ML libraries, though transformers v5 will be required for full v1.x support.
31
8
Article
Daily Dose of Data Science | Avi Chawla | Substack·40w
Fine-tuning Gemma 3 270M Locally
Google's Gemma 3 270M model can be fine-tuned locally using just 0.5 GB of RAM. The tutorial demonstrates using Unsloth and Hugging Face Transformers to fine-tune the model for chess move prediction. The process involves loading the model, configuring LoRA for efficient training, preparing a chess dataset, and training with decreasing loss. After fine-tuning, the model successfully predicts missing chess moves instead of generating random moves.
28
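The LoRA step mentioned in the Gemma 3 entry above can be illustrated with a small sketch. LoRA freezes the original weight matrix W and trains two small matrices, B (d × r) and A (r × k), so the effective weight is W + (alpha / r) · B·A. The code below is a pure-Python illustration of that formula under toy dimensions; it is not the tutorial's code and does not use Unsloth or Transformers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank.

    W has shape (d, k), B has shape (d, r), A has shape (r, k).
    """
    r = len(A)              # rank = number of rows of A
    delta = matmul(B, A)    # low-rank update with the same shape as W
    scale = alpha / r
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


def full_params(d, k):
    """Parameters updated when fine-tuning W directly."""
    return d * k


def lora_params(d, k, r):
    """Parameters updated when training only A and B."""
    return r * (d + k)


# Toy example: a 2x2 frozen weight with a rank-1 adapter.
W = [[0.0, 0.0], [0.0, 0.0]]
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
merged = lora_effective_weight(W, A, B, alpha=2.0)
```

The memory savings come from the parameter counts: for a 1024 × 1024 layer, full fine-tuning updates 1,048,576 values, while a rank-8 adapter updates 16,384.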
9
Video
Sam Witteveen·1y
How to make Multi-Agent Apps with smolagents
Learn how to build multi-agent applications using Hugging Face's smolagents. This guide covers various tests and explanations, including integrations such as Ollama, Claude, Gemini, Gradio, and OpenAI. It discusses how to set up and use smolagents for tool calling and coding tasks, demonstrating the differences between small and proprietary models for these functions. Additionally, explore the use of the Gradio interface, creating custom tools, and running multi-agent systems effectively.
28
10
Article
Hugging Face·35w
Gaia2 and ARE: Empowering the community to study agents
Hugging Face introduces Gaia2, an advanced AI agent benchmark that goes beyond read-only tasks to evaluate interactive behaviors in real-world conditions. Unlike its predecessor GAIA, Gaia2 tests agents on complex scenarios including ambiguity handling, time-sensitive actions, and noise tolerance using a smartphone mock-up environment. The release includes the open-source Agent Research Environments (ARE) framework for running, debugging, and evaluating agents with structured trace recording. Current results show GPT-5 as the top performer, while temporal reasoning remains challenging for all models. The platform enables researchers to create custom scenarios and connect their own tools via MCP integration.
24
11
Article
Hugging Face·1y
Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs
AutoRound is Intel's advanced post-training quantization tool for large language and vision-language models, designed to reduce model size and inference latency while maintaining high accuracy. It utilizes signed gradient descent to optimize weight rounding and clipping ranges for low-bit quantization (e.g., INT2 - INT8) with minimal accuracy loss. The tool supports a variety of model architectures and devices, and offers fast quantization processes with just a small calibration dataset needed. AutoRound is compatible with popular export formats and provides flexibility in quantization configurations.
24
12
Article
Towards AI·1y
My 6 Secret Tips for Getting an ML Job in 2025
Landing a machine learning job in 2025 can be challenging but knowing certain 'secret' tips can help. The author shares key strategies, such as demonstrating skills through personal projects and identifying opportunities for improvement in existing code, to stand out to potential employers.
18
13
Article
Hugging Face·46w
Upskill your LLMs with Gradio MCP Servers
The Model Context Protocol (MCP) enables developers to extend Large Language Models with specialized tools and capabilities. Gradio apps on Hugging Face Spaces now support MCP, creating an "app store" of thousands of AI-powered tools that can be connected to LLMs. The post demonstrates how to integrate the Flux.1 Kontext image editing model as an MCP server with Cursor, allowing the LLM to edit images from text prompts. This approach transforms LLMs from simple question-answering systems into powerful assistants with diverse capabilities like image editing, web browsing, and data processing.
16
1
14
Article
Javarevisited·1y
7 Best Udemy Courses to Learn Generative AI with ChatGPT, LangChain and Huggingface in 2025
Generative AI is transforming industries by enabling the creation of text, images, music, code, and videos without requiring specialized expertise. Accessible platforms like Udemy offer courses to help individuals learn how to build and utilize these AI tools, such as ChatGPT, LangChain, and Hugging Face, through practical, project-based learning. These courses are updated to keep pace with AI's rapid evolution, helping learners acquire foundational and advanced skills in AI creativity, engineering, and application development.
15
15
Article
Daily Dose of Data Science | Avi Chawla | Substack·38w
Build a Reasoning LLM using GRPO
Group Relative Policy Optimization (GRPO) is a reinforcement learning method that fine-tunes large language models for math and reasoning tasks using deterministic reward functions, eliminating the need for labeled data. The process involves generating multiple candidate responses, assigning rewards based on deterministic functions, and using GRPO loss to update the model through backpropagation. A practical implementation demonstrates using Unsloth and Hugging Face TRL to transform a base model into a reasoning-capable system, with reward functions that validate response format and correctness without manual labeling.
13
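The reward and advantage steps described in the GRPO entry above can be sketched without any training library. Each candidate in a group is scored by deterministic functions, and its advantage is its reward normalized against the group's mean and standard deviation. The `<think>`/`<answer>` tag format and the function names are assumptions for illustration, not the tutorial's code; the population standard deviation is used here, and implementations differ on that detail.

```python
import re
from statistics import mean, pstdev


def format_reward(response):
    """1.0 if the response is <think>...</think> followed by <answer>...</answer>."""
    pattern = r"^<think>.*?</think>\s*<answer>.*?</answer>$"
    return 1.0 if re.match(pattern, response.strip(), flags=re.DOTALL) else 0.0


def correctness_reward(response, expected):
    """1.0 if the text inside <answer>...</answer> equals the expected answer."""
    m = re.search(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    return 1.0 if m and m.group(1).strip() == expected else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt ("What is 2 + 2?"), four sampled candidates.
candidates = [
    "<think>2 + 2 = 4</think><answer>4</answer>",  # right format, right answer
    "The answer is 4",                             # wrong format, no answer tag
    "<think>guessing</think><answer>5</answer>",   # right format, wrong answer
    "<answer>4</answer>",                          # wrong format, right answer
]
rewards = [format_reward(c) + correctness_reward(c, "4") for c in candidates]
advantages = group_relative_advantages(rewards)
```

Candidates that score above the group mean get positive advantages and are reinforced by the policy update; those below the mean are discouraged.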
16
Article
Hugging Face·1y
The Transformers Library: standardizing model definitions
The Transformers library aims to be the central hub for model architectures across various frameworks, supporting over 300 models with consistent updates. It integrates with major training frameworks and inference engines, offering significant interoperability and efficiency. Efforts are underway to simplify model definitions and contributions to reduce complexity for model creators, enhancing ecosystem standardization.
12
17
Video
Hacker News·1y
Deep Dive into LLMs like ChatGPT
Large language models (LLMs) such as ChatGPT are built through a complex pre-training process involving the downloading and processing of large quantities of diverse, high-quality internet texts. Common Crawl data, along with filtering steps like URL filtering, text extraction, and language filtering, are critical components. Tokenization converts these texts into a sequence of symbols for neural networks to process. These networks are trained to model the statistical relationships between tokens to predict the next token in a sequence. Inference is generating new data from the trained model by predicting subsequent tokens based on a given input.
12
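The pipeline summarized in the Deep Dive into LLMs entry above (tokenize text, model statistical relationships between tokens, then predict the next token at inference) can be shown in miniature. This sketch swaps in a whitespace tokenizer and a bigram count table for the subword tokenizer and neural network that real LLMs use; it only illustrates the train-then-predict shape of the process.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Whitespace tokenizer: a stand-in for subword tokenizers."""
    return text.lower().split()


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_new_tokens=5):
    """Greedy inference: repeatedly append the most likely next token."""
    out = [start]
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return out


corpus = "the cat sat on the mat the cat ran"
model = train_bigram(tokenize(corpus))
```

Real models replace the count table with a trained network and usually sample from the predicted distribution rather than always taking the top token.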
18
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
Step-by-step Guide to Fine-tune Qwen3
Alibaba has launched the next version of its large language model, Qwen 3. This tutorial guides readers through fine-tuning Qwen 3 with the Unsloth framework, covering LoRA configuration, dataset preparation in conversational format, and model training. The tutorial also includes running inference using the Hugging Face Transformers library.
10
