Best of Hugging Face 2024

  1. Llama can now see and run on your device - welcome Llama 3.2

    Meta's Llama 3.2, available on Hugging Face, includes both multimodal vision models and text-only models. The vision models come in 11B and 90B sizes and feature strong visual-reasoning capabilities; the text-only models come in 1B and 3B sizes and are optimized for on-device use. Llama 3.2 also introduces a new version of Llama Guard for input classification, including harmful-prompt detection. The models integrate with Hugging Face Transformers and major cloud services, and fine-tuning is possible on a single GPU.
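
    A minimal sketch of how a multimodal chat request can be structured for Transformers (the `build_vision_chat` helper and the example URL are illustrative; the gated model id in the comment is Meta's `meta-llama/Llama-3.2-11B-Vision-Instruct`):

```python
# Sketch: building a multimodal chat message in the list-of-content-parts
# format that recent Transformers vision pipelines accept.
def build_vision_chat(image_url: str, question: str) -> list:
    """Return a single-turn chat with one image part and one text part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},  # image input
                {"type": "text", "text": question},   # text prompt
            ],
        }
    ]

messages = build_vision_chat("https://example.com/photo.jpg", "What is in this photo?")

# With access to the gated weights, the messages could be passed to a pipeline:
#   from transformers import pipeline
#   pipe = pipeline("image-text-to-text",
#                   model="meta-llama/Llama-3.2-11B-Vision-Instruct")
#   print(pipe(text=messages, max_new_tokens=50))
```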

  2. Visualize and understand GPU memory in PyTorch

    This tutorial explains how to visualize and understand GPU memory usage in PyTorch during model training. It provides step-by-step instructions on generating and interpreting memory profiles using PyTorch's built-in tools. The tutorial also covers how to estimate and optimize memory requirements for training large models, offering practical tips to manage GPU memory efficiently.
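
    PyTorch's built-in memory recorder can be driven from a few calls; a hedged sketch (the toy model and file name are illustrative; `torch.cuda.memory._record_memory_history` and `_dump_snapshot` are the private-but-documented profiling hooks the tutorial relies on):

```python
import torch

snapshot_file = None
if torch.cuda.is_available():
    # Start recording allocator events, including Python stack traces.
    torch.cuda.memory._record_memory_history(max_entries=100_000)

    # Toy training loop whose allocations we want to profile.
    model = torch.nn.Linear(1024, 1024).cuda()
    optimizer = torch.optim.Adam(model.parameters())
    for _ in range(3):
        loss = model(torch.randn(64, 1024, device="cuda")).mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    # Dump a snapshot and stop recording; the .pickle file can be
    # inspected interactively at https://pytorch.org/memory_viz
    snapshot_file = "memory_snapshot.pickle"
    torch.cuda.memory._dump_snapshot(snapshot_file)
    torch.cuda.memory._record_memory_history(enabled=None)
```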

  3. Hugging Face + PyCharm

    The post introduces an integration between Hugging Face and PyCharm that lets developers pull state-of-the-art machine learning models straight into their applications. Using features like 'Insert HF Model', one can add models for tasks such as image-text-to-text chat directly in PyCharm. The post also highlights the model card view, the local model cache, and the advantages of open-source AI models in the development workflow.

  4. Fine-tuning LLMs to 1.58bit: extreme quantization made easy

    As large language models (LLMs) grow, reducing their computational and energy costs via quantization becomes crucial. BitNet, a transformer architecture from Microsoft Research, drastically cuts computational cost by representing each parameter with a ternary value (-1, 0, or 1), i.e. 1.58 bits per parameter. The post details how existing models such as Llama 3 can be fine-tuned into this format while maintaining accuracy, and covers the implementation, optimization, and benchmarking of custom inference kernels, making LLMs more scalable and practical.
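
    The ternary scheme is easy to sketch: with three possible values per weight, the information content is log2(3) ≈ 1.58 bits. A dependency-free illustration of the absmean-style quantization that BitNet-like methods use (the helper name and sample weights are made up for this example):

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a list of floats to {-1, 0, 1} using a per-tensor
    absmean scale, in the spirit of BitNet b1.58."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

q, scale = absmean_ternary([0.9, -0.4, 0.05, -1.2])
# The dequantized approximation of each weight is q_i * scale.
approx = [v * scale for v in q]
```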

  5. Deploy Meta Llama 3.1 405B on Google Cloud Vertex AI

    Learn how to deploy Meta Llama 3.1 405B on Google Cloud's Vertex AI using Hugging Face Deep Learning Containers. The post covers setup requirements, Google Cloud configuration, model registration, deployment processes, online prediction, and resource cleanup to avoid unnecessary costs.

  6. Exploring the Daily Papers Page on Hugging Face

    Hugging Face's Daily Papers page helps AI developers and researchers stay updated with top research. Users can claim their papers, submit new ones, chat with authors, and access related resources all in one place. It also offers support through upvotes, related paper recommendations, and multilingual comments, enhancing global collaboration. Subscribers receive daily updates on the latest papers.

  7. Introducing the Synthetic Data Generator - build Datasets with Natural Language

    The Synthetic Data Generator is an intuitive, no-code tool that allows users to create custom datasets using Large Language Models (LLMs). It simplifies the dataset creation process into three easy steps: describing the dataset, configuring and refining it, and generating the final dataset. This tool supports text classification and chat datasets and leverages the free Hugging Face API for its operations. Users can also train models without coding using AutoTrain. Advanced features include enhancing speed and accuracy, local deployment, and customizing synthetic data pipelines using open-source frameworks.

  8. Scaling AI-Based Data Processing with Hugging Face + Dask

    Hugging Face offers many datasets and pre-trained models, but scaling AI tasks can be challenging due to large dataset sizes and computational demands. This guide demonstrates how to use Dask, a Python library for distributed computing, to handle large datasets efficiently and scale model-inference tasks. The example starts by processing 100 rows locally with pandas, then scales to 211 million rows using Dask across multiple GPUs in the cloud, with additional tips on setting up the environment and deploying with Coiled.

  9. Introducing the SQL Console on Datasets

    Hugging Face has introduced an SQL Console for querying datasets directly on the Hub, powered by DuckDB WASM. The browser-based tool runs queries locally without any dependencies, supports the full DuckDB SQL syntax, and can export results to Parquet files. Designed to handle even large datasets, it simplifies tasks such as converting data formats and running complex queries efficiently.

  10. Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

    The WebSight dataset supports building AI systems that can transform web screenshots into HTML code. It provides a large synthetic dataset of screenshot/HTML-code pairs, and has been updated to WebSight-v0.2 with significant improvements.

  11. Training and Finetuning Embedding Models with Sentence Transformers v3

    The post provides an update on the Sentence Transformers library and explains how to use it to train and finetune embedding models. It discusses the importance of finetuning and the components involved in training, such as datasets, loss functions, training arguments, evaluators, and the trainer itself.
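
    One of the loss families discussed, such as Sentence Transformers' MultipleNegativesRankingLoss, treats every other in-batch example as a negative for each anchor; a minimal NumPy sketch of that idea (the function name and scale value are illustrative, not the library's API):

```python
import numpy as np

def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Cross-entropy over cosine similarities where each anchor's true
    'class' is its own positive and all other positives act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = scale * (a @ p.T)                      # (batch, batch) similarity logits
    log_z = np.log(np.exp(sims).sum(axis=1))      # log partition per anchor
    return float(np.mean(log_z - np.diag(sims)))  # mean cross-entropy

# Perfectly aligned, mutually orthogonal pairs give a near-zero loss.
anchors = np.eye(3)
loss = in_batch_contrastive_loss(anchors, anchors)
```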