NVIDIA introduces Nemotron 3 Nano 4B, a 4-billion-parameter hybrid Mamba-Transformer language model optimized for edge deployment on NVIDIA Jetson, GeForce RTX, and DGX Spark platforms. The model was compressed from the 9B Nemotron Nano v2 using the Nemotron Elastic structured pruning and distillation framework.

8 min read · From huggingface.co
Table of contents

- Training Recipe for Nemotron 3 Nano 4B
- Boosting Efficiency with Quantization
- Try It Now!
