Hugging Face's platform is a resource for developers and researchers working in natural language processing (NLP) and machine learning. Through articles, tutorials, and open-source projects, it covers state-of-the-art NLP models, tools, and datasets, including transformer architectures and transfer learning methods. Developers can learn how to use pre-trained models, fine-tune them, and deploy NLP applications with Hugging Face's libraries and APIs.

Hugging Face

A detailed walkthrough of converting dots.ocr, a 3B-parameter OCR model from RedNote, to run on Apple devices using Core ML and MLX. The guide covers the conversion from PyTorch to Core ML: simplifying the model architecture, debugging common conversion errors, and initial benchmarking. Key challenges include handling attention implementations, fixing dtype mismatches, removing dynamic control flow, and dealing with variable-length sequence masking. The converted model initially runs on the GPU in FLOAT32 precision; later parts of the series are set to cover Neural Engine optimization and quantization.
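To make the variable-length masking and dynamic control flow challenges more concrete, here is a minimal sketch of one common pattern. It is not the code from the guide. It assumes the packed sequences are described by cumulative boundaries (called `cu_seqlens` here, an illustrative name), and it uses NumPy rather than PyTorch so that it stays self-contained. The hypothetical helper `block_diagonal_mask` builds an additive attention mask without a Python loop over sequences, which is the kind of rewrite that tends to make a model easier to trace.

```python
import numpy as np


def block_diagonal_mask(cu_seqlens, dtype=np.float32):
    """Build an additive attention mask for sequences packed back to back.

    cu_seqlens holds cumulative boundaries, e.g. [0, 2, 5] means two
    sequences occupying tokens 0..1 and 2..4. (Illustrative convention.)
    """
    cu = np.asarray(cu_seqlens)
    total = int(cu[-1])
    positions = np.arange(total)
    # Segment id per token: index of the sequence that contains it.
    segment = np.searchsorted(cu, positions, side="right") - 1
    # Tokens may attend to each other only within the same sequence.
    same_sequence = segment[:, None] == segment[None, :]
    # 0 where attention is allowed, most negative value where it is blocked.
    return np.where(same_sequence, 0, np.finfo(dtype).min).astype(dtype)


mask = block_diagonal_mask([0, 2, 5])
print(mask.shape)
```

The comparison of segment ids replaces a per-sequence loop that would write zeros into slices of the mask. When only a single sequence is processed, every entry is zero, so under that assumption the mask could be dropped altogether.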

SOTA OCR with Core ML and dots.ocr

Step 0: Understand and simplify the model