GLM-OCR: Accurate × Fast × Comprehensive.

Hacker News

GLM-OCR is an open-source multimodal OCR model that achieves state-of-the-art performance on document understanding benchmarks with only 0.9B parameters. Built on the GLM-V architecture and trained with a Multi-Token Prediction loss, it excels at complex layouts, including tables, formulas, and code. The project provides an SDK with three deployment options: a cloud API via Zhipu MaaS, self-hosting with vLLM or SGLang, or local deployment with Ollama or MLX. Features include layout detection via PP-DocLayout-V3, parallel region recognition, and output in both JSON and Markdown formats.
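The flow described above (layout detection, then parallel recognition of each region, then JSON and Markdown output) can be sketched roughly as follows. This is an illustration of the general pattern only, not the GLM-OCR SDK: `Region`, `detect_layout`, `recognize_region`, and `process_page` are hypothetical names, and both model calls are replaced by stubs that return canned data.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import asdict, dataclass


@dataclass
class Region:
    """One layout region. Field names are illustrative, not the SDK's schema."""
    index: int         # reading-order position on the page
    kind: str          # e.g. "text", "table", "formula"
    bbox: tuple        # (x0, y0, x1, y1) in pixels
    content: str = ""  # filled in by recognition


def detect_layout(page_id: str) -> list[Region]:
    """Hypothetical stub standing in for a layout detector such as PP-DocLayout-V3."""
    return [
        Region(0, "text", (40, 40, 560, 90)),
        Region(1, "table", (40, 110, 560, 300)),
        Region(2, "formula", (40, 320, 560, 380)),
    ]


def recognize_region(region: Region) -> Region:
    """Hypothetical stub standing in for one OCR model call on a cropped region."""
    canned = {
        "text": "Quarterly summary",
        "table": "| a | b |\n|---|---|\n| 1 | 2 |",
        "formula": "$$E = mc^2$$",
    }
    return Region(region.index, region.kind, region.bbox, canned[region.kind])


def process_page(page_id: str, max_workers: int = 4) -> tuple[str, str]:
    """Detect regions, recognize them concurrently, and emit JSON plus Markdown."""
    regions = detect_layout(page_id)
    # Regions are independent of each other, so recognition can run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        recognized = list(pool.map(recognize_region, regions))
    # pool.map already preserves input order; the sort makes reading order explicit.
    recognized.sort(key=lambda r: r.index)
    as_json = json.dumps([asdict(r) for r in recognized], ensure_ascii=False)
    as_markdown = "\n\n".join(r.content for r in recognized)
    return as_json, as_markdown


if __name__ == "__main__":
    page_json, page_markdown = process_page("page-1")
    print(page_markdown)
```

A thread pool fits this sketch because, in a real deployment, each region call would mostly wait on a model server rather than use local CPU. The actual SDK may batch or schedule regions differently.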

zai-org/GLM-OCR: GLM-OCR: Accurate × Fast × Comprehensive