Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

IBM Granite has released two Apache 2.0 multilingual embedding models built on ModernBERT. The compact 97M-parameter model (granite-embedding-97m-multilingual-r2) scores 60.3 on MTEB Multilingual Retrieval, the best among open sub-100M models; the full-size 311M-parameter model (granite-embedding-311m-multilingual-r2) scores 65.2, ranking #2 among open models under 500M parameters. Both support 200+ languages, with enhanced quality for 52 of them, handle a 32K-token context (64x that of R1), and add code retrieval across 9 programming languages. The 311M model supports Matryoshka embeddings, which can be truncated from 768 down to 128 dimensions with minimal quality loss. Key changes from R1 include the move to the ModernBERT architecture, Gemma 3 and GPT-OSS tokenizers, knowledge distillation from decoder LLMs, and a novel vocabulary pruning methodology for the compact model. Both models are drop-in compatible with LangChain, LlamaIndex, Haystack, and Milvus, and ship with ONNX and OpenVINO weights for CPU inference.
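As a rough illustration of what Matryoshka truncation means in practice, the sketch below keeps the first 128 of 768 components and rescales the result to unit length before computing cosine similarity. It is a minimal sketch, not the model's own API: the helper names `truncate_embedding` and `cosine` are hypothetical, the input vectors are synthetic stand-ins rather than real model outputs, and renormalizing after truncation is the common convention for Matryoshka-style embeddings, assumed here rather than taken from the article.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dims: int) -> List[float]:
    """Keep the first `dims` components and rescale to unit L2 norm."""
    if not 0 < dims <= len(vec):
        raise ValueError(f"dims must be in 1..{len(vec)}, got {dims}")
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to rescale
    return [x / norm for x in head]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


# Synthetic 768-dim stand-ins for model outputs (NOT real embeddings).
full_query = [math.sin(i + 1) for i in range(768)]
full_doc = [math.sin(i + 1.1) for i in range(768)]

# Truncate both sides to the same size before comparing them.
small_query = truncate_embedding(full_query, 128)
small_doc = truncate_embedding(full_doc, 128)
score_128 = cosine(small_query, small_doc)
```

Queries and documents need to be truncated to the same dimension, so a vector index built at 128 dimensions should only be searched with 128-dimension query vectors.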

#rag

#embeddings

May 14 • 16m read time • From huggingface.co

Table of contents

Enterprise-Ready by Design
A Strong Sub-100M Multilingual Model
What Changed from R1
Training the Full-Size 311M Model
Building the compact 97M Multilingual model
Benchmark Results
Matryoshka Embeddings (311M)
Deployment Options
For Framework Integrators
Which Model Should You Use?
Try The Models
