IBM Granite has released two new Apache 2.0 multilingual embedding models built on ModernBERT. The compact 97M-parameter model (granite-embedding-97m-multilingual-r2) scores 60.3 on MTEB Multilingual Retrieval, the best among open sub-100M models, and the full-size 311M model (granite-embedding-311m-multilingual-r2) scores 65.2, ranking #2 among open models under 500M parameters. Both support 200+ languages with enhanced quality for 52 of them, handle a 32K-token context (64x longer than R1), and add code retrieval across 9 programming languages. The 311M model supports Matryoshka embeddings (768 down to 128 dimensions with minimal quality loss). Key improvements over R1 include a new ModernBERT architecture, Gemma 3 and GPT-OSS tokenizers, knowledge distillation from decoder LLMs, and a novel vocabulary pruning methodology for the compact model. Both are drop-in compatible with LangChain, LlamaIndex, Haystack, and Milvus, and ship with ONNX and OpenVINO weights for CPU inference.
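To make the Matryoshka claim concrete, here is a minimal sketch of how a truncated embedding is typically used: keep the leading dimensions, re-normalize, and compare with cosine similarity. It assumes the 311M model follows the usual Matryoshka convention of concentrating information in the first dimensions. The vectors are synthetic stand-ins rather than real model outputs, and the helper names (`l2_normalize`, `truncate_embedding`, `cosine`) are hypothetical, not part of any Granite API.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def truncate_embedding(vec, dims):
    """Keep the first `dims` components, then re-normalize to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    return l2_normalize(vec[:dims])


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Synthetic 768-dim vectors standing in for a query and a related document.
# Real embeddings would come from the model; these only exercise the math.
rng = random.Random(0)
query = [rng.gauss(0.0, 1.0) for _ in range(768)]
doc = [q + 0.5 * rng.gauss(0.0, 1.0) for q in query]

full_score = cosine(l2_normalize(query), l2_normalize(doc))
small_query = truncate_embedding(query, 128)
small_doc = truncate_embedding(doc, 128)
small_score = cosine(small_query, small_doc)

print(f"768-dim similarity: {full_score:.3f}")
print(f"128-dim similarity: {small_score:.3f}")
```

Truncating to 128 dimensions cuts index storage and dot-product cost by 6x relative to 768. How much retrieval quality is retained depends on the model, so the scores printed here say nothing about Granite's actual behavior.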
Table of contents
Enterprise-Ready by Design
A Strong Sub-100M Multilingual Model
What Changed from R1
Training the Full-Size 311M Model
Building the compact 97M Multilingual model
Benchmark Results
Matryoshka Embeddings (311M)
Deployment Options
For Framework Integrators
Which Model Should You Use?
Try The Models