Meta's research paper introduces Large Concept Models (LCMs), an AI architecture that operates on sentence-level concept embeddings rather than subword tokens. Unlike traditional LLMs, LCMs use a sentence encoder called SONAR to map each sentence to a language-agnostic concept embedding, which enables hierarchical reasoning across 200+ languages and multiple modalities. The paper explores three architectural variants: Base-LCM (next-concept prediction trained with an MSE loss), One-Tower diffusion LCM, and Two-Tower diffusion LCM. The diffusion-based variants significantly outperform the base version on summarization quality (ROUGE-L) and coherence metrics, though they still slightly trail a small LLaMA baseline. The approach draws conceptual parallels to Meta's earlier JEPA work and Yann LeCun's vision for human-like AI.
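To make the Base-LCM objective concrete, here is a minimal sketch of next-concept prediction scored with MSE. It rests on several assumptions: the two-dimensional vectors are toy stand-ins for SONAR sentence embeddings, `copy_last` is a hypothetical placeholder for the trained model, and the function names are illustrative rather than taken from the paper or its code.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mse(pred: Vector, target: Vector) -> float:
    """Mean squared error between two embeddings of equal dimension."""
    if len(pred) != len(target):
        raise ValueError("embeddings must have the same dimension")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def next_concept_loss(
    concepts: Sequence[Vector],
    predict: Callable[[Sequence[Vector]], Vector],
) -> float:
    """Average MSE of predicting each sentence embedding from its prefix.

    `concepts` is one document as a sequence of sentence embeddings.
    `predict` maps the preceding embeddings to a guess for the next one.
    """
    if len(concepts) < 2:
        raise ValueError("need at least two sentences to form a prediction")
    losses = [
        mse(predict(concepts[:t]), concepts[t])
        for t in range(1, len(concepts))
    ]
    return sum(losses) / len(losses)


def copy_last(prefix: Sequence[Vector]) -> Vector:
    """Hypothetical baseline predictor: repeat the most recent embedding."""
    return list(prefix[-1])


# Toy "document" of three sentences, each a 2-dimensional embedding.
doc = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
loss = next_concept_loss(doc, copy_last)
print(loss)  # 1.25
```

In an actual model the predictor would be a trained network and the loss would be minimized by gradient descent; the point here is only that the prediction target is a whole sentence embedding rather than a token from a fixed vocabulary.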