I-DLM (Introspective Diffusion Language Model) is a new approach to diffusion-based language models (DLMs) that closes the quality gap with autoregressive (AR) models. The core insight is that existing DLMs lack "introspective consistency": they generate tokens without verifying them, whereas AR models verify implicitly. I-DLM introduces Introspective Strided Decoding (ISD), which generates new tokens and verifies prior ones in a single forward pass using a p/q acceptance criterion. The 8B model is claimed to be the first DLM to match same-scale AR quality, outperforming LLaDA-2.1-mini (16B) by +26 on AIME-24 and +15 on LiveCodeBench-v6 while delivering 2.9–4.1x throughput at high concurrency. A lossless variant (R-ISD) uses gated LoRA to guarantee output that is bit-for-bit identical to the base AR model. The model integrates directly into SGLang with no custom infrastructure, and the code, models, and benchmarks are publicly available.
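
The summary does not spell out the p/q acceptance criterion, so the following is only a minimal sketch under a stated assumption: that it resembles the standard speculative-sampling test, where a proposed token is accepted with probability min(1, p/q). Here `p_x` stands for the probability the verifying pass assigns to the token and `q_x` for the probability under which it was proposed. The function names `accept_token` and `verify_prefix` are hypothetical and are not taken from the I-DLM code.

```python
import random


def accept_token(p_x: float, q_x: float, u: float) -> bool:
    """Accept a proposed token when u < min(1, p_x / q_x).

    Assumption: p_x is the verifier probability, q_x the proposal
    probability, and u a uniform draw from [0, 1).
    """
    if q_x <= 0.0:
        # A token with zero proposal probability cannot have been proposed.
        return False
    return u < min(1.0, p_x / q_x)


def verify_prefix(tokens, p_probs, q_probs, rng):
    """Return the accepted prefix, stopping at the first rejected token."""
    accepted = []
    for tok, p_x, q_x in zip(tokens, p_probs, q_probs):
        if not accept_token(p_x, q_x, rng.random()):
            break
        accepted.append(tok)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    print(verify_prefix(["a", "b", "c"], [0.5, 0.2, 0.9], [0.5, 0.4, 0.1], rng))
```

Under this assumed rule, a token the verifier rates at least as likely as the proposal did (p/q ≥ 1) is always kept, while a less likely one is kept only with probability p/q. The full speculative-sampling scheme also resamples a replacement token after a rejection; that step is omitted here.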

7m read time | From introspective-diffusion.github.io
