Diffusion-based large language models (LLMs) are emerging as more efficient alternatives to autoregressive models for text generation. Renmin University's LLaDA uses dynamic masking to predict multiple tokens simultaneously in a bidirectional manner, offering better performance on complex reasoning tasks than current autoregressive approaches.
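The iterative masked-diffusion decoding that LLaDA-style models use can be sketched roughly as follows. Everything here is a hypothetical stand-in (the toy predictor, vocabulary, and unmasking schedule are not LLaDA's actual implementation); the point is the loop structure: start fully masked, predict all masked positions in parallel with bidirectional context, commit the most confident guesses, and repeat.

```python
import random

MASK = "<mask>"

def toy_predict(tokens):
    # Hypothetical stand-in for the bidirectional model: returns a
    # (token, confidence) guess for every masked position, conditioning
    # on the whole sequence (left and right context at once).
    vocab = ["the", "cat", "sat", "on", "mat"]
    return {i: (vocab[i % len(vocab)], random.random())
            for i, t in enumerate(tokens) if t == MASK}

def diffusion_decode(length, steps):
    # Start from an all-masked sequence; at each step, commit the
    # highest-confidence predictions and re-predict the rest.
    tokens = [MASK] * length
    for step in range(steps):
        preds = toy_predict(tokens)
        if not preds:
            break
        # Unmask a fraction of the remaining masked positions per step,
        # so the sequence is fully revealed by the final step.
        k = max(1, len(preds) // (steps - step))
        best = sorted(preds.items(), key=lambda kv: -kv[1][1])[:k]
        for pos, (tok, _) in best:
            tokens[pos] = tok
    return tokens

print(diffusion_decode(5, 3))
```

Because several tokens are committed per step, the number of model calls scales with the step count rather than the sequence length, which is the efficiency argument relative to token-by-token autoregressive decoding.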
5 min read · From thenewstack.io