The blog post discusses the connection between diffusion models and autoregressive models, highlighting that diffusion models can be seen as performing approximate autoregression in the frequency domain. It demonstrates this connection through signal processing and spectral analysis, using Python code to reproduce plots and analyses. While diffusion models show coarse-to-fine behaviour in image generation, this does not translate to audio waveforms. The post also touches on the future of generative models, suggesting a potential shift towards more unified approaches across modalities.
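The spectral claim in the summary can be illustrated with a small NumPy experiment. This is a minimal sketch, not the post's own code: it assumes a synthetic random image with a power-law spectrum as a stand-in for natural images, and the exponent of 3.0, the noise levels, and the helper names `radial_power_spectrum` and `synthetic_image` are all illustrative choices. Gaussian noise has a flat spectrum, so as the noise level grows it should drown out the high frequencies first, leaving a shrinking band of low frequencies that still carry signal.

```python
import numpy as np

def radial_power_spectrum(image):
    """Radially averaged power spectrum of a square 2D array."""
    n = image.shape[0]
    # Normalise so that white noise of variance s**2 has flat power s**2.
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2 / image.size
    # Integer distance of every frequency bin from the zero-frequency centre.
    y, x = np.indices(image.shape)
    radius = np.hypot(x - n // 2, y - n // 2).astype(int)
    # Mean power within each ring of constant integer radius.
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    return (sums / counts)[1 : n // 2]  # drop DC, stop below Nyquist

def synthetic_image(n, rng, exponent=3.0):
    """Random image with power falling off as 1 / f**exponent (an assumption)."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0  # placeholder to avoid dividing by zero at DC
    amplitude = f ** (-exponent / 2)
    amplitude[0, 0] = 0.0  # remove the mean
    white = np.fft.fft2(rng.standard_normal((n, n)))
    image = np.real(np.fft.ifft2(white * amplitude))
    return image / image.std()  # unit variance, like normalised data

rng = np.random.default_rng(0)
signal_spectrum = radial_power_spectrum(synthetic_image(128, rng))

# For each noise level, count the frequency rings where the signal power
# still exceeds the flat noise floor sigma**2.
sigmas = [0.1, 0.3, 1.0, 3.0]
cutoffs = [int(np.sum(signal_spectrum > sigma**2)) for sigma in sigmas]

for sigma, cutoff in zip(sigmas, cutoffs):
    print(f"sigma={sigma:4.1f}: {cutoff} of {signal_spectrum.size} rings above the noise floor")
```

Reading the printed counts from the highest noise level to the lowest mirrors the reverse diffusion process: each denoising step uncovers a slightly wider band of frequencies, which is the sense in which the process resembles autoregression over the spectrum. Whether a real dataset behaves this way depends on how steeply its spectrum decays.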
Table of contents
Two forms of iterative refinement
A spectral view of diffusion
What about sound?
Unstable equilibrium
Closing thoughts
Acknowledgements
References