Pyramid Attention Broadcast (PAB) is a method for real-time, high-quality video generation with diffusion transformer models. It exploits redundancy in attention computation: during the diffusion steps, the difference between attention outputs at neighboring steps stays stable, so an output computed at one step can be broadcast to (reused by) subsequent steps instead of being recomputed. Each type of attention is given its own appropriate broadcast range. PAB achieves up to 10.5x faster generation for videos up to 720p resolution without compromising output quality. The approach also allows efficient, scalable distributed inference, making high-resolution video generation practical for real-world applications.
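The broadcast idea can be illustrated with a small caching sketch. This is not the authors' implementation: the class `BroadcastCache`, the attention-type names (`spatial`, `temporal`, `cross`), the range values, and the window of steps in which reuse is allowed are all hypothetical, chosen only to show how per-type broadcast ranges reduce the number of attention computations. Steps are counted upward from 0 for simplicity.

```python
from typing import Any, Callable, Dict, Tuple


class BroadcastCache:
    """Reuse ("broadcast") a cached attention output across nearby diffusion steps.

    Hypothetical helper for illustration; not part of any PAB release.
    """

    def __init__(self, ranges: Dict[str, int], stable_window: Tuple[int, int]):
        self.ranges = ranges                # attention type -> broadcast range, in steps
        self.stable_window = stable_window  # inclusive step range where reuse is allowed
        self._cache: Dict[str, Tuple[int, Any]] = {}  # key -> (step computed, output)

    def attend(self, kind: str, layer: int, step: int,
               compute: Callable[[], Any]) -> Any:
        key = f"{kind}:{layer}"
        lo, hi = self.stable_window
        cached = self._cache.get(key)
        # Reuse only inside the assumed stable window, and only while the
        # cached output is younger than this attention type's broadcast range.
        if (cached is not None and lo <= step <= hi
                and step - cached[0] < self.ranges[kind]):
            return cached[1]
        out = compute()  # the expensive attention call in a real model
        self._cache[key] = (step, out)
        return out


def simulate(num_steps: int = 30) -> Dict[str, int]:
    """Count how often each attention type is actually computed."""
    # Assumed values: a different broadcast range for each attention type.
    cache = BroadcastCache(
        ranges={"spatial": 2, "temporal": 4, "cross": 6},
        stable_window=(5, 24),
    )
    calls = {"spatial": 0, "temporal": 0, "cross": 0}
    for step in range(num_steps):
        for kind in calls:
            def compute(kind: str = kind) -> Tuple[str, int]:
                calls[kind] += 1
                return (kind, step)  # stand-in for an attention output tensor
            cache.attend(kind, 0, step, compute)
    return calls


if __name__ == "__main__":
    print(simulate())
```

With these assumed settings, each attention type is computed fewer than the 30 times a cache-free loop would need, and a larger broadcast range yields fewer computations. In a real model the `compute` callback would be the attention forward pass and the cached value would be its output tensor.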