We are excited to announce the pre-release of VeRL-Omni, a general reinforcement learning (RL) post-training framework focused on multimodal generative models.


VeRL-Omni is a pre-release RL post-training framework for multimodal generative models, built on top of verl and vLLM-Omni. It extends RL training beyond LLMs to diffusion and omni-modality models (image, video, audio) and supports architectures such as DiT and mixed AR-DiT. Key features include efficient multimodal rollout via vLLM-Omni, a flexible reward engine that supports VLM-as-judge, modular training backends (DiffusersFSDP, Megatron, VeOmni), and support for both NVIDIA GPUs and Ascend NPUs. In a demo, Qwen-Image is trained with FlowGRPO on an OCR reward task, where asynchronous reward evaluation reduces wall-clock time by about 14%. The roadmap includes fully asynchronous RL pipelines, broader model and algorithm support, and deeper co-optimization with vLLM-Omni.
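To make the idea of asynchronous reward evaluation concrete, here is a minimal toy sketch in plain Python `asyncio`. It does not use VeRL-Omni's API, and the source does not say how VeRL-Omni actually schedules this work. The functions `generate_microbatch` and `score_microbatch` are hypothetical stand-ins that only simulate latency with `sleep`. The sketch assumes a rollout step is split into micro-batches, so scoring of one micro-batch can overlap with generation of the next.

```python
import asyncio
import time


async def generate_microbatch(index: int) -> list[str]:
    """Hypothetical stand-in for rollout generation; only simulates latency."""
    await asyncio.sleep(0.05)
    return [f"mb{index}-img{i}" for i in range(4)]


async def score_microbatch(samples: list[str]) -> list[float]:
    """Hypothetical stand-in for a reward model or judge; only simulates latency."""
    await asyncio.sleep(0.05)
    return [float(len(s)) for s in samples]


async def run_sequential(num_microbatches: int) -> list[list[float]]:
    """Generate, then score, one micro-batch at a time (no overlap)."""
    rewards = []
    for index in range(num_microbatches):
        samples = await generate_microbatch(index)
        rewards.append(await score_microbatch(samples))
    return rewards


async def run_overlapped(num_microbatches: int) -> list[list[float]]:
    """Start scoring in the background and move on to the next generation."""
    pending = []
    for index in range(num_microbatches):
        samples = await generate_microbatch(index)
        # Scoring runs as a background task while the next micro-batch generates.
        pending.append(asyncio.create_task(score_microbatch(samples)))
    # gather() returns results in the order the tasks were passed in.
    return list(await asyncio.gather(*pending))


if __name__ == "__main__":
    for runner in (run_sequential, run_overlapped):
        start = time.perf_counter()
        asyncio.run(runner(4))
        print(f"{runner.__name__}: {time.perf_counter() - start:.2f}s")
```

Both runners return the same rewards in the same order; only the scheduling differs. The speedup in this toy comes entirely from the simulated latencies and says nothing about the roughly 14% figure reported for the demo.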

Announcing VeRL-Omni: Easy, Fast, and Stable RL Training for Diffusion and Omni-Modality Models