VideoPoet is a large language model (LLM) that integrates multiple video generation capabilities in a single model: text-to-video, image-to-video, video stylization, video inpainting and outpainting, and video-to-audio. It can generate long videos, edit existing video clips, apply motion to images, and control camera movements. In evaluations, it was rated favorably for text fidelity and motion interestingness.

7m read time, from blog.research.google
Table of contents
Overview
Language models as video generators
Examples generated by VideoPoet
Evaluation results
Conclusion
Acknowledgements
