Ovi is an open-source audio-video generation model that generates synchronized 5-second video and audio together from text or text+image inputs. The 11B-parameter model supports flexible resolutions (720×720 to 960×960) and multiple aspect ratios, and includes a custom-trained 5B audio branch. It offers inference options for
Table of contents
Video Demo
🌟 Key Features
📋 Todo List
🎨 An Easy Way to Create
📦 Installation
Download Weights
🚀 Run Examples
🙏 Acknowledgements
🤝 Collaboration
⭐ Citation