Google is launching Gemini Omni, a new multimodal AI model focused on video generation and editing. The first release, Gemini Omni Flash, accepts any combination of images, audio, video, and text as input to generate and edit high-quality videos using natural language instructions. Key capabilities include conversational multi-turn video editing, physics-aware scene generation, knowledge-grounded storytelling, and a personal avatar feature. It is rolling out to Google AI Plus/Pro/Ultra subscribers via the Gemini app and Google Flow, and is free for YouTube Shorts users. All generated videos include SynthID watermarking for AI content transparency. API access for developers and enterprise customers is coming in the following weeks.

5m read time · From blog.google