Google's Gemini Omni Flash hits the API, turning enterprise video production into a conversation
Google's Gemini Omni Flash, part of the new 'Omni' family, brings conversational editing to enterprise video production through an API. The model takes text, image, and video inputs and delivers a finished clip with synced audio, reducing the need for multiple tools and vendors. It supports multimodal references, a physics engine for brand assets, and text and logo insertion. The model runs on Google's Interactions API, which supports multi-turn tasks so that successive edits stay coherent. At $0.10 per second of 720p video, Omni Flash is priced competitively, though output is limited to 720p. The model also prioritizes provenance and includes safeguards against deepfakes.
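To make the pricing concrete, here is a minimal sketch of a cost estimate based on the quoted rate of $0.10 per second of 720p video. The helper name `estimate_clip_cost` and the `num_generations` parameter are hypothetical, and the sketch assumes every generated or re-edited clip is billed at the same flat per-second rate, which the article does not confirm.

```python
from decimal import Decimal

# Rate quoted in the article: $0.10 per second of 720p output.
PRICE_PER_SECOND_USD = Decimal("0.10")


def estimate_clip_cost(duration_seconds: int, num_generations: int = 1) -> Decimal:
    """Estimate the cost in USD of generating a clip one or more times.

    Hypothetical helper. Assumes a flat per-second rate and that each
    generation (for example, each conversational edit that re-renders
    the clip) is billed in full. Neither assumption comes from Google.
    """
    if duration_seconds < 0 or num_generations < 0:
        raise ValueError("duration_seconds and num_generations must be non-negative")
    return PRICE_PER_SECOND_USD * duration_seconds * num_generations


if __name__ == "__main__":
    # A single 30-second clip.
    print(estimate_clip_cost(30))      # 3.00
    # A 15-second clip re-rendered across four editing turns.
    print(estimate_clip_cost(15, 4))   # 6.00
```

Under these assumptions, a 30-second clip costs about $3.00, and four renders of a 15-second clip cost about $6.00. `Decimal` is used instead of floats so that currency amounts are not affected by binary rounding.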