Google's Gemini Omni can generate 'anything from any input,' starting with video

3 hours ago

Google didn't forget AI creators in its latest round of Gemini announcements at Google I/O. The company just officially revealed Gemini Omni, a new model that it says can "create anything from any input, starting with video." The first model, called Gemini Omni Flash, is rolling out today to the Gemini app, Google Flow and YouTube Shorts.

Google called Gemini Omni "the next step" up from Nano Banana and, presumably, its current video generator, Veo 3.1. It lets you "combine images, audio, video and text as input and generate high-quality videos grounded in Gemini's real-world knowledge," according to the tech giant. You can then edit those videos through natural conversation, with each instruction building on the last to keep characters and other elements consistent.

Where Veo 3.1 was limited to creating videos from text prompts and images, Gemini Omni will accept a wider range of inputs and do a lot more. For instance, you can shoot a video, then just ask Omni to change what's happening. "Your video becomes a starting point for something you never could have filmed yourself," Google explained. "Edit the action, add in new characters or objects, or transform a moment into something unexpected. Change the environment, angle, style or even specific details."

Omni also better understands physical phenomena like gravity, kinetic energy and fluid dynamics, so scenes should look more realistic. It marries that with "Gemini's knowledge of history, science and cultural context, bridging the gap from photorealism to meaningful storytelling." The model can supposedly turn short prompts into compelling explainers, generating visuals that break down more complex ideas. However, it will only support voice references for audio output to start.

If you want to generate videos where you're the star, Omni lets you use your own voice to create a digital avatar that looks and sounds like you. If that sounds like a potential privacy nightmare, Google says it has "clear policies to protect users from harm and governing the use of our AI tools." As for editing videos to change audio and speech, the company is still testing that function in order to bring it to users "responsibly." All videos will also carry Google's imperceptible SynthID digital watermark, which can be used to verify that they were generated with Gemini Omni.

All of that sounds great, but the main problem with Veo 3.1 and other video generators is that their output has an "uncanny valley" look and is often hated by viewers. With that in mind, it'll be interesting to see if the output quality matches Google's breathless claims. We'll find out soon, as Gemini Omni Flash is now available to all Google AI Plus, Pro and Ultra subscribers globally and is rolling out to users of YouTube Shorts and the YouTube Create app starting this week.
