Google says its new image AI can actually spell

3 months ago

Jay Peters is a news editor covering technology, gaming, and more. He joined The Verge in 2019 after nearly two years at Techmeme.

Google is launching a new version of its image generation model, called Imagen 4, and the company says that it offers “stunning quality” and “superior typography.”

“Our latest Imagen model combines speed with precision to create stunning images,” Eli Collins, VP of product at Google DeepMind, says in a blog post. “Imagen 4 has remarkable clarity in fine details like intricate fabrics, water droplets, and animal fur, and excels in both photorealistic and abstract styles.” Sample images from Google do show some impressive, realistic detail, like one showing a whale jumping out of the water and another of a chameleon.

An image of whales in the water created by Imagen 4. Image: Google

The AI model is also “significantly better at spelling and typography,” which Collins says makes it easier to create greeting cards, posters, and comics. (When OpenAI recently added image generation to ChatGPT, the company also touted its text rendering improvements, but that model is still susceptible to typos.)

In some images provided by Google, the text does look good — it’s perfectly legible in a short comic, for example, and even a tiny font in a mock stamp is readable. But we’ll have to see how the model’s text rendering capabilities hold up in the hands of regular users.

An image of a bag of flour created by Imagen 4. Image: Google

Imagen 4 will be available on May 20th in the Gemini app, Whisk, and Vertex AI, as well as in Slides, Vids, Docs, “and more in Workspace,” Collins says. Also, Google plans to launch a “fast variant” of Imagen 4 sometime “soon,” which it says is “up to 10x faster than Imagen 3.”
