New generative media tools Veo and Imagen 3 unveiled

Among the various announcements at this year’s Google I/O, the tech giant introduced two new generative media technologies: Veo, the latest video generation model, and Imagen 3, the most advanced text-to-image model yet.

According to Google, Veo can generate high-quality 1080p videos that run longer than a minute, in a variety of cinematic styles.

“With an advanced understanding of natural language and visual semantics, it generates video that closely reflects a user’s creative vision – accurately capturing the tone of a prompt and displaying details in longer prompts,” the company said in a blog post.

The model is also said to understand cinematic terms like “timelapse” or “aerial shots of a landscape” and create videos that are consistent and realistic. During the announcement, Google also showed its work with filmmaker Donald Glover and his experience using the model.

Google has rolled out Veo to select creators in private preview through VideoFX, and expects to bring it to YouTube Shorts and other products in the future.

Imagen 3, meanwhile, is Google’s latest text-to-image generation model, capable of photorealistic rendering with minimal visual artifacts, the company said.

“Imagen 3 better understands the natural language and intent behind your prompt and picks up small details from longer prompts. The model’s advanced understanding helps it master a range of styles,” it added.

Imagen 3 is said to excel at rendering text, something that has been a challenge for image generation models. Google believes this opens up opportunities for creating personalized content, such as birthday messages and title slides in a presentation.

Imagen 3 will be available to select creators in private preview through ImageFX, with plans for integration into Vertex AI in the near future.
