Google Creates Impressive AI-Powered Tool That Generates Video From Text Inputs

Following in the footsteps of Meta and OpenAI’s DALL-E, Google has unveiled an impressive AI-powered tool of its own, one that generates video from text prompts.

And if you thought Meta’s Mark Zuckerberg was the only one ahead of the game in this regard, well, here’s some news for you.

The announcement came as researchers at Google Brain, the company’s highly regarded AI lab, went public with Imagen Video. The tool is designed to produce realistic clips that go beyond classic still pictures, and the resulting videos stay consistent from frame to frame.

In a recent paper, Google says Imagen Video can produce high-quality videos with a high degree of controllability. It can generate diverse videos and text animations in a range of styles, and it shows an understanding of 3D objects.


On average, we’re talking about a tool that can produce videos around 5 seconds long at a resolution of 1280 by 768, running at 24 frames per second. The program comes from researchers who train computer models to better understand still images and videos.
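To put those figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above (5 seconds, 24 frames per second, 1280 by 768 pixels). The function name and the assumption of 3 bytes per pixel (uncompressed 8-bit RGB) are ours for illustration and are not details from Google.

```python
def clip_stats(seconds, fps, width, height, bytes_per_pixel=3):
    """Return (frame count, pixels per frame, raw size in bytes) for a clip.

    bytes_per_pixel=3 assumes uncompressed 8-bit RGB, an illustrative
    assumption rather than anything stated about Imagen Video.
    """
    frames = seconds * fps
    pixels_per_frame = width * height
    raw_bytes = frames * pixels_per_frame * bytes_per_pixel
    return frames, pixels_per_frame, raw_bytes


# Figures as reported in the article: 5 s, 24 fps, 1280 by 768.
frames, pixels_per_frame, raw_bytes = clip_stats(5, 24, 1280, 768)

print(frames)                     # 120 frames per clip
print(pixels_per_frame)           # 983040 pixels in each frame
print(round(raw_bytes / 1e6, 1))  # 353.9 (megabytes, uncompressed)
```

In other words, a single 5-second clip means generating 120 frames of nearly a million pixels each, which helps explain why keeping every frame consistent is the hard part.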

These come labeled with a text description. Once trained, the model generates a video to match whatever text prompt it is given. It trains on both video data and images. But remember, the results are far from perfect.

In a few samples Google recently published, we got to see that the model struggles with complex movement. For instance, clips of a panda eating bamboo shoots or of ships sailing at sea were hard for it to get right.

That aside, we feel this is the best video creation tool out there. Did we mention that a video takes less than a minute to create?

Google is refraining from releasing more details about the exact technology behind the project. It is also putting safeguards in place to stop the tool from producing content that is fake, harmful, or explicit.

There is a major concern about stereotypes, since the models were trained on a limited set of data. And while plenty of violent and derogatory content does get filtered out, biases and stereotypes remain, and they can be hard to detect and remove.

Therefore, Google says it is not fully satisfied with the results and hopes to carry out a proper launch only once these issues are resolved.
