This is the new Google AI that creates 3D worlds in real time from text or images

Google DeepMind has announced the new version of its ‘world model’, Genie 3. A ‘world model’ is a type of generative AI that creates 3D environments with which a user, whether a human or an artificial intelligence agent, can interact.

Genie 1 and Genie 2, launched in early and late 2024 respectively, seemed geared more toward generating video game worlds, given their limitations and the aesthetics of their results. Their successor represents an important step forward in realism, in the consistency of the created world, and in the ability to interact with it.

With Genie 3, it is enough to upload an image or enter a written instruction to create the virtual world. Unlike a video game, which is built from assets created by artists and developers, here artificial intelligence does everything. In addition, the environment that Genie 3 continuously generates can be modified on the fly (introducing new characters, changing objects or changing the weather) through new text instructions. The examples presented by Google give an idea of the tool’s versatility.

https://www.youtube.com/watch?v=pdkhuknuqdg

Compared with Genie 2, DeepMind’s new AI uses a resolution of 720p (360p in its predecessor) at 24 frames per second, shows a greater capacity for navigation and interaction, offers the aforementioned possibility of modifying the world at any time, and raises the interaction horizon from 8 seconds to ‘multiple minutes’.

The video in which a user paints a wall, moves away until it is out of frame, and then returns to find that the earlier brushstrokes are still there is an illustrative example of this capacity.

The created world can also be explored for longer, although Google has not specified the number of minutes; it says only ‘a few’, up from the single minute that Genie 2 reached.

Although the first versions of Genie were focused on video game creation, Google’s aspirations are now greater. In addition to entertainment purposes, DeepMind presents it as a research tool and a means of training robots and AI agents.

One of the problems AI companies face is the scarcity of new training data. After feeding their models practically all existing websites and videos, researchers are turning to synthetic data for multiple uses. DeepMind believes that world models can be key to this new approach, since they make it possible to train agents in virtually unlimited interactive worlds.

For all that is surprising about Genie 3, it also has its problems. Besides the limitations already mentioned, it continues to generate incorrect elements in the video, and any text it renders is illegible.

There are also limits on how AI agents interact with these worlds. Although environments and events with realistic conditions can be created, agents cannot modify them. Their role is reduced to moving around the simulated world, since they do not yet have the ability to influence it. DeepMind continues to experiment with the possibility of several agents interacting with each other in the same environment.

Genie 3, which presumably requires very substantial computing power, is not available to the general consumer, but Google will grant access to ‘a small group of academics and creators’ who will help refine the model. The intention is to increase availability in the future.