After the explosion of artificial intelligence (AI) image generation at the end of 2022 and in 2023, several analyses predicted that 2024 would be video's turn. Until now, publicly available AI video generation tools have returned limited results, but on February 15, 2024, OpenAI, the company responsible for ChatGPT and DALL-E, presented Sora, a tool that generates videos from text instructions.
With this application you can create videos from scratch or animate existing images. Its results have surprised people and generated reactions on social networks, but for now it has limitations and its use is restricted to selected experts. As with any technology of this kind, questions arise about how it could be used fraudulently or to spread disinformation. Here is what we know about Sora so far.
Sora is a video generation application being developed by OpenAI, the company behind ChatGPT, among other products, that allows you to create moving images from text instructions, photographs or other videos. For the first option, we can enter a description of a woman walking through the streets of Tokyo or a young man reading a book on top of a cloud, and the application will generate a short video clip that matches those characteristics.
It is also capable of animating a static image or extending a video that has already been recorded, according to its creators' website.
How is it capable of generating moving images? How is it different from DALL-E?
Sora, like DALL-E, another tool from the same company, is a diffusion model: a program trained to relate the patterns of an image to certain words and to reconstruct them. Thanks to this process, it is able to generate an image that matches a text description, such as a Wild West town.
The difference from other image generation tools is that Sora, instead of processing tokens (numerical representations of fragments of a word or an image), uses what are known as patches: video fragments that include information such as the duration and movement of the pixels that appear on screen.
By analyzing not only information about the colors or patterns of a photograph, but also how the elements of a video change position over a certain period of time, the program is capable of recreating moving images. In this article you can find more information about how Sora works and what patches are.
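As a rough illustration of the idea, a video can be thought of as a stack of frames that is cut into small blocks spanning a few consecutive frames and a small region of pixels, with each block flattened into a vector the model can process, much like a word token. The sketch below shows this conceptually in Python; it is not OpenAI's code, and the patch sizes are arbitrary assumptions.

```python
# Conceptual sketch (not OpenAI's code): turning a short video into
# "spacetime patches", the video analogue of text tokens.
import numpy as np

def video_to_patches(video, t=4, h=16, w=16):
    """Split a video of shape (frames, height, width, channels)
    into non-overlapping spacetime patches and flatten each one.

    Each patch covers `t` consecutive frames and an `h` x `w` pixel
    region, so it captures both appearance and motion over time.
    """
    frames, height, width, channels = video.shape
    # Trim so the video divides evenly into patches (for simplicity).
    video = video[: frames // t * t, : height // h * h, : width // w * w, :]
    frames, height, width, _ = video.shape

    patches = (
        video.reshape(frames // t, t, height // h, h, width // w, w, channels)
        .transpose(0, 2, 4, 1, 3, 5, 6)      # group the patch indices together
        .reshape(-1, t * h * w * channels)   # one flat vector per patch
    )
    return patches

# Example: a random 16-frame, 64x64 RGB clip becomes a sequence of patches
# that a transformer-style model could process, much like word tokens.
clip = np.random.rand(16, 64, 64, 3)
print(video_to_patches(clip).shape)  # (64, 3072)
```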
For now, Sora is not available to the public and no date has been announced for its official release. According to OpenAI, it has shared the progress of this tool to "give the public an idea of the artificial intelligence capabilities that are on the horizon."
At the moment, only certain researchers and professionals selected by OpenAI have access to this tool, in order to study its risks and the threats it may pose in areas such as disinformation, bias or hate speech. The company says it has also granted access to some artists and audiovisual experts to study how the tool could be used within the creative industry.
Although Sora is not available to the general public, OpenAI has detailed that it currently has some limitations. The first is that it can only generate videos up to one minute long, and they are limited to the visual component: the images are not accompanied by audio.
Other limitations have to do with physics and with how elements in the videos interact. For example, it cannot convincingly recreate a glass falling to the ground, it has difficulty showing people and objects interacting with one another, and, at times, elements on screen end up superimposed, as happens in this example with the animals.
This tool also has trouble understanding the direction of some movements (it can confuse forward with backward) and following camera motion. Some videos show more errors than others, such as this one released by Sam Altman, the company's CEO, in which the characters walk backwards and overlap one another.
Experts warn of the risk that such a tool could pose for the spread of deepfakes (hyperrealistic videos created with AI), and some social media users point out that it could be used to generate sexual content without the consent of the victims, something that is already happening with other AI tools.
OpenAI has assured that Sora has protection mechanisms that prevent it from generating violent or sexual content or from impersonating celebrities without their consent. According to the company, the application will use the same system as DALL-E to reject instructions that may violate its usage policies. But previous research shows that there are ways to craft prompts that bypass these security measures and obtain fraudulent or improper results.
OpenAI says it will review every frame of the videos Sora generates before showing them to the user, but it does not give more information about how this system will work and, since the tool has not been released to the public, its effectiveness has not yet been demonstrated. There is also the risk that these types of tools develop biases or are used fraudulently to commit scams.
The company has also highlighted that, in the future, it will implement protocols adopted by the Coalition for Content Provenance and Authenticity (C2PA), such as a credential that makes it possible to verify the origin of a video created with Sora. Other tools, such as watermarks, propose a similar idea, but they are not a definitive solution.
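The general idea behind such credentials is to cryptographically bind information about a file's origin to the file itself, so that any later modification can be detected. The following minimal sketch illustrates that principle only; it is a toy example with a made-up signing key, not the actual C2PA specification, which relies on signed manifests and certificate chains.

```python
# Toy illustration of binding provenance metadata to a file so that
# tampering can be detected. Not C2PA; just the underlying principle.
import hashlib
import hmac
import json

SECRET_KEY = b"demo-signing-key"  # stand-in for the creator's private key

def attach_credential(video_bytes: bytes, metadata: dict) -> dict:
    """Return a credential that ties the metadata to this exact file."""
    payload = json.dumps(
        {"sha256": hashlib.sha256(video_bytes).hexdigest(), **metadata},
        sort_keys=True,
    ).encode()
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload.decode(), "tag": tag}

def verify_credential(video_bytes: bytes, credential: dict) -> bool:
    """Check that the credential is authentic and the file is unchanged."""
    expected_tag = hmac.new(
        SECRET_KEY, credential["payload"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected_tag, credential["tag"]):
        return False  # credential was forged or altered
    claimed_hash = json.loads(credential["payload"])["sha256"]
    return claimed_hash == hashlib.sha256(video_bytes).hexdigest()

video = b"...video bytes..."
cred = attach_credential(video, {"generator": "AI video tool", "date": "2024-02-15"})
print(verify_credential(video, cred))            # True
print(verify_credential(video + b"edit", cred))  # False: file was modified
```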
Images and videos can be a great source of misinformation if they are manipulated or shared without proper context. We see this daily with the misinformation that goes viral in relation to the Russian invasion of Ukraine, the COVID-19 pandemic or any other topic. This was already happening before generative AI became popular, but this technology is a new tool to generate content that can misinform.
We have seen this with the democratization of AI image generation tools: at Factchequeado we have debunked misinformation about Pope Francis, politicians and activists that was spread as real imagery but was generated with AI, as well as disinformation about the attacks between Hamas and Israel. In the case of video, we have seen it with AI-manipulated videos of Elon Musk or Volodymyr Zelensky.
Until now, we said that deepfakes were not yet the main problem, since they were not within everyone's reach on the internet; this type of cheapfake was. Now, a tool like Sora, with fast and more polished, refined results, can make it increasingly difficult to identify what is real and what is made with AI. The details we could once focus on will become obsolete, just as has happened with AI-generated images.
This can pose a risk of misinformation, as other experts have also noted. For example, among the first examples OpenAI released to show Sora's capabilities are supposed recordings of mammoths, historical recordings of California during the Wild West era, images from news programs and other events that never took place: images that could be used to fuel conspiracy theories or to create historical-looking scenes of events that did not happen, as has already occurred with images created with applications such as Midjourney or DALL-E. Even if they are not generated with that intention, once content reaches the internet, we lose control over how it is shared.
OpenAI has indicated that it is working with specialized teams to reduce the misinformation risks this tool may pose and to check whether Sora's security measures can be circumvented. However, the company has not detailed which experts it is collaborating with.
This is an article in partnership with Factchequeado, a verification outlet that is building a Spanish-speaking community to counter disinformation in Spanish in the United States. Do you want to be part of it? Join and verify the content you receive by sending it to our WhatsApp at +16468736087 or to factchequeado.com/whatsapp.