OpenAI has unveiled Sora, its first text-to-video model.
In a series of social media posts, OpenAI co-founder and CEO Sam Altman announced the new model and said the company will begin providing access to a “limited number of creators” today.
According to the OpenAI website, Sora can create video clips up to a minute long in a wide range of artistic styles, including highly realistic human faces.
The model can generate complex scenes with multiple characters, specific types of movement, and intricate details of both the subject and the background.
Sora can also generate a video from a still image, or take an existing video and either extend it or fill in missing frames.
According to OpenAI, the model's weaknesses include difficulty simulating the physics of a complex scene, an incomplete grasp of cause and effect, and confusion over spatial details in a prompt, such as mixing up left and right.
So if your video shows, for instance, a runner moving backwards, you may want to try generating it again. The model also cannot create sound to accompany the video.
As for preventing malicious misuse of the technology, OpenAI says that if Sora is integrated into an OpenAI product, it will include C2PA metadata, which can help people trace the “origin” of a video and verify that it was generated by artificial intelligence.
If Sora is incorporated into an OpenAI offering, a text classifier will also screen prompts before they are processed and filter out any requests that contain “extreme violence, sexual content, hateful imagery, celebrity likenesses, or other people’s intellectual property.”
Once the video is created, another classifier will review it to check for any prohibited content.
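The two-stage screening described above, with one check on the prompt before generation and another on the finished video, can be illustrated with a minimal Python sketch. This is not OpenAI's implementation: the function names, the keyword-based checks, and the placeholder generator are all hypothetical stand-ins, chosen only to show the order in which the checks run.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical interface: a classifier returns the name of the violated
# policy category, or None if the input is acceptable.
Classifier = Callable[[str], Optional[str]]


@dataclass
class Result:
    video: Optional[str]        # stand-in for generated video data
    rejected_by: Optional[str]  # "prompt", "output", or None
    reason: Optional[str]       # violated category, if any


def toy_prompt_classifier(prompt: str) -> Optional[str]:
    """Toy keyword check standing in for a real text classifier."""
    banned = {"gore": "extreme violence", "celebrity": "celebrity likenesses"}
    lowered = prompt.lower()
    for keyword, category in banned.items():
        if keyword in lowered:
            return category
    return None


def toy_output_classifier(video: str) -> Optional[str]:
    """Toy check standing in for a classifier that inspects the video."""
    return "prohibited content" if "UNSAFE" in video else None


def toy_generate(prompt: str) -> str:
    """Placeholder generator; a real system would call the video model."""
    return f"<video for: {prompt}>"


def moderated_generate(
    prompt: str,
    check_prompt: Classifier = toy_prompt_classifier,
    generate: Callable[[str], str] = toy_generate,
    check_output: Classifier = toy_output_classifier,
) -> Result:
    # Stage 1: screen the prompt before any generation happens.
    reason = check_prompt(prompt)
    if reason is not None:
        return Result(None, "prompt", reason)

    # Stage 2: generate, then review the finished video.
    video = generate(prompt)
    reason = check_output(video)
    if reason is not None:
        return Result(None, "output", reason)

    return Result(video, None, None)
```

In this sketch, a rejected prompt never reaches the generator, while the second check catches prohibited content that a harmless-looking prompt might still produce.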
The small group of creators with early access to Sora includes “visual artists, designers, and filmmakers,” who will provide feedback to make the model more useful for creative professionals.
Artificial intelligence researchers will also gain access to the model for red teaming, a process in which teams probe the model for errors and flaws in order to improve it.
Altman has not disclosed when Sora will be made available to the public. The model has obvious potential applications for businesses, such as creating advertisements and producing content for presentations and social media.
Altman asked social media users to suggest captions for videos, saying he would generate and share the results to give people a sense of Sora’s capabilities.
His first creation was described as “a wizard wearing a pointed hat and a blue robe adorned with white stars, casting a spell that releases lightning from one hand while holding an ancient book in the other.”