What is ‘Sora’, OpenAI’s revolutionary text-to-video tool that could make films in the future?

OpenAI

OpenAI, the creator of ChatGPT, has introduced an artificial intelligence model that generates realistic video from text prompts, drawing astonished reactions online.

The text-to-video model, dubbed Sora, has “a deep understanding of language” and can develop “compelling characters that express vibrant emotions,” according to a blog post published on Thursday by OpenAI.

Sora, which means sky in Japanese, is “able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” the Microsoft-backed startup said.

“The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”

The model appears to have a thorough grasp of a scene’s 3D space, as well as of its subjects, objects, and the relationships between them. It goes beyond animating 2D images: it maintains visual quality and character consistency while adhering to the user’s prompt. OpenAI CEO Sam Altman has previously stated that multimodality is the future of AI.

Altman asked people on X (formerly Twitter) to submit prompts for Sora and then posted the results, which included generated videos of two golden retrievers podcasting on top of a mountain, a grandmother preparing gnocchi, and sea creatures competing in a bicycle race on the ocean.

OpenAI sets strict safety measures for ‘Sora’ before public launch

OpenAI stated in a blog post that it would take several important safety steps before releasing Sora to the general public.

“We are working with red teamers – domain experts in areas like misinformation, hateful content, and bias – who will be adversarially testing the model,” the company said.

“We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.”

OpenAI also noted Sora’s flaws, which include trouble with continuity, with following a precise camera trajectory, and with distinguishing left from right. Sora sometimes generates physically implausible scenes.

“For example, a person may take a bite out of a cookie, but the cookie may no longer have a bite mark,” the San Francisco-based business explained.

OpenAI competitors Meta and Google have also demonstrated text-to-video AI technologies, but their models’ results are not as realistic as Sora’s.

Despite these limitations, Sora holds enormous potential. In five or ten years, filmmaking and self-expression could be democratized to the point where anyone can create their own cinematic universe. Video games could become far more realistic with the help of such models and AI upscaling. Safety and deepfakes, however, will remain critical concerns.
