OpenAI's Sora: A New Frontier in Video Generation Technology

By Staff Writer, 16 February 2024

OpenAI, the company that made waves in the tech world with ChatGPT, is now venturing into video generation with its latest offering, Sora.

Unveiled on Thursday, Sora represents OpenAI's latest foray into generative AI models.

Sora operates on a premise similar to that of DALL-E, OpenAI's well-known image-generation tool.

Users input their desired scene, and Sora responds by producing a high-definition video clip.

Sora can also generate video from still images, extend existing videos, and fill in missing frames.

The emergence of video as a focus for generative AI follows the path paved by chatbots and image generators, which have already permeated both consumer and business domains.

While the creative potential of these advancements excites enthusiasts, they also raise concerns about the proliferation of misinformation, particularly with significant political events on the horizon.

Data from Clarity, a machine learning firm, indicates that the number of AI-generated deepfakes has increased 900% year over year.

OpenAI's entry into the video-generation arena positions it in competition with industry players like Meta and Google, the latter having unveiled its own video-generation AI tool, Lumiere, in January.

Other startups, including Stability AI with its Stable Video Diffusion product, offer similar AI-driven video generation capabilities.

Amazon, too, has joined the fray with Create with Alexa, a model tailored for generating prompt-based short-form animated content for children.

At present, Sora is limited to producing videos of one minute or less.

OpenAI, with the backing of tech giant Microsoft, is striving toward multimodality — the integration of text, image, and video generation — as part of its broader initiative to offer a diverse array of AI models.

OpenAI's Chief Operating Officer, Brad Lightcap, emphasized the significance of multimodal capabilities, stating, "The world is multimodal... the world is much bigger than text." He underscored the necessity of expanding beyond single modalities like text and code to fully leverage the potential of AI models.

While Sora has undergone testing by a select group of safety evaluators, dubbed "red teamers," to identify vulnerabilities related to misinformation and bias, the company has yet to publicly demonstrate its capabilities beyond 10 sample clips available on its website.

A technical paper accompanying Sora's release is expected to be published later on Thursday, shedding further light on its workings.

Source: Hayden Field / CNBC
