AI Gymnastics Video Goes Viral—And Not for the Right Reasons!
2024-12-13
Author: Wei
On Wednesday, a peculiar video from OpenAI's newly launched Sora AI video generator took social media by storm, showcasing a gymnast in a routine that took a turn for the bizarre—literally. In this viral clip, the athlete sprouts extra limbs and even experiences a brief loss of her head during what is intended to be an Olympic-level floor performance.
The chaotic synthesis errors in the video, affectionately dubbed "jabberwockies," show where AI video generators currently fall short and where they need to improve.
What's Happening in This Surreal Spectacle?
The video presents what appears to be a traditional gymnastics routine, but it quickly devolves into a whirlwind of unexpected transformations. Around the 9-second mark, the gymnast’s head detaches and then promptly reattaches, a moment strange enough to prompt both laughter and disbelief.
Insights from Venture Capitalist Deedy Das
Venture capitalist Deedy Das, who shared the video on X, commented, “As cool as the new Sora is, gymnastics remains the ultimate Turing test for AI video.” Reactions flooded in, with users poking fun at the absurdity. One wrote: “Hi, gymnastics expert here! This is not funny; gymnasts only do this when they’re in extreme distress.”
The Creation of the Video
In a conversation about the creation of the video, Das explained that he generated it using a comprehensive prompt—including specific movements and positions—crafted with the help of Anthropic's Claude AI. He noted that while the character consistency appeared improved compared to previous AI models, the results were still unsettling and far from accurate, stating, “Previously... the gymnast would just teleport away… but overall, it still looks downright horrifying.”
What Went Wrong?
The answer lies in how Sora learns to generate video. During training, OpenAI exposed the Sora model to countless gymnastics videos, correlating visual sequences with textual descriptions. But when it comes to generating coherent output for dynamic movements like gymnastics, Sora struggles. Current AI video models have no genuine understanding of physics or of the human body’s biomechanics; they rely purely on statistical correlations, which often produce nonsensical results when the requested motion is complex.
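To make the point about statistical correlations concrete, here is a deliberately tiny Python sketch. It is not how Sora works, and nothing in it comes from OpenAI; the pose labels, the training "routines," and the helper names are all hypothetical. It only illustrates the general failure mode: a model that knows which pose tends to follow which can stitch together a sequence in which every step looks familiar, even though the sequence as a whole never appeared in its training data and makes no physical sense.

```python
from collections import defaultdict

# Hypothetical training "videos": each is a sequence of coarse pose labels.
TRAINING_ROUTINES = [
    ["stand", "run", "jump", "tuck", "land", "stand"],
    ["stand", "run", "jump", "twist", "land", "stand"],
    ["handstand", "tuck", "roll", "stand"],
]


def learn_transitions(routines):
    """Record, for each pose, the set of poses that followed it in training."""
    transitions = defaultdict(set)
    for routine in routines:
        for current, following in zip(routine, routine[1:]):
            transitions[current].add(following)
    return transitions


def is_locally_plausible(sequence, transitions):
    """True if every adjacent pair of poses was seen somewhere in training."""
    return all(b in transitions.get(a, set())
               for a, b in zip(sequence, sequence[1:]))


def appears_in_training(sequence, routines):
    """True if the whole sequence occurs contiguously in some training routine."""
    n = len(sequence)
    return any(
        routine[i:i + n] == sequence
        for routine in routines
        for i in range(len(routine) - n + 1)
    )


transitions = learn_transitions(TRAINING_ROUTINES)

# A mid-air tuck that goes straight into a floor roll, with no landing.
# Each individual step was seen in training, but in different routines.
STITCHED = ["stand", "run", "jump", "tuck", "roll", "stand"]

print(is_locally_plausible(STITCHED, transitions))      # True
print(appears_in_training(STITCHED, TRAINING_ROUTINES))  # False
```

A real video model operates on pixels and far richer context than single pose labels, so this is only an analogy. The takeaway is that nothing in the learned statistics encodes the rule that a gymnast must land before rolling, just as nothing in them encodes that a head stays attached.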
Challenges Faced by AI Video Generators
This issue isn’t unique to Sora. Many AI video generators have faced similar challenges, as Runway's Gen-3 showed earlier this year when it produced disjointed and outlandish outputs. In our experiments with Hunyuan Video, a rising competitor to Sora, we saw similarly erratic twirling motions, even with simpler prompts.
AI Limitations in Original Content Creation
At their core, AI models based on transformer technology excel at imitating and transforming data but fall short when it comes to producing truly original content. Success depends heavily on the richness of the training data and the accuracy of the associated metadata. Although OpenAI used an AI vision model to improve the labeling of its training videos, the challenges of producing coherent gymnastics video are evidently far from overcome.
The Phenomenon of Confabulation
The errors in this gymnastics video are examples of what is termed confabulation, or “jabberwocky”: a nonsensical imitation that lacks coherence. From absurd AI-generated ads to cartoonish interpretations of celebrities, errors like these are a growing area of interest, and concern, within the field of AI video generation.
Looking Ahead: Improving AI Video Models
How can AI models be improved to prevent these bizarre errors? Experts suggest that, much like the breakthrough that brought language models to a notable level of capability, video synthesis will need equally vast datasets along with more computing power. There’s hope that progress in AI video synthesis will mirror the rapid advances seen in AI image generation, which went from rudimentary visuals to strikingly lifelike imagery.
The Path Forward
Until these improvements arrive, brace yourself for more of the jaw-dropping and unintentionally hilarious results emanating from AI video generation. The journey toward coherent, realistic AI content continues, and we’re here for every wild twist along the way!