Tag Archives: VFX
Will OpenAI’s Critterz make or break AI filmmaking?
You may have missed the AI movie Critterz when it appeared as a short animation a couple of years ago. It didn’t exactly set the world on fire, with comments on YouTube including “I’d call this garbage, but that’d be an insult to garbage” and “This was the worst 5 minutes I will never get back”.
Nevertheless, it seems OpenAI, the maker of ChatGPT, saw potential in the ‘nature documentary turned comedy’. It’s putting its name behind the experimental short’s expansion into a feature-length movie intended to debut at the Cannes Film Festival in May 2026, followed by a full cinema release. Will it show that AI is ready to take on Hollywood and slash the costs of filmmaking, or will it do the opposite, as the ‘Netflix of AI’ Showrunner did? — Read More
OpenAI Is Bringing an AI-Driven Feature-Length Animated Movie to Cannes
You knew it was bound to happen, and now, it has. The Wall Street Journal reports that OpenAI is lending its services to the production of a feature-length animated film called Critterz, which is aiming to be finished in time for next year’s Cannes Film Festival. That would put its production time at nine months, which is unheard of for a feature-length animated film, but that’s because it’ll be created using AI.
According to the paper, using OpenAI’s resources, production companies Vertigo Films and Native Foreign will hire actors to voice characters created by feeding original drawings into generative AI software. The entire film is expected to cost less than $30 million and to require only about 30 people to complete. — Read More
AI virtual personality YouTubers, or ‘VTubers,’ are earning millions
One of the most popular gaming YouTubers is named Bloo, but he isn’t a human — he’s a VTuber, a fully virtual personality powered by artificial intelligence.
VTubers first gained traction in Japan in the 2010s. Now, advances in AI are making it easier than ever to create VTubers, fueling a new wave of virtual creators on YouTube.
As AI-generated content becomes more common online, concerns about its impact are growing, especially as it becomes easier to generate convincing but entirely AI-fabricated videos. — Read More
How to Become a VTuber
One-Minute Video Generation with Test-Time Training
Transformers today still struggle to generate one-minute videos because self-attention layers are inefficient over long contexts. Alternatives such as Mamba layers struggle with complex multi-scene stories because their hidden states are less expressive. We experiment with Test-Time Training (TTT) layers, whose hidden states can themselves be neural networks and are therefore more expressive. Adding TTT layers to a pre-trained Transformer enables it to generate one-minute videos from text storyboards. As a proof of concept, we curate a dataset based on Tom and Jerry cartoons. Compared to baselines such as Mamba 2, Gated DeltaNet, and sliding-window attention layers, TTT layers generate much more coherent videos that tell complex stories, leading by 34 Elo points in a human evaluation of 100 videos per method. Although promising, the results still contain artifacts, likely due to the limited capability of the pre-trained 5B model. The efficiency of our implementation can also be improved. We have only experimented with one-minute videos due to resource constraints, but the approach can be extended to longer videos and more complex stories. — Read More
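The core idea is compact enough to sketch. In a TTT layer, the hidden state is itself a small model whose weights are updated by a gradient step of a self-supervised loss on each incoming token, and the updated model then produces that token’s output. Below is a minimal, hedged PyTorch sketch of this idea; the linear inner model, the reconstruction loss, and the sequential loop are illustrative simplifications, not the paper’s implementation (which batches the inner updates for efficiency).

```python
import torch

class TTTLinearLayer(torch.nn.Module):
    """Minimal sketch of a Test-Time Training (TTT) layer.

    The "hidden state" is itself a tiny model (a single linear map W) whose
    weights are updated by one gradient step of a self-supervised
    reconstruction loss per token. All names and the loss are illustrative
    simplifications, not the authors' code.
    """

    def __init__(self, dim: int, lr: float = 0.1):
        super().__init__()
        self.dim = dim
        self.lr = lr
        self.proj_k = torch.nn.Linear(dim, dim)  # input view fed to the inner model
        self.proj_v = torch.nn.Linear(dim, dim)  # target view the inner model reconstructs
        self.proj_q = torch.nn.Linear(dim, dim)  # query view used to read the output

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, dim), processed sequentially for clarity.
        W = torch.zeros(self.dim, self.dim)      # inner model's weights = the hidden state
        outputs = []
        for t in range(x.shape[0]):
            k, v, q = self.proj_k(x[t]), self.proj_v(x[t]), self.proj_q(x[t])
            # Inner-loop self-supervised loss: reconstruct v from k.
            err = k @ W - v                       # gradient of 0.5 * ||kW - v||^2 w.r.t. W is outer(k, err)
            W = W - self.lr * torch.outer(k, err) # one "test-time training" step
            outputs.append(q @ W)                 # read out with the updated hidden state
        return torch.stack(outputs)

layer = TTTLinearLayer(dim=64)
y = layer(torch.randn(128, 64))  # (seq_len, dim) -> (seq_len, dim)
```

The per-token gradient step is what distinguishes this from a fixed recurrence: the state is literally trained on the context as it streams in, which is the source of the extra expressiveness the abstract claims over state-space layers.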
Runway releases an impressive new video-generating AI model
AI startup Runway on Monday released what it claims is one of the highest-fidelity AI-powered video generators yet.
Called Gen-4, the model is rolling out to the company’s individual and enterprise customers. Runway claims that it can generate consistent characters, locations, and objects across scenes, maintain “coherent world environments,” and regenerate elements from different perspectives and positions within scenes. — Read More
MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning
Existing long-form video generation frameworks lack automated planning, requiring manual input for storylines, scenes, cinematography, and character interactions, which results in high costs and inefficiencies. To address these challenges, we present MovieAgent, an automated movie-generation framework built on multi-agent Chain of Thought (CoT) planning. MovieAgent offers two key advantages: 1) We are the first to explore and define the paradigm of automated movie/long-video generation. Given a script and a character bank, MovieAgent can generate multi-scene, multi-shot long-form videos with a coherent narrative, while ensuring character consistency, synchronized subtitles, and stable audio throughout the film. 2) MovieAgent introduces a hierarchical CoT-based reasoning process to automatically structure scenes, camera settings, and cinematography, significantly reducing human effort. By employing multiple LLM agents to simulate the roles of a director, screenwriter, storyboard artist, and location manager, MovieAgent streamlines the production pipeline. Experiments demonstrate that MovieAgent achieves new state-of-the-art results in script faithfulness, character consistency, and narrative coherence. Our hierarchical framework takes a step forward and provides new insights into fully automated movie generation. — Read More
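To make the hierarchy concrete, here is a hedged Python sketch of script-to-scenes-to-shots planning with role-conditioned LLM calls. Everything in it (the `call_llm` stub, its canned replies, the role prompts, and the `Scene`/`Shot` data classes) is a hypothetical illustration of the pattern the abstract describes, not MovieAgent’s actual agents or prompts.

```python
from dataclasses import dataclass, field

def call_llm(role: str, prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion call conditioned on a role
    persona. Returns canned text so the sketch runs end to end; swap in a
    real LLM client here."""
    if role == "director":
        return "Opening scene in the forest\nChase through the village"
    if role == "location manager":
        return "misty forest clearing"
    return ("wide establishing shot | slow dolly-in\n"
            "close-up on the hero | handheld")

@dataclass
class Shot:
    description: str
    camera: str

@dataclass
class Scene:
    summary: str
    location: str = ""
    shots: list[Shot] = field(default_factory=list)

def plan_movie(script: str, character_bank: dict[str, str]) -> list[Scene]:
    # Level 1: a "director" agent decomposes the script into scenes.
    scene_lines = call_llm(
        "director", f"Split this script into scenes, one per line:\n{script}"
    ).splitlines()
    scenes = [Scene(summary=s) for s in scene_lines if s.strip()]

    for scene in scenes:
        # Level 2: a "location manager" agent grounds the scene in a setting.
        scene.location = call_llm(
            "location manager", f"Pick a location for: {scene.summary}"
        )
        # Level 3: a "storyboard artist" agent expands the scene into shots,
        # conditioned on the character bank to keep characters consistent.
        for line in call_llm(
            "storyboard artist",
            f"Characters: {character_bank}\nScene: {scene.summary}\n"
            "List shots as 'description | camera setup', one per line.",
        ).splitlines():
            desc, _, camera = line.partition("|")
            scene.shots.append(Shot(desc.strip(), camera.strip()))
    return scenes  # each shot would then be handed to a video generator

plan = plan_movie("A fox outwits a pack of hounds.", {"Fox": "clever, amber-furred"})
```

The point of the hierarchy is that each agent reasons over a small, well-scoped subproblem, which is how the paper reports reducing manual effort across the pipeline.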
DynVFX: Augmenting Real Videos with Dynamic Content
We present a method for augmenting real-world videos with newly generated dynamic content. Given an input video and a simple user-provided text instruction describing the desired content, our method synthesizes dynamic objects or complex scene effects that naturally interact with the existing scene over time. The position, appearance, and motion of the new content are seamlessly integrated into the original footage while accounting for camera motion, occlusions, and interactions with other dynamic objects in the scene, resulting in a cohesive and realistic output video. We achieve this via a zero-shot, training-free framework that harnesses a pre-trained transformer-based text-to-video diffusion model to synthesize the new content and a pre-trained Vision Language Model to envision the augmented scene in detail. Specifically, we introduce a novel sampling-based method that manipulates features within the attention mechanism, enabling accurate localization and seamless integration of the new content while preserving the integrity of the original scene. Our method is fully automated, requiring only a simple user instruction. We demonstrate its effectiveness on a wide range of edits applied to real-world videos, encompassing diverse objects and scenarios involving both camera and object motion. — Read More
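The compositing step at the heart of the method is worth gesturing at in code. The toy sketch below blends attention features from the generation pass with features extracted from the original video, which is one way to read the abstract’s “manipulates features within the attention mechanism”; the explicit `edit_mask`, the tensor shapes, and the function name are all assumptions made to keep the sketch self-contained (the actual method localizes the edit automatically rather than taking a mask as input).

```python
import torch

def blend_attention_features(gen_feats: torch.Tensor,
                             orig_feats: torch.Tensor,
                             edit_mask: torch.Tensor) -> torch.Tensor:
    """Hedged sketch of the core compositing idea: inside a pre-trained
    text-to-video diffusion model, keep the original video's attention
    features wherever the scene should stay untouched, and let newly
    generated features through only where the new content belongs.

    gen_feats / orig_feats: (tokens, dim) features from one attention layer.
    edit_mask: (tokens, 1) in [0, 1], ~1 where new content is inserted.
    """
    return edit_mask * gen_feats + (1.0 - edit_mask) * orig_feats

# Toy usage: 16 spatio-temporal tokens, 8-dim features, edit the first 4 tokens.
gen = torch.randn(16, 8)
orig = torch.randn(16, 8)
mask = torch.zeros(16, 1)
mask[:4] = 1.0
blended = blend_attention_features(gen, orig, mask)
```

Blending at the feature level, rather than pasting pixels, is what lets the inserted content inherit the original footage’s camera motion, occlusions, and lighting instead of floating on top of it.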
Avataar releases new tool to create AI-generated videos for products
Generative AI models have reached a baseline capability of producing at least a passable video from a single image or short sentence. Companies building products around these models are claiming that anyone can make a snazzy promo video if they have some images or recordings — and videos usually perform better than static images or documents.
Peak XV- and Tiger Global-backed Avataar released a new tool on Monday called Velocity, which creates product videos directly from a product link. The company is going up against the likes of Amazon and Google, which are also experimenting with AI-powered video tools for ads. — Read More
HunyuanVideo
We present HunyuanVideo, a novel open-source video foundation model whose video-generation performance is comparable to, if not better than, that of leading closed-source models. To train HunyuanVideo, we adopt several key technologies, including data curation, joint image-video model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling the model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, the largest among all open-source models. — Read More