Runway releases an impressive new video-generating AI model

AI startup Runway on Monday released what it claims is one of the highest-fidelity AI-powered video generators yet.

Called Gen-4, the model is rolling out to the company’s individual and enterprise customers. Runway claims that it can generate consistent characters, locations, and objects across scenes, maintain “coherent world environments,” and regenerate elements from different perspectives and positions within scenes. — Read More

#vfx

MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning

Existing long-form video generation frameworks lack automated planning, requiring manual input for storylines, scenes, cinematography, and character interactions, resulting in high costs and inefficiencies. To address these challenges, we present MovieAgent, an automated movie generation framework based on multi-agent Chain of Thought (CoT) planning. MovieAgent offers two key advantages: 1) We first explore and define the paradigm of automated movie/long-video generation. Given a script and character bank, our MovieAgent can generate multi-scene, multi-shot long-form videos with a coherent narrative, while ensuring character consistency, synchronized subtitles, and stable audio throughout the film. 2) MovieAgent introduces a hierarchical CoT-based reasoning process to automatically structure scenes, camera settings, and cinematography, significantly reducing human effort. By employing multiple LLM agents to simulate the roles of a director, screenwriter, storyboard artist, and location manager, MovieAgent streamlines the production pipeline. Experiments demonstrate that MovieAgent achieves new state-of-the-art results in script faithfulness, character consistency, and narrative coherence. Our hierarchical framework takes a step forward and provides new insights into fully automated movie generation. — Read More

#vfx

DynVFX: Augmenting Real Videos with Dynamic Content

We present a method for augmenting real-world videos with newly generated dynamic content. Given an input video and a simple user-provided text instruction describing the desired content, our method synthesizes dynamic objects or complex scene effects that naturally interact with the existing scene over time. The position, appearance, and motion of the new content are seamlessly integrated into the original footage while accounting for camera motion, occlusions, and interactions with other dynamic objects in the scene, resulting in a cohesive and realistic output video. We achieve this via a zero-shot, training-free framework that harnesses a pre-trained transformer-based text-to-video diffusion model to synthesize the new content and a pre-trained Vision Language Model to envision the augmented scene in detail. Specifically, we introduce a novel sampling-based method that manipulates features within the attention mechanism, enabling accurate localization and seamless integration of the new content while preserving the integrity of the original scene. Our method is fully automated, requiring only a simple user instruction. We demonstrate its effectiveness on a wide range of edits applied to real-world videos, encompassing diverse objects and scenarios involving both camera and object motion. — Read More

#vfx

Avataar releases new tool to create AI-generated videos for products

Generative AI models have reached a baseline capability of producing at least a passable video from a single image or short sentence. Companies building products around these models are claiming that anyone can make a snazzy promo video if they have some images or recordings — and videos usually perform better than static images or documents.

Peak XV and Tiger Global-backed Avataar released a new tool on Monday called Velocity. It creates product videos directly based on a product link. The company would be going against the likes of Amazon and Google, which are also experimenting with AI-powered video tools for ads. — Read More

#vfx

HunyuanVideo

We present HunyuanVideo, a novel open-source video foundation model that exhibits performance in video generation that is comparable to, if not superior to, leading closed-source models. To train the HunyuanVideo model, we adopt several key technologies for model learning, including data curation, image-video joint model training, and an efficient infrastructure designed to facilitate large-scale model training and inference. Additionally, through an effective strategy for scaling model architecture and dataset, we successfully trained a video generative model with over 13 billion parameters, making it the largest among all open-source models. — Read More

#vfx

Google Veo 2 Demo – The Best AI Video Model Yet

Read More

#vfx, #videos

I Went to the Premiere of the First Commercially Streaming AI-Generated Movies

Movies are supposed to transport you places. At the end of last month, I was sitting in the Chinese Theater, one of the most iconic movie theaters in Hollywood, in the same complex where the Oscars are held. And as I was watching the movie, I found myself transported to the past, thinking about one of my biggest regrets. When I was in high school, I went to a theater to watch a screening of a movie one of my classmates had made. I was 14 years old, and I reviewed it for the school newspaper. I savaged the film’s special effects, which were done by hand with love and care by someone my own age, and were lightyears better than anything I could do. I had no idea what I was talking about, how special effects were made, or how to review a movie. The student who made the film rightfully hated me, and I have felt bad about what I wrote ever since. 

So, 20 years later, I’m sitting in the Chinese Theater watching AI-generated movies in which the directors sometimes cannot make the characters consistently look the same, or make audio sync with lips in a natural-seeming way, and I am thinking about the emotions these films are giving me. The emotion that I feel most strongly is “guilt,” because I know there is no way to write about what I am watching without explaining that these are bad films, and I cannot believe that they are going to be imminently commercially released, and the people who made them are all sitting around me.

Then I remembered that I am not watching student films made with love by an enthusiastic high school student. I am watching films that were made for TCL, the largest TV manufacturer on Earth, as part of a pilot program designed to normalize AI movies and TV shows for an audience that it plans to monetize explicitly with targeted advertising and whose internal data suggests that the people who watch its free television streaming network are too lazy to change the channel. I know this is the plan because TCL’s executives just told the audience that this is the plan. – Read More

#vfx

Tencent Hunyuan-Video: Best text-video generation model

Since OpenAI announced Sora, Chinese tech companies have picked up the pace and released many text-video models, including CogVideoX, MiniMax, and Kling.

The latest release in the text-video space is Tencent’s Hunyuan-video, which is not only open source but also holds the top rank among text-video models, beating Gen3 and Luma.

The model looks perfect and can even generate audio for videos (so no more voiceless video generation). — Read More

#china-ai, #vfx

OpenAI’s Sora video generator appears to have leaked

A group appears to have leaked access to Sora, OpenAI’s video generator, in protest of what it’s calling duplicity and “art washing” on OpenAI’s part.

On Tuesday, the group published a project on the AI dev platform Hugging Face seemingly connected to OpenAI’s Sora API, which isn’t yet publicly available. Using their authentication tokens — presumably from an early access system — the group created a front end that lets users generate videos with Sora. — Read More

#vfx

This AI-generated version of Minecraft may represent the future of real-time video generation

The game was created from clips and keyboard inputs alone, as a demo for real-time interactive video generation.

When you walk around in a version of the video game Minecraft from the AI companies Decart and Etched, it feels a little off. Sure, you can move forward, cut down a tree, and lay down a dirt block, just like in the real thing. If you turn around, though, the dirt block you just placed may have morphed into a totally new environment. That doesn’t happen in Minecraft. But this new version is entirely AI-generated, so it’s prone to hallucinations. Not a single line of code was written.

For Decart and Etched, this demo is a proof of concept. They imagine that the technology could be used for real-time generation of videos or video games more generally. “Your screen can turn into a portal—into some imaginary world that doesn’t need to be coded, that can be changed on the fly. And that’s really what we’re trying to target here,” says Dean Leitersdorf, cofounder and CEO of Decart, which came out of stealth this week. — Read More

#vfx