Artificial Intelligence (AI) and Machine Learning (ML) are technologies that enterprises across industries have been keenly experimenting with, exploring the value they can bring. Is there AI adoption within the M&E industry? Can AI be the solution for enterprises seeking automation? Have we cracked the AI code, or do we have miles to go? If automation is a goal, it should be treated as a priority now.
Content recommendation (for OTT), speech-to-text, and media recognition are some of the initial applications that have been attempted. Clients find vendor demos impressive, but when they run a proof of concept (PoC) with their own content, the results are not. In video operations, frame accuracy is a necessity, and AI models struggle to deliver it universally. Getting such specific nuances right is what makes automation work. After trying multiple vendors, clients conclude that AI output is still not accurate enough to solve specific M&E use cases. However, they remain optimistic about future possibilities.
So where is the issue? Read More
A comprehensive guide to the state of the art in how AI is transforming the visual effects (VFX) industry
New machine learning techniques being pioneered at the major visual effects studios promise to transform the visual effects industry in a way not seen since the CGI revolution.
It’s over twenty-five years since the ground-breaking CGI effects of Jurassic Park usurped 100 years of visual effects tradition. When Steven Spielberg showed the first rushes of computer-generated dinosaurs to acclaimed traditional stop-motion animator Phil Tippett (who had been hired to create the dinosaurs the same way they had been made since the 1920s), he announced, “I think I’m extinct.” The line is so significant that it made it into the movie itself, spoken in reference to a paleontologist envisaging a world where no one would need him to theorize about dinosaurs any longer. Read More
New Artificial Intelligence Tools Will Revolutionize The Visual Effects Industry!
Renowned visual effects industry veteran Helena Packer, marking her 30th year working in the VFX arena, is developing new tools that harness the powerful advances in digital technology offered by Artificial Intelligence (AI) to shape the next era of the visual effects field. Read More
Hollywood is replacing artists with AI. Its future is bleak.
It took me an embarrassingly long time to realize that the “black mirror” of the popular anthology series Black Mirror was a screen, or rather, all the screens we surround ourselves with: phones, tablets, computers, TVs, and, increasingly, futuristic devices built by massive corporations that monitor our movements and preferences and words. We buy these black mirrors, welcoming them into our homes and lives and letting them — true to their name — reflect ourselves back to us. And as we know all too well, those reflections sometimes betray our darkest impulses.
Unsettling reflections are not the black mirrors’ fault. Gadgets are merely assemblages of wires and metal and glass. Devices don’t have a point of view; they operate according to the input they receive, the algorithms and designs and patterns that power the software, written by humans and thus shaded and slanted by human biases. Read More
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer
Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering. Enabling ML models to understand image formation might be key to generalization. However, due to an essential rasterization step involving discrete assignment operations, rendering pipelines are non-differentiable and thus largely inaccessible to gradient-based ML techniques. In this paper, we present DIB-R, a differentiable rendering framework which allows gradients to be analytically computed for all pixels in an image. Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions, and texture coordinates through a variety of lighting models. We showcase our approach in two ML applications: single-image 3D object prediction and 3D textured object generation, both trained exclusively with 2D supervision. Our project website is: https://nv-tlabs.github.io/DIB-R/ Read More
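The full formulation is in the paper, but the core idea of the foreground pass — rasterization as a weighted, differentiable interpolation of vertex attributes — can be sketched in a few lines. Below is a minimal illustration in plain PyTorch; the triangle, toy loss, and variable names are ours, not DIB-R’s actual code:

```python
import torch

def barycentric_weights(p, v0, v1, v2):
    # Signed sub-triangle areas give barycentric coordinates,
    # computed entirely with differentiable ops.
    def edge(a, b, c):
        return (c[0] - a[0]) * (b[1] - a[1]) - (c[1] - a[1]) * (b[0] - a[0])
    area = edge(v0, v1, v2)
    return (edge(v1, v2, p) / area,
            edge(v2, v0, p) / area,
            edge(v0, v1, p) / area)

# A single screen-space triangle with per-vertex colors.
verts = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], requires_grad=True)
colors = torch.tensor([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]], requires_grad=True)

# A covered pixel's color is a weighted interpolation of vertex attributes.
pixel = torch.tensor([0.25, 0.25])
w0, w1, w2 = barycentric_weights(pixel, verts[0], verts[1], verts[2])
pixel_color = w0 * colors[0] + w1 * colors[1] + w2 * colors[2]

# A toy image loss then yields analytic gradients for vertex
# positions and colors alike.
loss = ((pixel_color - torch.tensor([0.5, 0.5, 0.0])) ** 2).sum()
loss.backward()
print(verts.grad)
print(colors.grad)
```

For background pixels, which no face covers, DIB-R instead aggregates distances to the global geometry so that those pixels also receive gradients.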
Virtual robots that teach themselves kung fu could revolutionize video games
In the not-so-distant future, characters might practice kung-fu kicks in a digital dojo before bringing their moves into the latest video game.
AI researchers at UC Berkeley and the University of British Columbia have created virtual characters capable of imitating the way a person performs martial arts, parkour, and acrobatics, practicing moves relentlessly until they get them just right.
The work could transform the way video games and movies are made. Read More
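The teaser doesn’t go into the method, but research of this kind typically scores a physically simulated character against reference motion-capture frames and trains a policy with reinforcement learning. A rough sketch of such an imitation reward follows; the exponential form, scale factor, and pose dimensionality are our assumptions, not the researchers’ actual code:

```python
import numpy as np

# Hypothetical imitation reward for motion-imitation RL: the simulated
# character is rewarded for matching a reference mocap frame.
def imitation_reward(sim_pose: np.ndarray, ref_pose: np.ndarray,
                     scale: float = 2.0) -> float:
    err = np.sum((sim_pose - ref_pose) ** 2)  # squared joint-angle error
    return float(np.exp(-scale * err))        # 1.0 at a perfect match

ref = np.zeros(34)                       # reference joint angles, one frame
sim = ref + 0.05 * np.random.randn(34)   # slightly-off simulated pose
print(imitation_reward(sim, ref))        # close to 1.0
```

A policy maximizing this reward over thousands of simulated attempts is what “practicing moves relentlessly” amounts to in training terms.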
Roy Orbison and Buddy Holly Hologram
How Avengers: Endgame’s Visual Effects Were Made | WIRED
Could Artificial Intelligence Spell the End of Independent Filmmaking?
A new kind of AI technology can identify elements that might make a film perform better at the box office. But as creator Sami Arpa explains, the creative process is still key to good movies. Read More
Language2Pose: Natural Language Grounded Pose Forecasting
Generating animations from natural language sentences finds applications in a number of domains, such as movie script visualization, virtual human animation, and robot motion planning. These sentences can describe different kinds of actions, the speeds and directions of those actions, and possibly a target destination. The core modeling challenge in this language-to-pose application is how to map linguistic concepts to motion animations.
In this paper, we address this multimodal problem by introducing a neural architecture called Joint Language-to-Pose (or JL2P), which learns a joint embedding of language and pose. This joint embedding space is learned end-to-end using a curriculum learning approach which emphasizes shorter and easier sequences before moving to longer and harder ones. We evaluate our proposed model on a publicly available corpus of 3D pose data and human-annotated sentences. Both objective metrics and human judgment confirm that our approach generates more accurate animations, which humans also deem more visually representative than those from other models. Read More
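The abstract highlights two ingredients — conditioning pose generation on a language embedding, and a curriculum over sequence lengths — which the simplified sketch below illustrates in plain PyTorch. Layer sizes, vocabulary, dummy data, and the curriculum schedule are all illustrative assumptions; the actual JL2P model also embeds pose sequences into the same joint space and trains the whole pipeline end-to-end:

```python
import torch
import torch.nn as nn

class LanguageEncoder(nn.Module):
    """Encodes a tokenized sentence into a fixed-size embedding."""
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, tokens):
        _, h = self.gru(self.embed(tokens))
        return h[-1]                              # (batch, hidden_dim)

class PoseDecoder(nn.Module):
    """Predicts the next pose frame, conditioned on the sentence embedding."""
    def __init__(self, hidden_dim=128, pose_dim=63):
        super().__init__()
        self.gru = nn.GRU(pose_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, pose_dim)

    def forward(self, z, poses):
        out, _ = self.gru(poses, z.unsqueeze(0))  # embedding as initial state
        return self.out(out)

enc, dec = LanguageEncoder(), PoseDecoder()
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

# Curriculum: progressively longer pose sequences (dummy data throughout).
for seq_len in (4, 8, 16, 32):
    tokens = torch.randint(0, 1000, (8, 10))   # batch of token ids
    poses = torch.randn(8, seq_len, 63)        # batch of pose sequences
    pred = dec(enc(tokens), poses[:, :-1])     # teacher-forced prediction
    loss = nn.functional.mse_loss(pred, poses[:, 1:])
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"seq_len={seq_len}  loss={loss.item():.4f}")
```

Ordering training from short, easy sequences to long, hard ones is the curriculum learning idea the abstract credits for more accurate animations.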