Machine Learning: Transforming the Film and TV Industries
By Richard Kerris
Issue: April-May-June 2021


The rise of machine learning and AI is taking graphics and creativity to a new level, helping transform an industry that relies on convincing audiences that what they see on screen is as real, or as close to real, as what they see with their own eyes.

Content creators and studios are using machine learning and AI to complete repetitive tasks faster, freeing time for creative iteration to perfect the finished product. They can also shorten the journey from set to screen and increase output for in-home entertainment subscription services.

Instead of coding software with specific instructions to complete a task, a machine is "trained" on large amounts of data, learning how to perform the task itself. With machine learning, patterns can be found in any kind of data, from images to numbers. Relieving pressure on the technical workflow frees actors to concentrate on their performances and editors to focus on the story.
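To make the distinction concrete, here is a minimal sketch (an illustrative toy, not a studio tool, assuming NumPy and scikit-learn are available): rather than writing an explicit rule, we hand a model labeled examples and let it discover the pattern on its own.

```python
# Toy example: instead of hand-coding a rule, let a model learn it from data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic "data": 2D points labeled by a rule we never show the model
# (inside vs. outside a circle of radius 1).
X = rng.uniform(-2, 2, size=(1000, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0).astype(int)

# Train: the model discovers the pattern from examples alone.
model = RandomForestClassifier(n_estimators=50).fit(X[:800], y[:800])

# The learned model generalizes to points it has never seen.
accuracy = model.score(X[800:], y[800:])
print(f"held-out accuracy: {accuracy:.2f}")
```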

The Fountain of Youth and More

De-aging is just the tip of what machine learning can do. With AI-powered denoising, creators can remove noise from their renders and produce clean, interactive images in real time.

Users get immediate visual feedback, so they can see and interact with their latest designs and even experiment with different elements such as materials, textures, lights, and shadows before finalizing a scene.
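The sketch below shows the general shape of this step: one network pass per frame over a noisy render. The tiny model and its untrained weights are placeholders standing in for a real pretrained denoiser (such as Nvidia's OptiX AI denoiser), and production denoisers typically also take auxiliary buffers like albedo and normals, which are omitted here.

```python
# Sketch: applying a convolutional denoiser to a noisy path-traced frame.
# The tiny network below stands in for a real pretrained denoiser; its
# weights here are random placeholders, so the output is illustrative only.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        # Predict the noise residual and subtract it from the input.
        return x - self.net(x)

denoiser = TinyDenoiser().eval()

# A stand-in for a low-sample-count render: image plus Monte Carlo noise.
clean = torch.rand(1, 3, 256, 256)
noisy = clean + 0.1 * torch.randn_like(clean)

with torch.no_grad():
    denoised = denoiser(noisy)  # one pass per frame keeps it interactive
print(denoised.shape)  # torch.Size([1, 3, 256, 256])
```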

Creators can also remaster content using AI-enhanced solutions. Integrating GPU-accelerated machine learning into super-resolution techniques allows studios to achieve higher visual fidelity and increased productivity.

Nvidia Deep Learning Super Sampling (DLSS) opens doors for real-time rendering in virtual production. With DLSS, studios and artists can turn low-resolution rendered images into high-resolution ones, accelerating workflows in animation and virtual production. By adding DLSS to creative pipelines, studios are also enhancing rendering performance and image quality with real-time ray tracing.
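DLSS itself is proprietary and integrated at the engine level (and also uses motion vectors and frame history), but its core idea, inferring a high-resolution frame from a low-resolution render, can be sketched with a standard sub-pixel-convolution super-resolution network. The architecture below is ESPCN-style with untrained placeholder weights, purely to show the data flow.

```python
# Sketch of the idea behind DLSS-style upscaling: a network infers a
# high-resolution frame from a low-resolution render. This is only the
# super-resolution core, not Nvidia's actual model.
import torch
import torch.nn as nn

class SuperRes(nn.Module):
    def __init__(self, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 3 * scale * scale, 3, padding=1),
        )
        # PixelShuffle rearranges channels into a higher-resolution grid.
        self.upscale = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.upscale(self.body(x))

model = SuperRes(scale=2).eval()
low_res = torch.rand(1, 3, 540, 960)    # render at 960x540...
with torch.no_grad():
    high_res = model(low_res)           # ...infer a 1920x1080 frame
print(high_res.shape)  # torch.Size([1, 3, 1080, 1920])
```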

AI-powered VFX helped ILM de-age Robert De Niro in The Irishman.

Strike a Pose

Pose estimation technology is making motion capture much easier. Previously, studios needed multiple cameras, sensor-laden bodysuits, and careful calibration to capture an actor's body movements and produce 3D re-creations. With machine learning and AI, pose estimation technology detects and matches a person's movements from a single video feed, without the extra bodysuits and cameras.
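As a concrete illustration, the sketch below runs a pretrained 2D keypoint detector (torchvision's Keypoint R-CNN, which predicts the 17 COCO body joints) on a single frame. A full markerless motion capture pipeline would add a 2D-to-3D lifting model and temporal smoothing; those steps are omitted here.

```python
# Sketch: 2D human pose estimation from a single video frame using a
# pretrained detector (torchvision's Keypoint R-CNN, 17 COCO keypoints).
import torch
from torchvision.models.detection import keypointrcnn_resnet50_fpn

model = keypointrcnn_resnet50_fpn(weights="DEFAULT").eval()

# Stand-in for one decoded video frame: RGB tensor, values in [0, 1].
frame = torch.rand(3, 720, 1280)

with torch.no_grad():
    detections = model([frame])[0]

# One (x, y, visibility) triple per joint, per detected person.
for person_keypoints in detections["keypoints"]:
    print(person_keypoints.shape)  # torch.Size([17, 3])
```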

Pose estimation software has led to breakthroughs like Omniverse Audio2Face, an AI-powered application that generates facial animations from audio sources. Based on Nvidia research, Audio2Face takes an animation of a 3D character and matches it to any voice-over track. It works by feeding the audio to a deep neural network, whose output drives the facial animation in real time. Breakthrough technologies like Audio2Face are paving the way for new forms of interaction between humans and computers, with more realistic conversations via digital avatars.
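Audio2Face's actual network is Nvidia research and not reproduced here, but the general shape of such an audio-to-animation model can be sketched: a short window of audio features goes in, and a vector of facial-animation parameters (for example, blendshape weights) comes out, once per frame. All sizes and layers below are illustrative assumptions.

```python
# Sketch of the audio-to-animation idea behind tools like Audio2Face:
# a network maps a window of audio features to facial-animation
# parameters (here, 52 blendshape weights) once per video frame.
# The architecture and weights are illustrative, not Nvidia's model.
import torch
import torch.nn as nn

N_MELS, WINDOW, N_BLENDSHAPES = 80, 16, 52

class AudioToFace(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                      # window of mel frames -> vector
            nn.Linear(N_MELS * WINDOW, 256), nn.ReLU(),
            nn.Linear(256, N_BLENDSHAPES),
            nn.Sigmoid(),                      # blendshape weights in [0, 1]
        )

    def forward(self, mel_window):
        return self.net(mel_window)

model = AudioToFace().eval()

# Stand-in for one frame's worth of voice-over audio features.
mel_window = torch.rand(1, N_MELS, WINDOW)
with torch.no_grad():
    weights = model(mel_window)    # drives the 3D character's face rig
print(weights.shape)  # torch.Size([1, 52])
```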

The Cutting Edge is Going Mainstream

Today, more studios are accelerating creative workflows by using AI to enhance storytelling. AI-powered visual effects have created villains like Thanos in Avengers: Endgame and de-aged the actors in The Irishman.

Machine learning makes the creation of digital humans possible. Award-winning VFX studio Digital Domain has worked on characters and visual effects for blockbuster movies, but it is also known for DigiDoug, the first digital human to give a TED Talk in real time. Digital Domain's digital human technology is driven by inertial motion capture and a single-camera capture for facial animation; machine learning is then employed to capture and showcase emotions in real time.

The team starts the process by taking thousands of images of a person's face from different angles and under different lighting to capture as much data as possible. Deep neural networks then put the pieces of this data puzzle together, outputting a virtual human that acts like a real person.

Machine learning is also beginning to play a part in new features from streaming services. For example, these services use AI to power recommendation engines that serve audiences personalized content based on their viewing history. Technology like Nvidia Jarvis, an application framework for conversational AI services, could even let viewers simply ask the service for a show they might like. Companies also use AI to optimize streaming quality at lower bandwidths, using Nvidia GPUs to send only what changes from frame to frame rather than the entire scene with every packet.
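At its simplest, a recommendation engine of this kind compares viewing histories and suggests what similar viewers watched. The toy collaborative-filtering sketch below shows that core idea; production systems are vastly larger and layer in deep models and content metadata.

```python
# Sketch: a tiny collaborative-filtering recommender. Rows are users,
# columns are shows; a 1 means the user watched the show. We recommend
# the unwatched show most popular among the most similar viewers.
import numpy as np

shows = ["Drama A", "Comedy B", "Sci-Fi C", "Doc D"]
history = np.array([
    [1, 0, 1, 0],   # user 0
    [1, 1, 1, 0],   # user 1
    [0, 1, 0, 1],   # user 2
], dtype=float)

def recommend(user, k=2):
    # Cosine similarity between this user's history and everyone else's.
    norms = np.linalg.norm(history, axis=1) * np.linalg.norm(history[user])
    sims = history @ history[user] / np.maximum(norms, 1e-9)
    sims[user] = -1.0                       # ignore self
    neighbors = np.argsort(sims)[-k:]       # k most similar viewers
    scores = history[neighbors].sum(axis=0)
    scores[history[user] > 0] = -1.0        # ignore already-watched shows
    return shows[int(np.argmax(scores))]

print(recommend(0))  # "Comedy B": favored by the most similar viewer
```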

In creative workflows and production pipelines, more applications and tools are incorporating AI features and capabilities. Blackmagic Design's DaVinci Resolve includes the DaVinci Neural Engine, which uses GPU-accelerated machine learning to make video editing and color grading more powerful and easier to use.

Autodesk Flame features machine learning algorithms that help artists extract and generate maps from 2D footage and accelerate visual effects and compositing workflows. Other leading tools, such as Adobe Sensei, Autodesk Arnold, Chaos V-Ray, Substance Alchemist, and Notch, provide content creators with AI-powered features like de-lighting captured materials or denoising for ray tracing and rendering.

Seeing is Believing

In television and film, the Fountain of Youth flows through analyzing big data. The Irishman's actors, in their late 70s and early 80s, didn't need another set of actors to play their younger selves. VFX studio Industrial Light & Magic (ILM) developed software called ILM Facefinder that used AI to sift through thousands of images from the actors' previous movie performances.

In HBO's series The Righteous Gemstones, AI-powered de-aging effects made lead actor John Goodman look years younger. VFX studio Gradient Effects used custom software called Shapeshifter, which uses AI to analyze facial motion. With Nvidia GPUs, the VFX team transformed Goodman's appearance in a process that took weeks instead of months.

Digital Domain's VFX team used machine learning to map actor Josh Brolin's performance onto the digital version of Thanos, the infamous villain of Avengers: Endgame. The studio developed Masquerade, a machine learning system, to capture low-resolution scans of Brolin's facial movements and transfer his expressions onto the high-resolution mesh of Thanos' face, saving artists from having to manually animate facial movements to create a realistic digital human.
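The low-to-high-resolution transfer at the heart of a system like this can be sketched as a learned mapping from a sparse facial capture to dense vertex offsets on a character mesh. The sketch below is not Digital Domain's actual system; the marker count, mesh size, architecture, and weights are all illustrative placeholders.

```python
# Sketch of the low-to-high-resolution transfer idea behind a system
# like Masquerade: a network maps a sparse facial capture (here, 150
# tracked points) to dense vertex offsets on a high-resolution mesh.
# Sizes, architecture, and weights are illustrative placeholders.
import torch
import torch.nn as nn

N_MARKERS, N_VERTICES = 150, 40_000

model = nn.Sequential(
    nn.Linear(N_MARKERS * 3, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, N_VERTICES * 3),   # xyz offset per mesh vertex
).eval()

# One frame of the actor's low-resolution facial capture.
markers = torch.rand(1, N_MARKERS * 3)

with torch.no_grad():
    offsets = model(markers).view(1, N_VERTICES, 3)

# Adding these offsets to the character's neutral mesh poses its face
# to match the actor's expression for this frame.
print(offsets.shape)  # torch.Size([1, 40000, 3])
```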

As technologies continue to evolve, machine learning is making its mark in film and television. AI plays a key role in enhancing creative workflows, from streamlining design processes to accelerating film productions. The power of machine learning is becoming more widely accessible through new applications and software, and more studios will integrate machine learning to create dazzling visual effects and graphics for audiences to enjoy.

Richard Kerris is general manager for Media & Entertainment and the Omniverse platform, and head of developer relations at Nvidia.