Thoughts in Motion
Volume 28, Issue 8 (August 2005)

At the very least, a character must not break whatever illusion has allowed the player to become immersed in the gaming experience. This might seem like a low bar to set, but when you consider how rare it is that a synthetic character meets even this low bar, it’s actually a good place to start.

Rendering, by which I mean everything that has to do with the visual appearance of a character, including the model, textures, shaders, and so on, is becoming less and less of an issue. It wouldn’t be too far-fetched to claim that in the feature-film effects world, the rendering problem has been pretty much solved, assuming you’ve got the talent to run the software and the CPU horsepower to back it all up.

In the video-game world, the rendering problem is also close to being solved with the new round of game consoles, and it will probably catch up with the feature-film world in another five to six years, along with the subsequent generation of hardware. So rendering isn’t really the problem. What hasn’t been solved is animation: how the model moves, and the subtle details of that movement.

In Electronic Arts’ recent game project, The Lord of the Rings: The Third Age, the artists show that simpler animations can be more dramatic and have a larger impact.

One way to animate your characters is with motion capture. At the current time, there is one big technical problem and one big artistic problem associated with this technology.

The technical problem is that you can’t get all the data at the same time. Full-body motion-capture systems don’t quite have the resolution to capture the subtleties required to make faces work. Most facial motion-capture systems can’t be attached to the actor during full-body capture without impeding the full-body performance. Some of the facial-capture systems don’t capture the eyes, and if the system can’t be attached during the full-body performance, how could the eyes look in the right place anyway? This problem will go away fairly soon, but for now it remains.

The artistic problem involves the actors: If your human performers don’t do the right thing, your captured data won’t be any good. That’s why real athletes have to be used to capture animation for sports games. Conversely, if you want to capture dramatic storytelling, you need real actors, a real script, real blocking, a real director, and lots of rehearsals.

In our recent game project, The Lord of the Rings: The Third Age, the best motion-captured storytelling sequences occur in the simplest scenes. Probably the best one involves a few characters sitting around a campfire, talking. It works best because the actors could focus on their craft and emotional state, and because the dialogue and blocking are good; in other words, we did the art correctly. Then, because the movements and blocking are simple, we got all the data (full body, face, and eyes) to line up right; we successfully worked around the technical problems. I might add that all of this was possible because the cast and crew rehearsed like crazy.

You could replace all the techniques I’ve just recommended with some really good animators and a great animation director. Whether that’s the right solution for your project, as opposed to the actor/director/motion-capture route, depends on whom you have working for you, and on the artistic style of your project.

While the techniques discussed earlier work well if you know in advance exactly which characters will interact in what ways, they obviously fall short if any significant variables are introduced, such as those occurring within the interactive environments of games.

Someday, in order to meet this challenge, we’ll be able to synthesize human motions down to the smallest muscle movement. But that day is not today. Right now, the most practical solution is to divide movements into four categories: facial expressions, lip sync, head and eye tracking, and full-body gestures.

Facial expressions and lip sync have probably been around the longest. There are lots of different libraries full of happy, sad, angry, and other facial expressions, and morphing between these works fairly well. Lip sync is also reasonably well understood in terms of different mouth shapes for phonemes or other, similar systems. Both types of animation are significant simplifications of reality, but they work pretty well if implemented carefully and by talented artists and engineers.

Head and eye tracking are making some big strides forward, as well. Algorithms have been developed here at Electronic Arts that take into account the emotional state of the character being animated, as well as the emotional relationship between that character and other entities in the scene (characters and objects). These algorithms move the head and eyes in a coordinated manner and have proved successful in giving the player the illusion that the character has an emotional state, that is, believable thoughts.
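To make the coordination idea concrete, here is a hypothetical sketch (not EA's actual algorithm): the turn toward a gaze target is split between the eyes and the head, with the head carrying a larger share as the character's emotional "interest" in the target grows, and eye rotation clamped to a plausible anatomical limit.

```python
import math

def gaze_split(target_angle, interest, max_eye_angle=25.0):
    """Split a target yaw angle (degrees) between head and eyes.

    interest in [0, 1]: a bored character barely turns its head
    (the eyes do the work); an engaged character turns its head more.
    Returns (head_angle, eye_angle); the two always sum to target_angle.
    """
    head_share = 0.2 + 0.6 * interest  # head carries 20-80% of the turn
    head = target_angle * head_share
    eyes = target_angle - head
    # Clamp the eyes to an anatomical limit; overflow goes to the head.
    if abs(eyes) > max_eye_angle:
        overflow = eyes - math.copysign(max_eye_angle, eyes)
        eyes -= overflow
        head += overflow
    return head, eyes

# A disinterested glance vs. an engaged look at a target 40 degrees away:
print(gaze_split(40.0, interest=0.1))
print(gaze_split(40.0, interest=0.9))
```

In practice the same split would be animated over time (the eyes snap first, the head eases in behind them), but even this static version conveys the point: the player reads the head/eye ratio as a statement about what the character cares about.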

We are also experimenting with something like facial expressions for full-body gestures. Using information from psychology, as well as executive and sales training literature, we’re assembling animation building blocks for using body language to communicate an AI character’s state of mind to the player.
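One simple way to organize such building blocks is a lookup from state of mind to a ranked list of gesture clips, playing more of them as emotional intensity rises. The gesture names and mappings below are invented for illustration; this is a sketch of the organizing idea, not EA's system.

```python
# Hypothetical gesture library: state of mind -> ranked gesture blocks,
# strongest "tells" first. All names are invented for this example.
GESTURE_LIBRARY = {
    "confident": ["open_stance", "steady_gaze", "slow_nod"],
    "anxious":   ["crossed_arms", "weight_shift", "glance_away"],
    "hostile":   ["lean_forward", "narrowed_eyes", "clenched_fists"],
}

def pick_gestures(state_of_mind, intensity):
    """Return gesture blocks to layer onto the base animation.

    intensity in [0, 1]: higher intensity plays more of the
    ranked blocks for that state of mind.
    """
    blocks = GESTURE_LIBRARY.get(state_of_mind, [])
    if not blocks:
        return []
    count = max(1, round(intensity * len(blocks)))
    return blocks[:count]

# A moderately anxious character shows its two strongest tells:
print(pick_gestures("anxious", 0.7))
```

The appeal of this structure is that the AI only has to publish a coarse emotional state; the animation layer translates it into readable body language.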

In summary, we can create much more believable AI-driven characters in years to come, but we also need to concentrate on what can be done to make believable characters now. There is a lot of work yet to be done in order to make them “think” and have any sense of true intelligence, but even without that, there is much we can do now, with carefully planned and rehearsed motion capture, animation, etc., to make far more believable synthetic characters. The key is to concentrate on the details.

Steve Gray
is currently executive producer for The Lord of the Rings at Electronic Arts’ Redwood Shores Studio.