Designed to Please: Ananova
By Jenny Donelan
Not one of the current crop of digital newscasters is a stand-in for the likes of Katie Couric, and, with one partial exception, they don't try to be. Their creators are hoping these cyberanchors will prove so appealing and convenient in their own right that viewers will adopt them into their everyday lives, much as they do live newscasters.
Ananova, the most human of the three virtual news efforts we look at here, is a cybercelebrity who delivers continuously updated news on the Web. The cast of virtual characters from 1KTV (some humanoid, some not) offers Web-based news on demand; and Klik Animation's CG characters produce snappy commentary within the framework of a daily TV news show. The technology behind each of these projects is as different as the end result: Ananova is a 3D model that incorporates morph blending. The 1KTV characters start out as 3D models and become sprite stacks. And the characters of "Le JourNul de Francois Perusse" come to life mostly through motion capture.
Is there a digital Dan Rather in your future? Read on and see.
She was designed from the start to deliver news over the Web, but digital anchorperson Ananova has found herself, at least briefly, in the more traditional role of television newscaster. In the flurry of publicity surrounding her launch last spring, television appearances by the attractive cybercaster were so much in demand that for a while her developers were spending more time preparing her for TV than they were fine-tuning her Web broadcasts. Shortly after Ananova's launch, Jonathan Jowitt, technical project manager for Ananova Ltd., remarked, "Many of the engineering requests I've made in the past few days have been to deal with the sheer amount of product we're committing to tape and shipping out to TV companies."
Whether Ananova will retain her star status on TV, or even on the Web (www.ananova.com), where she delivers the news 24 hours a day, seven days a week, only time will tell, but her credentials are impressive. A little more than a year ago, Ananova Ltd. (formerly PA New Media, the new media arm of the UK news agency, the Press Association), was searching for a digital character to deliver the news provided by the company's recently developed real-time news and information computer system. Digital Animations Group, a studio based in Bellshill, Scotland, was chosen to model and animate the character because it had already been testing techniques for semi-automated facial animation.
Together, the teams at Ananova Ltd. and Digital Animations formulated the character and appearance of the 28-year-old, green-haired Ananova. Much research went into making her an appealing and, in some ways, believable character. Her face is a composite of features borrowed from photographs of hundreds of real people. "We also spent a lot of time putting together a full character profile on her, because we thought that would help us understand how to animate her," says Laurie McCulloch, development director for Digital Animations Group. "This is all ongoing. There is an extensive bible that dictates who Ananova is, what she looks like, how she reacts."
At her core, Ananova is a LightWave (NewTek; San Antonio, TX) model created with a relatively low-resolution subdivision-surfaces mesh. "She's easy to animate; there aren't millions of vertices flying about," says McCulloch. "The structure of the mesh is carefully controlled to follow the curvature of the face so that we can control the different muscle groups."
Besides her base model, Ananova has a set of morph positions, also built in LightWave. For each muscle in the face, the animators produce a morph at opposite extremes of movement. One example might be the left eyebrow muscle at full contraction vs. the same muscle at full expansion. "We have an unlimited range of facial movement," says McCulloch, "because we have all the facial muscles in their minimum and maximum positions, and infinite blending capacity between them." Proprietary morph-blending software takes care of moving the muscles in Ananova's face from one expression to the next. The animators then put together different expression sets and save them into a library that's passed to the system's real-time renderer. Texturing is done in LightWave as well.
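The blending McCulloch describes can be pictured as interpolation between the two stored extremes of each muscle. The sketch below is only an illustration, not Digital Animations' proprietary software: it assumes simple linear interpolation, and the `blend_muscle` function and the vertex coordinates are invented for the example.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def blend_muscle(min_pose: List[Vertex], max_pose: List[Vertex], weight: float) -> List[Vertex]:
    """Linearly interpolate each vertex between a muscle's two extreme morphs.

    weight = 0.0 returns the minimum pose, 1.0 the maximum pose.
    """
    if len(min_pose) != len(max_pose):
        raise ValueError("both morph targets must have the same vertex count")
    w = max(0.0, min(1.0, weight))  # clamp to the stored range of movement
    return [
        tuple(a + (b - a) * w for a, b in zip(v_min, v_max))
        for v_min, v_max in zip(min_pose, max_pose)
    ]


# Hypothetical two-vertex patch for a left eyebrow muscle.
brow_relaxed = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
brow_raised = [(0.0, 2.0, 0.0), (1.0, 3.0, 0.0)]

half_raised = blend_muscle(brow_relaxed, brow_raised, 0.5)
```

An expression in this picture is just a set of weights, one per muscle, which is one plausible reading of how a library of saved expression sets could stay small.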
Digital Animations also developed the real-time side of Ananova, in which the news stories (text marked up with XML tags to denote her emotions, as well as elements such as lighting and camera angles) are fed through a server to produce the speaking, blinking, smiling, or serious Ananova. A speech synthesis module from Lernout & Hauspie (Ieper, Belgium) generates the speech from the text, and that speech is analyzed by a proprietary software engine that generates lip syncing and facial animation. "Basically what we have at the end of the day," says McCulloch, "is a piece of software that's running on the servers at Ananova Ltd., which are getting fed text every time a news story breaks. These stories are wired straight to Web servers, which are able to serve video of Ananova 24 hours a day."
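To make the idea of emotion-tagged text concrete, here is a rough sketch of how a marked-up story might be read into per-sentence cues. The element and attribute names (`story`, `s`, `emotion`, `camera`, `lighting`) are assumptions made up for this example; the article does not describe Ananova's actual tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element and attribute names are invented for illustration.
STORY_XML = """
<story emotion="serious" camera="close-up" lighting="cool">
  <s>The Dow Jones has fallen.</s>
  <s emotion="smile">Analysts expect a recovery tomorrow.</s>
</story>
"""


def parse_story(xml_text: str) -> list:
    """Return one cue per sentence: text plus the emotion that applies to it."""
    root = ET.fromstring(xml_text.strip())
    default_emotion = root.get("emotion", "neutral")
    cues = []
    for sentence in root.findall("s"):
        cues.append({
            "text": (sentence.text or "").strip(),
            # A sentence-level tag overrides the story-wide emotion.
            "emotion": sentence.get("emotion", default_emotion),
            "camera": root.get("camera", "default"),
        })
    return cues


cues = parse_story(STORY_XML)
```

Each cue could then be handed to the speech and animation stages, with the story-wide emotion acting as a default that individual sentences override.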
At Ananova Ltd., a group of very busy editors marks up articles taken from sources including the newswires. They validate and shorten the stories, but also rewrite them so they can be spoken aloud. "What's written on the newswire is a strange journalistic English," explains Ananova Ltd.'s Jowitt. At the same time, the editor tags the general emotion of the story, and also particular areas that require more facial emotion. "The engine has a notion of some words to stress," explains Jowitt. "Proper names, strong negatives. If she says, 'The Dow Jones has fallen,' she will say 'fallen' a little stronger than the other words." The software engine also ensures that Ananova takes a breath every second or third sentence and blinks at random, as a human would.
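The two rules Jowitt mentions, stressing certain words and breathing every second or third sentence, are simple enough to sketch. What follows is a guess at the general shape, not the engine itself: the word list is hypothetical, and treating any capitalized word after the first as a proper name is a deliberately crude stand-in.

```python
import random

# Hypothetical word list; the engine's real rules are not described in detail.
STRONG_NEGATIVES = {"fallen", "never", "no", "not", "failed"}


def stressed_words(sentence: str) -> list:
    """Pick words to stress: strong negatives, plus capitalized words
    after the first word (a crude stand-in for proper-name detection)."""
    words = sentence.rstrip(".!?").split()
    picked = []
    for index, word in enumerate(words):
        is_name = index > 0 and word[0].isupper()
        if is_name or word.lower() in STRONG_NEGATIVES:
            picked.append(word)
    return picked


def add_breaths(sentences: list, rng: random.Random) -> list:
    """Insert a breath marker after every second or third sentence."""
    output, until_breath = [], rng.choice([2, 3])
    for sentence in sentences:
        output.append(sentence)
        until_breath -= 1
        if until_breath == 0:
            output.append("<breath>")
            until_breath = rng.choice([2, 3])
    return output


stress = stressed_words("The Dow Jones has fallen.")
script = add_breaths(["One.", "Two.", "Three.", "Four.", "Five.", "Six."], random.Random(0))
```

For Jowitt's example sentence, this sketch picks out "Dow," "Jones," and "fallen." Random blinking is left out here, but it could be scheduled the same way as the breaths.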
The editor can then immediately send the story live. Previewing Ananova's delivery, in the case of tragic news or other sensitive material, is always an option, says Jowitt, as her script can be rendered for preview locally in about 15 seconds. Editors work on scripts located in a server repository that notes when a bulletin is ready for rendering and sends it to whichever Pentium III-based rendering machine happens to be free. "It's kind of like a soup kitchen, where each machine becomes an autonomous device, and its task is to render the whole bulletin and do its utmost to get it out to the 'net." The downside of such efficiency is that in a busy news time, some bulletins live on the Web for such brief intervals that they may not even be seen.
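The hand-off from the script repository to "whichever machine happens to be free" resembles a basic job dispatcher. The sketch below assumes a round-robin simplification, since real render times are not given in the article; the function and machine names are hypothetical.

```python
from collections import deque


def dispatch(ready_bulletins: list, machines: list) -> list:
    """Assign each ready bulletin to the next free machine, in order.

    Simplification: a machine is treated as free again once every other
    machine has taken a job (round-robin), since render times are unknown here.
    """
    if not machines:
        raise ValueError("at least one rendering machine is required")
    free = deque(machines)
    assignments = []
    for bulletin in ready_bulletins:
        machine = free.popleft()      # whichever machine is free next
        assignments.append((bulletin, machine))
        free.append(machine)          # it rejoins the pool after rendering
    return assignments


jobs = dispatch(["markets", "weather", "sport"], ["render-1", "render-2"])
```

Each machine renders a whole bulletin on its own, so adding capacity in this model is simply a matter of lengthening the machine list.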
|A full-body model and a simplified head model for creating the correct facial animation were parts of Ananova's development phase.|
Viewers on the Web run Ananova's scripts using the downloadable RealAudio player from RealNetworks.com (Seattle, WA). What they see is a small window framing her face as she delivers the news in an accent that's mostly American, but slightly British. "We did some tests where she had what we would call a BBC accent, and she just seemed a little aloof somehow," says Jowitt. This being the Web, Ananova's delivery can be somewhat stilted and hesitant. Watching her on television, or in local mode (which is how she ran during a recent keynote appearance with Bill Gates at WinHEC), "is a much more 'watchable' multimedia experience than we can currently stream out to people over the real network," says Jowitt.
But Ananova's creators hope that end users will grow increasingly comfortable with the newscaster, and with her ability to offer news whenever they feel like listening to it. "What we're finding," says Jowitt, "is that a lot of people are going to the site and popping her up when they've got the odd spare couple of minutes or when they're going through their mail or boiling the kettle." It remains to be seen how much people will warm to Ananova: how, for example, they will receive her as a deliverer of tragic news. But having tested her in this capacity, her creators are confident that she will be accepted.
The Ananova we see now represents the first stage for her and projects relating to her, say her creators. Ananova may be joined by additional digital personnel. And her character, appearance, and delivery will continue to evolve as well. Ananova Ltd. also hopes to see her deliver customized news such as sports or regional entertainment. And while, in light of all her recent appearances, Ananova might seem poised to move from the computer to the television screen, Digital Animations and Ananova Ltd. are envisioning both larger and smaller venues for her.
|A subdivision-surfaces mesh created in LightWave provides the underpinnings for Ananova's complex facial movements.|
Portable personal communications devices are one area in which she's already been tested. And requests from airports and railway stations may have Ananova or a similar character appearing on some extremely large screens, making announcements, for example. "Ananova was never envisioned to take off in this direction," says Jowitt. "She's morphing into something completely different."
Jenny Donelan is managing editor of Computer Graphics World.
News on Demand: 1KTV
By Barbara Robertson
OK, here's the scenario: You were caught in rush hour traffic, got home late, and missed the 6:00 evening news. You're tired. The chances you'll still be awake for the 11:00 news range from slim to none. What you want to do is order a pizza, collapse on the couch, and watch the evening news now, at 7:40. First, you want to watch sports highlights, then world news, and then a business wrap-up. As for tomorrow? Tomorrow's another day. Tomorrow, maybe you'll want to catch business news at 3:15 and then check world news when you get home at 6:49.
At least that's what the folks at 1KTV hope you want to do, because on their "stations," animated CG newscasters can broadcast headline news, weather, and sports via the Web, when you want, with a click of your mouse button. "We're giving people the passive experience of watching television in an interactive form," says Kenn Raaf, manager of 1KTV and marketing manager for V-Star (Los Angeles), the technology company that launched 1KTV.
Raaf envisions 1KTV taking shape as a group of local and regional "stations," each with its own Web site. The company's first site, serving the Los Angeles area, went live in February. If you point your browser to www.1KTVLA.com, you can get business news from Doug Bonds, sports from Mark Fields, weather from Corkie McCloud, and headline news from Paula deAngeles, all humanoid animated characters with real-time lip sync. In addition, six "lifestyle characters," such as Daffy Giraffy for kids and Sir Charles for adults, offer lighter stories and suggestions for things to do in LA. And a few "silly" characters, as Raaf calls them, liven up the site with late-night TV-style comedy shows that feature material drawn from current events.
|Powering these lip-syncing digital news, weather, and sportscasters is technology from V-Star that uses scripts to specify vocal code, character animation, and sets with animated props.|
Behind the scenes, journalists write the stories using news sources such as UPI, AP, City News Service, and the local magazine LA Weekly; comedy writers create scripts for the "silly" characters; and a staff of artists puts the production together. Production time is about one hour per minute of "on-air" time. "We're producing 30 to 40 new stories a day and between 60 and 80 shows a week right now," says Raaf. The content changes at least once a day, with sports and financial news changing twice a day. Breaking news, which can take as little as 15 minutes to produce, is updated throughout the day. Rather than trying to provide complete stories, 1KTV serves as a filter that uses the animated characters to bring people headline news plus a few paragraphs, and then provides links to news sources and other coverage on the Web for those who want to learn more.
Here's how it works. The animated characters are created with Discreet's (Montreal) 3D Studio Max and then converted, using proprietary software, into 2D sprite stacks. The sprite stacks, props, sets, and other images, more than 100MB of data in all, are sent to "viewers" on a CD that also contains the 1KTV player. During a broadcast, these images are accessed from the CD in real time to create the animated newscasters and their settings. To create the computer-generated, real-time voices, 1KTV uses an enhanced version of Compaq's (Houston) DECtalk software.
When someone clicks on a character, a script is downloaded. The script, a text file containing all the commands to run a show, activates the player and specifies vocal code, character animation, sets with animated props, and presentation graphics, if any. The selected character pops into a full-screen, 512 x 384-pixel, television-style virtual set. Even so, because the script is a text file, it requires a data rate of only 1K per second to download (thus the 1KTV name). Since the images used for character animation, props, and so forth are on the viewer's CD, and don't have to be downloaded, there is almost no delay between click and news, even with standard dial-up services, according to the company.
|To create a broadcast script, computer-generated voice is specified using a phoneme-based system, and then emotions that control gestures and expressions are added.|
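A short sketch shows why a text-only script is so cheap to send. The command names and syntax below are invented for illustration, since the article does not show a real 1KTV script, and the estimate assumes that "1K per second" means 1,024 bytes per second.

```python
# Hypothetical command script; 1KTV's real script syntax is not shown in the article.
SCRIPT = """\
SET newsroom
CHARACTER paula
EMOTION serious
SAY Good evening. Here are tonight's headlines.
ACTION turn_head_left
"""


def parse_script(text: str) -> list:
    """Split each non-empty line into a (command, argument) pair."""
    commands = []
    for line in text.splitlines():
        if line.strip():
            command, _, argument = line.strip().partition(" ")
            commands.append((command, argument))
    return commands


def download_seconds(text: str, bytes_per_second: int = 1024) -> float:
    """Estimate transfer time, assuming '1K per second' means 1,024 bytes per second."""
    return len(text.encode("utf-8")) / bytes_per_second


commands = parse_script(SCRIPT)
seconds = download_seconds(SCRIPT)
```

The commands only name sets, characters, and poses; the images themselves already sit on the viewer's CD, so a few lines like these arrive in a fraction of a second under the stated assumption.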
"Because we're using television as a model, we wanted to give people full-screen images, not the postage stamps you usually get with video on the Internet," says Raaf, explaining their decision to use animated characters and computer-generated voices rather than video and real voices. He adds, though, that they can use video in a newscast, if they choose. For example: "We can have [the animated character] Paula doing a story, and while she's talking, a video can be loading that will play on [an image of] a television set on her [virtual] desk," he says.
To create the characters and the show, the production team starts with 3D characters created in Studio Max. To create the animation, the company hires actors to perform the characters. The actors' facial expressions are captured with a Vicon (Oxford, UK) optical system; body movement is captured with a Motion Analysis (Santa Rosa, CA) optical system. The motion-capture data then is applied to the 3D character. While the animated character moves on-screen, a custom plug-in for Max takes "snapshots" of individual animation frames and saves those poses as 2D images.
Ultimately, each V-Star character will have hundreds of poses: facial expressions to match particular phonemes or convey emotions, individual elements in a walk cycle, all the positions necessary for a character to point at something or turn its head, and so forth. These poses become organized as hierarchical sprite stacks and are stored in character libraries on the CD along with the sets, props, and the 1KTV player. Currently the 1KTV disc has 25 characters and more than 100 sets.
|To create a newscaster's scripted facial expressions, sliders in a 3D Studio Max plug-in manipulate mocap data.|
A 1KTV production starts with a script. The script goes to a vocal coder who, using proprietary software, converts the text into phonemes and creates a phoneme script. The phoneme script controls an ActiveX plug-in to Internet Explorer that generates the voices on the fly. (The voice technology is based on a superset of the Klatt Synthesizer, which was developed at MIT in the 1980s, licensed to the DEC division of Compaq, and updated at V-Star during the past five years.)
This phoneme script is sent to production artists, who link it to a character using a custom production software program that accesses the sprite stacks and provides automatic lip sync based on the phonemes. Once that's done, the artists drag and drop emotions into the script and specify actions such as where the characters look, sit, or point. This can be done on a frame-by-frame or second-by-second basis using simple icons and commands such as "serious" or "angry" or "turn head left" that call up particular sprite stacks. Finally, the production software creates the text-based command file that's posted to the Web site.
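Automatic lip sync from phonemes amounts to looking up the right mouth image for each sound. The sketch below assumes a hypothetical three-level library (character, then emotion, then phoneme) as one way to picture "hierarchical sprite stacks"; the file names and phoneme codes are made up, and V-Star's production software is not public.

```python
# Hypothetical sprite library: character -> emotion -> phoneme -> image name.
SPRITES = {
    "paula": {
        "neutral": {
            "AA": "paula_neutral_AA.png",
            "M": "paula_neutral_M.png",
            "rest": "paula_neutral_rest.png",
        },
        "serious": {
            "AA": "paula_serious_AA.png",
            "rest": "paula_serious_rest.png",
        },
    }
}


def lip_sync(character: str, emotion: str, phonemes: list) -> list:
    """Choose one mouth sprite per phoneme, falling back to the neutral
    stack and then to a resting mouth when a pose is missing."""
    stacks = SPRITES[character]
    chosen = []
    for phoneme in phonemes:
        for stack in (stacks.get(emotion, {}), stacks["neutral"]):
            if phoneme in stack:
                chosen.append(stack[phoneme])
                break
        else:
            # No stack holds this phoneme: show the resting mouth.
            chosen.append(stacks["neutral"]["rest"])
    return chosen


frames = lip_sync("paula", "serious", ["M", "AA", "ZH"])
```

Dragging an emotion such as "serious" into the script would, in this picture, simply change which stack is searched first.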
Even though the cartoon characters have so far proved more popular than the newscasters, people do respond to the animated humanoids, according to Raaf. "People are sending e-mail to these virtual characters," he says. "It's hilarious."
But then, why not? Ezra Shapiro, a professor of journalism at Cal State Northridge (Los Angeles), notes wryly: "The people who deliver television news are already close to cartoon figures. This software is taking us one small step forward, but why stop there? I'd rather see the evening news delivered by cartoon characters than humanoids. I'd love to see Daffy Duck covering Congress.
"I think that once you know it's an animation, it doesn't matter if it's a humanoid or not," he adds. "I'd rather have an animated paperclip reading the news than a stiff cartoon human."
During 1KTVLA's first two months, V-Star shipped 3500 CDs, and Shapiro has begun seeing the discs on campus. "We're working now to get more traffic," says Raaf. "When we get 10,000, we'll start advertising and launch our second site in San Francisco." Raaf hopes to have 10 regional sites in the next 18 months, and is looking for approximately 100,000 viewers for each site. This, of course, would allow advertisers to target specific audiences. "We can drop a 15-second spot into a show, and it would be unavoidable," Raaf says. "The user wouldn't have to click to see it." As with television, the only way the viewer could avoid the ad would be to leave the room, or the Web site.
V-Star is also licensing its tools to other Web sites interested in creating animated characters. And the company is creating animated versions of broadcast newscasters for a television station's Web site. For these virtual characters, V-Star has developed a system that sends full voice rather than computer-generated voice with an 8K data stream. "We still get automatic lip sync," Raaf says, "unless the download [rate] dips below 8K." If the download rate drops below 8K, the animated character begins speaking with a computer-generated voice to maintain the real-time lip syncing.
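The fallback Raaf describes is a threshold test on the connection. This minimal sketch assumes a hypothetical `choose_voice` function and made-up throughput samples, measured in the same "K" units the article uses.

```python
def choose_voice(measured_rate_k: float, required_rate_k: float = 8.0) -> str:
    """Use the recorded voice stream only while the connection sustains it."""
    return "recorded" if measured_rate_k >= required_rate_k else "synthesized"


# Hypothetical per-second throughput samples.
samples = [9.5, 8.0, 6.2, 8.4]
voices = [choose_voice(rate) for rate in samples]
```

A real player would probably smooth the measurements so the voice does not flip back and forth on every brief dip; that refinement is omitted here.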
It could be the start of a very strange future in which the familiar noon news is replaced by 'toon news delivered by animated characters. It's one thing to joke about cartoon characters covering politicians, but it's hard to imagine an animated character with a computer-generated voice reporting on the war in Bosnia.
"You might as well use a cartoon to deliver news that's choreographed to begin with, but I don't know anyone who would want to hear about tragedies from animated characters," Shapiro says. "If an airplane crashes, you don't want to hear about it from a cartoon. The human touch is necessary, and no amount of animation could bridge that gap. Of course, as we get used to animated characters, that may change, but now, when we turn to something for comfort, we want a warm-blooded animal."
Barbara Robertson is West Coast senior editor for Computer Graphics World.
Satiric CG Commentary: Le JourNul de Francois Perusse
By Karen Moltenbrey
A computer-generated segment shown on Canada's TVA evening news program is making headlines itself as the first animated series of its kind to be produced and aired on TV the same day, every day. Called "Le JourNul de Francois Perusse," the 60-second 3D spot, rendered with a 2D cel look, features a quirky digital anchorman and three virtual reporters who deliver satiric commentary about current affairs. Sometimes the characters appear in a 3D animated set, while at other times they are composited over a still image or a video clip.
"These reporters are where the action is, whether it's the sportscaster reporting from Canada's Molson Center or a reporter in Washington at the White House," says Christophe Goldberger, general manager of Montreal's Klik Animation studio, which produces the series.
|"Le JourNul" features reporter Tristan Direct, a 3D character rendered with a traditional cel look and animated with motion-capture tools.|
This innovative concept is television's equivalent of a daily newspaper comic strip such as "Doonesbury" that focuses on topical issues. The one-minute TV segment, though, requires about 1800 frames of animation, which must be written, scripted, and produced each weekday. When Klik Animation was approached by Zero Productions in Montreal, "we knew that no one at the time had ever produced a daily animated series on this scale, but we wanted to be the first," Goldberger says. In fact, 2D and 3D animation facilities alike had deemed the concept impractical.
Because of the project's scope and required turnaround time, Klik had ruled out using all traditional 2D cel or 3D keyframe animation. "We had been using motion capture for our other 3D character-animation projects, so we knew this technology would work for this series," says Goldberger.
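The scale of that daily workload follows from simple arithmetic on the article's own figures. The only inputs below are the numbers quoted above; the weekly total assumes five weekdays of production.

```python
FRAMES_PER_SEGMENT = 1800   # from the article
SEGMENT_SECONDS = 60        # "one-minute TV segment"
WEEKDAYS = 5                # produced "each weekday"

frames_per_second = FRAMES_PER_SEGMENT / SEGMENT_SECONDS
frames_per_week = FRAMES_PER_SEGMENT * WEEKDAYS

print(f"{frames_per_second:.0f} frames per second")
print(f"{frames_per_week} frames per week")
```

That works out to 30 frames for every second of air time and 9,000 frames a week, which helps explain why the studio leaned on motion capture instead of drawing or keyframing every pose.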
"Le JourNul" is a dialog-driven segment, which requires the spot's creator, French radio personality Francois Perusse, to produce a sound track of the spot's audio the evening prior to the broadcast, which is then delivered overnight to the studio. Early the following morning, the Klik animators capture the body movements of an actor, using the StarTrak wireless motion-capture system by Polhemus (Colchester, VT), as he acts out the performances for each character involved in that day's piece. The information is processed in Kaydara's (Montreal) Filmbox software, and the files are transferred into Discreet's (Montreal) 3D Studio Max, which the artists use to create and animate the character models and model the digital sets.
For each character's hands and face, the artists use standard keyframing because, in Goldberger's opinion, this method works better and offers more control for achieving facial and hand movement in a short turnaround time. Keyframing is also used to correct motion-captured body movement that lacks the desired precision.
|In the daily TV news segment "Le JourNul," anchorman Sébastien Tobin and arts reporter Paula Rideau are digital characters placed in CG or live-action backgrounds.|
One of the biggest challenges to using motion capture for this project, says Goldberger, is that the movement is so smooth that it doesn't always resemble the motion of traditional cel animation, which the group is trying to imitate. As a result, the actor whose motions are captured has had to master certain physical comedic characteristics of cel animation, such as exaggerated body movements.
The colorful 2D cartoon quality of the 3D characters and background imagery is accomplished during the rendering process, using David Gould's Illustrate program, distributed by Digimation (St. Rose, LA). Editing of the spots is done with Adobe Systems' (San Jose, CA) Premiere.
"We conducted a tremendous amount of research in developing the cartoon-shading process that enables us to use powerful 3D character animation, but at the same time, apply that to characters who, in this case, have a more simplified cel appearance," says Goldberger.
Not long ago, producing a daily topical show of this scope using a traditional animation style was unthinkable. Now it is part of a daily routine.
Karen Moltenbrey is an associate editor for Computer Graphics World.