DIANA PHILLIPS MAHONEY
When it comes to creating a simple, effective 3D modeling interface, kids might be the best teachers. A group of technology researchers from Mitsubishi Electric Research Laboratory (MERL), Georgia Institute of Technology, and the University of Virginia believes child's play may hold the key to simplifying 3D modeling. Put a pile of Lego blocks in front of a group of kids, the researchers contend, and they intuitively connect and stack the interlocking blocks to make simple models, then use their imaginations to fill in the details. Modeling clay holds similar promise. A nondescript lump will soon become anything from a dinosaur to a soaring jet.
In contrast, with today's 3D modeling systems, intuition buckles under the burden of having to precisely specify the geometric and material properties of the models being created.
A structure consisting of 560 blocks embedded with computational devices provides the input for its rendered counterpart. A new 3D modeling system uses location and connectivity information communicated from the blocks to produce a geometric description of the scene, and then augments the scene by enhancing recognized features.
In an effort to infuse childlike simplicity into sophisticated 3D modeling tasks, the researchers have developed a toy-like modeling system driven by the concept of tangible interaction. The system relies on a tactile interface, such as building blocks and clay, that users manipulate to construct a base model. Graphical-interpretation techniques recognize the model's salient features and augment the structure with geometric details or animation.
"By using suitable computer technology, the geometry of building-block and clay models can be captured automatically," says MERL researcher Joe Marks. "In this way, everyday construction toys can be turned into input devices for creating 3D geometry."
Such tangible interaction, however, is limited by the physical constraints of the construction media. "It's hard to create detailed or animated 3D models easily out of building blocks or clay," says Marks, "so tangible interaction on its own is good for building coarse geometric models, but is useless for more sophisticated 3D modeling." To facilitate the latter, the new system employs automated interpretation techniques to make sense of the geometry and determine how best to augment it. "The system may interpret a child's building-block model as a house or a castle, then enhance it with extra geometry and textures. Likewise, a clay model of a dog or a bird can be brought to life automatically," says Marks.
The researchers have demonstrated the feasibility of this concept using both embedded computation and computer vision. To realize the building-block example, they embedded computational devices into Lego-like blocks. The standard block has eight plugs on the top and eight jacks on the bottom. The plugs and jacks are fitted with two conductors each: one for power distribution and one for bidirectional signals.
When the blocks are connected, communication among them and with the host computer is achieved using asynchronous message passing. A geometry-determination algorithm computes the geometry of the fully assembled block structure and produces a geometric scene description that accounts for the presence and location of each block. The host computer then uses predetermined values associated with each described element to represent attributes such as shape, color, and texture.
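The article does not spell out the geometry-determination algorithm, but the basic idea of turning connectivity reports into block positions can be sketched under some simplifying assumptions. In the Python sketch below, the function names, the tuple format for connection reports, the 4 x 2 layout of the eight plugs and jacks, and the restriction to blocks that all face the same way are hypothetical choices made for illustration, not details of the MERL system.

```python
from collections import deque

STUDS_X = 4  # assumption: the eight plugs/jacks form a 4 x 2 grid


def stud_offset(index):
    """Map a plug or jack index (0-7) to an (x, y) offset within a block."""
    return index % STUDS_X, index // STUDS_X


def determine_geometry(connections, root):
    """Propagate positions outward from `root` across reported connections.

    `connections` holds tuples (upper, jack, lower, plug): jack `jack` on the
    bottom of block `upper` sits on plug `plug` on the top of block `lower`.
    Assumes every block has the same orientation (no rotation).
    Returns {block_id: (x, y, z)} in stud units, with `root` at the origin.
    """
    neighbors = {}
    for upper, jack, lower, plug in connections:
        jx, jy = stud_offset(jack)
        px, py = stud_offset(plug)
        delta = (px - jx, py - jy, 1)  # upper position minus lower position
        neighbors.setdefault(lower, []).append((upper, delta))
        neighbors.setdefault(upper, []).append((lower, tuple(-d for d in delta)))

    positions = {root: (0, 0, 0)}
    queue = deque([root])
    while queue:  # breadth-first walk over the connection graph
        block = queue.popleft()
        bx, by, bz = positions[block]
        for other, (dx, dy, dz) in neighbors.get(block, []):
            if other not in positions:
                positions[other] = (bx + dx, by + dy, bz + dz)
                queue.append(other)
    return positions


# Hypothetical reports: B sits on A, and C sits on B.
reports = [("B", 0, "A", 2), ("C", 5, "B", 0)]
layout = determine_geometry(reports, root="A")
```

A breadth-first walk is used here because each connection fixes one block's position relative to another, so a single known block is enough to place every block reachable from it.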
The model descriptions serve as the input to a rule-based program that identifies architectural features of the recognized block structure. For example, in a block structure interpreted as a building, the system uses simple pattern-matching to identify such components as the walls and roof. It will then enhance the visual appearance of the model by adding appropriate geometry and surface detail.
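A rule-based pass of this kind can be illustrated with a toy example. The two rules below (the topmost layer is the roof, and blocks on the outer footprint below it are walls) and the function name `label_features` are assumptions invented for this sketch; the article does not list the rules the researchers actually use.

```python
def label_features(positions):
    """Label each block 'roof', 'wall', or 'interior' using two toy rules.

    `positions` maps a block id to its (x, y, z) location.
    Rule 1 (assumed): blocks in the highest layer form the roof.
    Rule 2 (assumed): lower blocks on the outer footprint are walls.
    """
    top = max(z for _, _, z in positions.values())
    xs = [x for x, _, _ in positions.values()]
    ys = [y for _, y, _ in positions.values()]
    labels = {}
    for block, (x, y, z) in positions.items():
        if z == top:
            labels[block] = "roof"
        elif x in (min(xs), max(xs)) or y in (min(ys), max(ys)):
            labels[block] = "wall"
        else:
            labels[block] = "interior"
    return labels


# Hypothetical structure: a 3 x 3 base layer with one block on top.
positions = {f"b{x}{y}": (x, y, 0) for x in range(3) for y in range(3)}
positions["cap"] = (1, 1, 1)
labels = label_features(positions)
```

Once each block carries a label, an enhancement step could look up geometry and surface detail per label, for example a shingle texture for "roof" blocks.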
The modeling-clay application relies on external sensors and computer vision, rather than embedded computing. The system uses a motorized rotary table, a digital camera, a laser striper, and the host computer to scan, recognize, interpret, and animate 3D clay models. Users first model the clay into the desired form, then place it on the rotary table. As the table rotates, the digital camera captures a sequence of images from which the computer creates a volumetric scan. Complex models may be difficult to capture with the digital camera alone, so the laser striper can be used to refine the model.
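The article does not name the method used to build the volumetric scan. One common way to recover a volume from images taken around a turntable is silhouette carving, in which a voxel survives only if it projects inside the object's outline in every view. The sketch below assumes that technique, along with an idealized orthographic camera and ready-made silhouette masks; the function `carve` and its inputs are hypothetical, and the laser-striper refinement is not modeled.

```python
import math


def carve(silhouettes, size):
    """Keep voxels whose projection falls inside every silhouette.

    `silhouettes` maps a table angle in degrees to a size x size grid of
    booleans indexed [row][column], where row is height and column is
    horizontal image position. Assumes an orthographic camera and a table
    axis through the center of the voxel grid.
    Returns a set of (i, j, k) voxel indices.
    """
    center = (size - 1) / 2.0
    kept = set()
    for i in range(size):
        for j in range(size):
            x, y = i - center, j - center
            for k in range(size):
                inside = True
                for angle, image in silhouettes.items():
                    t = math.radians(angle)
                    # Horizontal image coordinate of this voxel in the view.
                    u = round(x * math.cos(t) + y * math.sin(t) + center)
                    if not (0 <= u < size and image[k][u]):
                        inside = False
                        break
                if inside:
                    kept.add((i, j, k))
    return kept


# Hypothetical input: the same vertical band seen from two views 90 degrees apart.
size = 5
band = [[1 <= u <= 3 for u in range(size)] for _ in range(size)]
voxels = carve({0: band, 90: band}, size)
```

Silhouettes alone cannot reveal concavities such as a hollow or a deep fold, which is one reason a second sensor like the laser striper would be useful for complex shapes.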
To recognize and interpret the clay model, the system compares a set of parameterized object templates to the scanned model. The templates are deformed to match the model, and the model is classified based on the degree of deformation. Next, an interpretation step parses the model into its constituent parts. For example, if the system recognizes the model as a biped, the deformation classification identifies the head, arms, legs, and torso based on their relation to the best-matched template. The biped can then be animated using traditional techniques.
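The recognize-then-parse sequence can be sketched with a deliberately crude stand-in for template deformation. Here each template is a handful of labeled 2D points, "deforming" it means moving each point to the nearest scanned point, and the degree of deformation is the total squared distance moved. The template library, the point data, and all function names are invented for illustration and are far simpler than the parameterized templates the researchers describe.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def deformation(template, model_points):
    """Total squared distance the template points move to reach the model."""
    return sum(min(sq_dist(point, m) for m in model_points)
               for _, point in template)


def classify(templates, model_points):
    """Return the name of the template that needs the least deformation."""
    return min(templates,
               key=lambda name: deformation(templates[name], model_points))


def parse_parts(template, model_points):
    """Label each model point with the part of its nearest template point."""
    return [min(template, key=lambda entry: sq_dist(entry[1], m))[0]
            for m in model_points]


# Hypothetical template library: (part label, rest position) pairs.
templates = {
    "biped": [("head", (0, 4)), ("torso", (0, 2)), ("arm", (-2, 2)),
              ("arm", (2, 2)), ("leg", (-1, 0)), ("leg", (1, 0))],
    "quadruped": [("head", (3, 2)), ("torso", (0, 1)), ("leg", (-2, 0)),
                  ("leg", (-1, 0)), ("leg", (1, 0)), ("leg", (2, 0))],
}

# Hypothetical scan, reduced to a few representative points.
scan = [(0, 4.2), (0, 2.1), (-1.9, 2), (2.1, 2), (-1, 0.1), (1, -0.1)]

best = classify(templates, scan)
parts = parse_parts(templates[best], scan)
```

With the parts labeled, each region of the scan can be attached to the matching piece of the template's skeleton, which is what makes conventional animation techniques applicable.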
A clay model gets virtual. A digital camera and computer-vision techniques are used to scan, recognize, and graphically interpret a clay creature, which can then be animated with traditional tools. Shown here, from left: the clay model, its volumetric scan (a combination of serial digital images), its "best match" from a template library, component parts of the interpreted model, and a frame from the resulting animation.
The building-block and clay-based modeling systems are still in the prototype stage. At the top of the ongoing research agenda is the development of alternative architectures for embedded computation. In its current guise, the building-block system requires a huge supply of power to recognize large block structures. The researchers are studying ways to capture a composed structure using the fewest possible active components in each block. They are also considering architectures that make use of a broadcast medium. And, to enable more user interaction, they are exploring modifications that would allow for incremental adjustment of the physical models. Finally, they are thinking about "mixed-initiative" systems that take better advantage of the automated algorithms by allowing them to learn from users' histories of choices and make suggestions based on those.
To date, the most important outcome of this work may be the message that a modeling shift is on the horizon. "You shouldn't have to master a hugely complex modeling system to create virtual worlds and the characters that inhabit them," says Marks. "If users are given the ability to easily create and animate complex 3D models, I'm sure developers will have no trouble producing fun and educational applications that have 3D modeling at their core."
Diana Phillips Mahoney is chief technology editor of Computer Graphics World.