Volume: 25 Issue: 1 (January 2002)

Single View Modeling



DIANA PHILLIPS MAHONEY

The human visual system is incredible not only because it allows us to see, but also because it allows us to perceive more in what we see than is actually there. When we look at a photograph or a painting, for example, not only do we see the 2D representation, but we also perceive the "hidden" 3D shape information by incorporating a variety of cues, including shading, texture, and focus.

Realizing that a digital incarnation of this impressive capability could have a significant impact on the creation of realistic computer-generated imagery, many graphics researchers have focused their efforts on developing algorithms to somehow mimic this aspect of the human perceptual system. The result has been the development of various computer vision-based techniques for single-view reconstruction of 3D models from a photographic scene. A particular hurdle, however, has been the re-creation of free-form surfaces with general reflectance properties.




"Some single-view methods are applicable only if the shape is planar or consists of primitive shapes, such as spheres and cylinders," says computer graphics researcher Li Zhang at the University of Washington in Seattle. "Other methods work by requiring users to manually specify the depth for every pixel, and yet others work only if the statistics of a target shape are known a priori."

To address this deficiency, Zhang and University of Washington colleague Steven Seitz, along with Guillaume Dugas-Phocion and Jean-Sebastien Samson of the Ecole Polytechnique in France, have developed a novel system for reconstructing free-form 3D scene models with arbitrary reflectance properties from a single painting or photograph with no prior knowledge about the shape. The interactive technique updates the model in real time as constraints are added, allowing fast reconstruction of photorealistic scene models.
To re-create free-form landscape data in 3D from a photo, University of Washington researchers employ a modeling system that calculates depth and reflectance properties and computes the surfaces at interactive rates.




Most methods for free-form surface modeling focus on generating smooth artificial surfaces and thus yield images with an artificial appearance. The new technique takes as input a sparse set of user-specified constraints, including surface positions, normals, silhouettes, and creases, and generates a "well behaved" 3D surface satisfying the parameters. As each constraint is specified, the system recalculates and displays the reconstruction in real time.
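
As a rough illustration of the idea of turning sparse constraints into a smooth surface, the Python sketch below fits a one-dimensional depth profile to a handful of fixed depth values. It is not the researchers' solver: the function name `fit_depth_profile`, the 1D setting, and the simple neighbor-averaging relaxation are all assumptions made for this example, and only position constraints are handled.

```python
def fit_depth_profile(n, position_constraints, iterations=2000):
    """Relax n depth samples toward smoothness, holding constrained samples fixed.

    position_constraints maps a sample index to a required depth (hypothetical format).
    """
    depth = [0.0] * n
    for index, value in position_constraints.items():
        depth[index] = value

    for _ in range(iterations):
        for i in range(n):
            if i in position_constraints:
                continue  # user-specified depths are never moved
            # Replace each free sample with the mean of its existing neighbors.
            neighbors = [depth[j] for j in (i - 1, i + 1) if 0 <= j < n]
            depth[i] = sum(neighbors) / len(neighbors)
    return depth


# Two constraints, at the ends of a five-sample profile.
profile = fit_depth_profile(5, {0: 0.0, 4: 4.0})
print([round(z, 3) for z in profile])
```

With only the two end points pinned, the free samples settle onto the straight line between them. Adding a third constraint in the middle bends the profile, which is the kind of immediate feedback the article describes, here on a toy scale.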

The new system builds on previous work in hierarchical surface modeling. Basically, a scene is modeled as a piecewise continuous surface represented on an adaptive grid, and it is computed using a wavelet-based hierarchical transformation technique. Wavelets let a function be represented at a lower resolution while retaining values, called detail coefficients, that record what the coarser version leaves out of the original dataset. These enable the original function to be regenerated without any loss of information, and the process can be repeated any number of times.
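
The role of detail coefficients can be seen in a small Python sketch. The article does not say which wavelet basis the system uses, so the Haar wavelet here is an assumption chosen for simplicity, and the function names are hypothetical. Each step halves the resolution by averaging pairs of samples and keeps one detail coefficient per pair, from which the finer level can be rebuilt exactly.

```python
def haar_step(values):
    """One level of decomposition: pairwise averages plus detail coefficients."""
    pairs = range(0, len(values), 2)
    averages = [(values[i] + values[i + 1]) / 2 for i in pairs]
    details = [(values[i] - values[i + 1]) / 2 for i in pairs]
    return averages, details


def haar_decompose(values):
    """Repeat haar_step until one coarse value remains (length must be a power of two)."""
    current = list(values)
    details_per_level = []
    while len(current) > 1:
        current, details = haar_step(current)
        details_per_level.append(details)
    return current, details_per_level


def haar_reconstruct(coarse, details_per_level):
    """Undo the decomposition, coarsest level first."""
    current = list(coarse)
    for details in reversed(details_per_level):
        finer = []
        for average, detail in zip(current, details):
            finer.extend([average + detail, average - detail])
        current = finer
    return current


heights = [9.0, 7.0, 3.0, 5.0, 6.0, 10.0, 2.0, 6.0]
coarse, details_per_level = haar_decompose(heights)
restored = haar_reconstruct(coarse, details_per_level)
print(coarse, restored == heights)
```

The eight height samples reduce to a single coarse value plus seven detail coefficients, so nothing is discarded, and running the steps in reverse recovers the original samples.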

In this application, the wavelets are used to optimize the surface representation by explicitly modeling discontinuities within the hierarchical organization on the quad-tree-based adaptive grid and to enable the computation of surfaces at interactive rates.

The new technique has many advantages over existing free-form surface reconstruction methods. Basically, says Zhang, "it's simple, fast, flexible, and fun to use. It doesn't require a library of primitive shapes or similar shapes, and it is applicable to paintings as well as photos."
The landscape scene from the previous page is modeled as a continuous surface represented on an adaptive grid (above). The single-view modeling system automatically calculates the depth of the image based on user-specified constraints and renders it as a




The user may specify any combination of point, curve, and region parameters as image-based constraints on the re-created surface, and with the hierarchical transformation approach, the resulting 3D reconstruction can be achieved at interactive rates, "so users can modify and appreciate their models immediately."

In addition, the underlying adaptive grid adjusts automatically to the complexity of the scene. For example, says Zhang, "the quad-tree representation can be made more detailed around contours and regions of high curvature."
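
A minimal Python sketch of this kind of adaptive refinement follows. It is an illustration only: the names `subdivide` and `crosses_circle`, the unit-square domain, and the circular contour standing in for a silhouette are assumptions for the example, not details of the researchers' system. A cell is split into four children only when the contour passes through it, so small cells cluster along the contour while empty regions stay coarse.

```python
import math


def crosses_circle(x, y, size, cx=0.5, cy=0.5, radius=0.3):
    """Hypothetical detail test: does the square cell straddle the circular contour?"""
    # Distance from the circle's center to the closest point of the cell.
    nearest_x = min(max(cx, x), x + size)
    nearest_y = min(max(cy, y), y + size)
    nearest = math.hypot(nearest_x - cx, nearest_y - cy)
    # Distance from the circle's center to the farthest corner of the cell.
    farthest = max(
        math.hypot(px - cx, py - cy)
        for px in (x, x + size)
        for py in (y, y + size)
    )
    return nearest <= radius <= farthest


def subdivide(x, y, size, needs_detail, min_size):
    """Return the leaf cells (x, y, size) of a quad-tree over a square region."""
    if size <= min_size or not needs_detail(x, y, size):
        return [(x, y, size)]
    half = size / 2
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            leaves.extend(subdivide(x + dx, y + dy, half, needs_detail, min_size))
    return leaves


leaves = subdivide(0.0, 0.0, 1.0, crosses_circle, 1 / 16)
sizes = sorted({size for _, _, size in leaves})
print(len(leaves), sizes)
```

The leaves tile the whole square without gaps, but their sizes differ: the finest cells appear only where the contour runs, which keeps the number of unknowns small in flat or featureless areas.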

There are some disadvantages to the new system. A single image provides only limited information, so the reconstructed surfaces can have distracting holes near parts of the surface that were not visible in the 2D image. Because of this, the researchers are considering the possibility of incorporating automatic hole-filling techniques. In addition, says Zhang, "the accuracy of the final surfaces depends on human observation, and the results for a given image may vary from user to user." For instance, he says, "different people perceive shape differently. One person will interpret an object in a picture as being deeper than it really is, while another will perceive surfaces as having less relief."

Typically, the single-view reconstruction technique is best suited for fast surface modeling from a single photograph or painting, where multiple images from different views are not available, and when good models of the target objects do not exist. The method is not suited for highly accurate surface reconstruction.

In addition to the possibility of adding hole-filling capabilities, the researchers are considering another extension, which would generalize the technique to perspective projection as well as other useful projection models, such as panoramas. Another future goal, says Zhang, is to recover the surface reflectance of the reconstructed shape.

With the prototype system complete, Zhang foresees that the new technique could serve as an important enhancement, in the form of a plug-in to programs such as Photoshop, to provide an easy way to create 3D models from photographs. In addition, he says, "the approach could also be used for special effects, virtual sets, and compositing live action with CG imagery."

More information on the single view modeling system can be found at http://grail.cs.washington.edu/projects/svm.