Issue: Volume 24, Issue 5 (May 2001)

Big Pictures, Little Packages

By Diana Phillips Mahoney

The challenge of compressing large or complex 3D digital models into a package small enough to be delivered quickly, efficiently, and unscathed across networks is easy to conceptualize.

In the everyday world, people are constantly trying to squeeze big things into small spaces. Consider the hapless airline passenger struggling to get a much-too-big carry-on bag into the much-too-small overhead compartment. The luggage will be squeezed, squished, turned upside down and right side left, its contents will be reconfigured, and some of its items will be removed. Ultimately, the passenger might hear the triumphant click of the hatch, but such success comes with a price. Not only might the duffel be horribly misshapen, the contents will likely be wrinkled, cracked, or otherwise traumatized, and the effort itself will have caused a bottleneck of passengers.

The same can be said for 3D models. There are many ways to manipulate the digital data for successful transmission across networks constrained by today's bandwidth limitations.

For example, information can be removed from the original file in order to reduce its size, the framework of the geometry itself can be altered to approximate the original shape using fewer bits of data, or the entire model can be delivered in progressive stages to prevent network bottlenecks.

However, each option has undesirable consequences: either information is lost and must be re-created on the receiving end, the shape is corrupted by the geometry alterations, or the fully detailed model simply takes too long to arrive.

The compression problem is exacerbated by the fact that advances in desktop graphics capabilities have far outpaced gains in network capacity. So while the models being created are getting ever bigger and more complex, the delivery channels they're being shoved through are not expanding proportionally. Similarly, until recently, available compression technologies have concentrated on small polygonal models, so they've been ill-equipped to handle the higher-quality models being generated even on low-end PCs. Instead, they result in large files and long download times before the first image can be rendered.
Before hopping across the Web, this bunny is encoded with a mesh-based compression algorithm developed by researchers at Georgia Tech. The structure of the mesh is used to reduce the number of bits needed to represent the vertices and polygons.

Thus, the challenge facing researchers and 3D graphics vendors is to devise techniques that will enable the storage and transmission of these huge, complex geometric datasets using the fewest possible bits without loss of quality. Generally, this comes down to understanding the best way to prepare a model for compression and optimizing the tradeoffs among such factors as an object's geometry, attributes such as color and texture, transmission speed, and processing and storage requirements.

The potential payoff for successfully meeting this challenge is immeasurable, with beneficiaries spanning the 3D graphics spectrum. "It could supercharge 3D applications found today at the high-end of manufacturing and film making and could unlock the potential of high-end 3D on consumer systems," according to compression researcher Wim Sweldens of Bell Labs' Mathematical Sciences Research Center.

In the entertainment sector, for example, "imagine a multiplayer, Internet-based video game that looks as good as Toy Story, or networked avatar/fantasy-based environments with the same display quality as movies," says Peter Schroder, a computer science professor at Caltech who has collaborated with Sweldens on the development of new 3D compression technologies. In manufacturing, he adds, "companies could use geometric representations when they put out requests for parts, use geometry to guide fabrication equipment, and compare scans of newly made parts to the original designs."
Harley-Davidson used Virtue3D software to compress a model of a basket unloader assembly for delivery over the Internet. The top image is a 2.2MB VRML representation of the model. The bottom image shows the Virtue3D compressed version, at 40K.

Consumer applications could also run with the technology. The real estate market is an example. "Today, someone selling a house puts pictures of all the rooms on the Web and perhaps a video walkthrough," says Sweldens. "When geometry processing reaches the desktop, in software akin to today's digital photo and video editors, you will not only be able to see any view of any room in the house, but you'll also be able to see how it will look after you knock out a wall, repaint the rooms, and drop in new furniture from a 3D catalog."

Perhaps most exciting are the applications not yet fathomed. "Almost everyone on the Internet has used data compression to get interesting data faster: ZIP for text and software, MP3 for music, JPEG and GIF for images, and MPEG and QuickTime for movies," says Davis King, a 3D compression researcher at Georgia Institute of Technology and creator of 3Dcompression.com, a clearinghouse for information on compressing 3D graphics and other complex datasets.

"In some cases, the availability of these compression formats has created new forms of entertainment and new business models, such as mp3.com, Napster, Internet-based radio stations, and Internet short films," King notes. The same could happen, he believes, with 3D geometry compression.

Driven by this goal, researchers and vendors are striving both to improve available 3D compression technologies and devise new ones. For the most part, existing 3D compression algorithms employ techniques adapted from their 2D predecessors, such as wavelets, which process data at different scales or resolutions, or they use techniques specifically designed to take advantage of the properties of 3D surfaces.

Several 3D compression tools are already commercially available, including Sun's Java 3D compression standard, IBM's HotMedia, based on its MPEG-4/Topological Surgery standard, compression software from Israel-based Virtue3D, and techniques included in Intel's 3D software and Microsoft's DirectX. However, says King, "3D compression remains an active area of research because many 3D models are still too large to be used efficiently with currently available methods, and because no one knows how much further 3D compression may be improved."
A 3D virtual replica of a French antique is compressed for digital travel using a wavelet-based progressive compression scheme developed by researchers at Bell Labs and Caltech. Laser scanning of the real 10cm object produced the original digital geometry file.

Ongoing 3D compression research focuses both on hardware and software techniques. The goal of the hardware techniques is to speed up the transmission of 3D data from the CPU to the graphics processor. On the software side, the objective is to enable compact storage and network transmission.

Because the software approaches are being designed to accommodate conventional graphics hardware, they are generally expected to have a greater and more immediate impact on commercial applications.

Current software-based 3D compression methods generally fall into one of three categories: mesh-based techniques that use the structure of the object mesh to reduce the number of bits needed to represent the composite vertices and polygons while trying to maintain the original geometry; progressive compression techniques, which use simplification algorithms to generate a hierarchy of levels of detail that are transmitted from coarse to fine; and image-based techniques that encode a set of pictures rather than an object.

The benefits of each of these approaches are counterbalanced by specific disadvantages. For example, mesh-based techniques, such as those available in Java 3D and the Virtue3D package, can reduce the file size of almost any polygonal model, and once it arrives at its destination, the model is user-ready. This approach is the fastest and the most efficient in terms of bit rate if the user knows beforehand the level of detail needed at the end.

"Mesh-based methods are effective for applications that need consistency more than flexibility," says King. "For example, engineers or radiologists sharing 3D models may have to base their opinions on the original lossless data, and car dealers or artists displaying their wares on the Internet may want to choose a single level of detail instead of letting some users see higher or lower quality representations."

Unfortunately, mesh-based techniques, because they produce a single-resolution model, are not scalable, so they cannot adapt to the available bandwidth. Thus, the approach becomes increasingly less efficient as the size of the geometry increases.
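
To make the mesh-based idea concrete, the toy Python sketch below quantizes vertex coordinates to a fixed number of bits and delta-codes them in traversal order before handing the result to a generic entropy coder. It is only an illustration of why mesh coherence compresses well; it is not the scheme used in Java 3D, Virtue3D, or any other product mentioned here, and it ignores connectivity entirely.

```python
# A minimal sketch (not any vendor's actual codec) of the mesh-based idea:
# quantize vertex coordinates to a fixed number of bits, then store each
# vertex as a small delta from the previous vertex in traversal order.
# Deltas between neighboring vertices are small, so they entropy-code well.
import struct
import zlib

def compress_vertices(vertices, bits=12, lo=-1.0, hi=1.0):
    """vertices: list of (x, y, z) floats inside [lo, hi]."""
    scale = (2 ** bits - 1) / (hi - lo)
    quantized = [tuple(int(round((c - lo) * scale)) for c in v) for v in vertices]

    deltas = []
    prev = (0, 0, 0)
    for q in quantized:
        deltas.append(tuple(q[i] - prev[i] for i in range(3)))
        prev = q

    # Pack the small signed deltas and let a generic entropy coder finish the job.
    packed = b"".join(struct.pack("<3i", *d) for d in deltas)
    return zlib.compress(packed, 9)

# Example: a vertex strip whose coordinates change slowly compresses far
# better than the same data stored as raw 32-bit floats.
verts = [(i * 0.001, (i * 0.001) ** 2, 0.5) for i in range(1000)]
raw = b"".join(struct.pack("<3f", *v) for v in verts)
print(len(raw), "raw bytes ->", len(compress_vertices(verts)), "compressed bytes")
```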

Progressive methods, on the other hand, support multiple levels of detail, which make them appealing for large models. A small approximation of the model is available immediately, followed by increasingly detailed information. This approach is also useful for applications in which multiple levels of object detail are required, such as virtual walkthroughs, in which display resolution increases and decreases depending on the navigator's proximity to a given object.

On the down side, progressive techniques tend to be slower to prepare, because a progressive hierarchy must be computed for the entire model ahead of time. Additionally, although the recipient gets a visual approximation of the model immediately, the "real" thing isn't readily available.
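
The sketch below shows the general shape of a progressive format under assumed, simplified data structures: a coarse base mesh followed by an ordered stream of refinement records that the viewer applies as they arrive. Real systems are far more involved; in particular, the bookkeeping that re-points existing faces during a vertex split is omitted here.

```python
# A simplified sketch of progressive delivery (an assumed data layout, not
# any shipping product's format): the encoder sends a coarse base mesh
# followed by an ordered stream of refinement records; the viewer applies
# whatever has arrived and can draw a usable approximation at any point.
from dataclasses import dataclass, field

@dataclass
class Refinement:
    parent: int        # index of the coarse vertex being split (bookkeeping omitted)
    position: tuple    # position of the new vertex it introduces
    new_faces: list    # triangles (vertex index triples) to add

@dataclass
class ProgressiveMesh:
    vertices: list
    faces: list = field(default_factory=list)

    def apply(self, r: Refinement):
        self.vertices.append(r.position)
        self.faces.extend(r.new_faces)

def stream_to_viewer(base, refinements, byte_budget, record_size=32):
    """Apply as many refinement records as the current bandwidth allows."""
    mesh = ProgressiveMesh(list(base.vertices), list(base.faces))
    for r in refinements[: byte_budget // record_size]:
        mesh.apply(r)
    return mesh   # coarse but displayable immediately; detail keeps arriving

# Tiny usage example: one refinement adds a vertex and two faces.
base = ProgressiveMesh([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
stream = [Refinement(0, (0.5, 0.5, 0.0), [(0, 1, 3), (1, 2, 3)])]
print(len(stream_to_viewer(base, stream, byte_budget=64).faces), "faces after refinement")
```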

In terms of commercial availability, Intel has developed a progressive compression software library that the company is licensing to software vendors. Macromedia, one of the first licensees of the Intel technology, is incorporating the 3D support into its Shockwave player.

A number of progressive compression algorithms are being developed and tested in the research community as well. A particularly promising effort is a wavelet-based technology developed by Sweldens and Schroder that has achieved significant efficiency gains over previously published 3D compression methods. The technique is currently being enhanced and its commercial viability assessed, according to Bell Labs.
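
The wavelet idea is easiest to see in one dimension. The toy Haar transform below concentrates most of a smooth signal's energy in a few coarse coefficients, so a decoder that has received only a prefix of the coefficient stream can still reconstruct a good approximation. The Bell Labs/Caltech coder operates on surfaces, not 1D signals, and is not reproduced here.

```python
# A toy Haar-wavelet illustration of the principle behind wavelet coders:
# the transform packs most of the signal's energy into a few coarse
# coefficients, so those can be sent first and fine detail streamed later.
import math

def haar_forward(signal):
    out = list(signal)
    n = len(out)                       # length must be a power of two
    while n > 1:
        half = n // 2
        avgs = [(out[2*i] + out[2*i+1]) / math.sqrt(2) for i in range(half)]
        diffs = [(out[2*i] - out[2*i+1]) / math.sqrt(2) for i in range(half)]
        out[:n] = avgs + diffs
        n = half
    return out

def haar_inverse(coeffs):
    out = list(coeffs)
    n = 1
    while n < len(out):
        avgs, diffs = out[:n], out[n:2*n]
        block = []
        for a, d in zip(avgs, diffs):
            block += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        out[:2*n] = block
        n *= 2
    return out

# Keep only the first (coarsest) coefficients, as a progressive decoder would
# after receiving a prefix of the stream, and reconstruct an approximation.
signal = [math.sin(i / 8.0) for i in range(64)]
coeffs = haar_forward(signal)
prefix = coeffs[:16] + [0.0] * (len(coeffs) - 16)
approx = haar_inverse(prefix)
print(max(abs(a - s) for a, s in zip(approx, signal)))  # small reconstruction error
```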

Finally, image-based techniques, such as those in QuickTime VR and IPIX, have the advantage of being able to leverage existing image-compression infrastructure because they utilize 2D photographic data. To date, work on this type of compression has been limited to specialized applications, such as real-estate walkthrough programs and software for viewing MRI datasets. While the approach is not useful for applications requiring actual geometry, such as in manufacturing, says King, "it does have promise for applications in which high-quality images are more important than using accurate geometry."
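
A bare-bones sketch of the image-based approach, using the Pillow imaging library as a stand-in for a production codec: rather than shipping geometry, the encoder stores a set of pre-rendered views as ordinary JPEGs, and the viewer displays the picture nearest the requested camera angle. QuickTime VR and IPIX are, of course, far more sophisticated than this.

```python
# Sketch of image-based delivery: reuse 2D image compression for a set of
# pre-rendered views instead of transmitting any geometry at all.
from io import BytesIO
from PIL import Image  # assumes the Pillow library is installed

def encode_views(views, quality=75):
    """views: dict mapping camera angle (degrees) -> PIL.Image of that view."""
    encoded = {}
    for angle, image in views.items():
        buf = BytesIO()
        image.save(buf, format="JPEG", quality=quality)  # ordinary 2D codec
        encoded[angle] = buf.getvalue()
    return encoded

def pick_view(encoded, requested_angle):
    # Show the stored view whose angle is closest to the requested one.
    nearest = min(encoded, key=lambda a: abs(a - requested_angle))
    return Image.open(BytesIO(encoded[nearest]))

# Placeholder solid-color images stand in for actual rendered views.
views = {a: Image.new("RGB", (320, 240), (a % 255, 80, 120)) for a in range(0, 360, 30)}
store = encode_views(views)
print(sum(len(b) for b in store.values()), "bytes for", len(store), "views")
```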

Before 3D compression in any of its guises can significantly impact either high-end or mainstream applications, a number of challenges have to be met. The first among these is a theoretical hurdle. "Improvement in bit rates is ongoing, so no one really knows what bit rates are possible, what we should shoot for," says King. "When 1D [sound] and 2D compression were in the works, people already had access to good information theory about 1D and 2D signals. They knew right away what the maximum achievable compression gains were, and they knew what the penalties were. We don't have that prior knowledge for 3D."

On the practical side, some issues of scalability remain unresolved. "We need geometry compression formats that adapt gracefully to the available bandwidth," says Sweldens. "It has to be possible to have a single file and transmission format to accommodate fat pipes [Internet backbone, corporate T1 connections, very high-speed home access] and thin pipes [standard modems, wireless networks, cell phones], and each user should be able to get the best quality approximation of the original geometry given the available bandwidth on the way to the given user."
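
One way to read that requirement, sketched below under the assumption of an embedded, coarse-to-fine bitstream: the server never re-encodes anything, it simply sends the longest prefix of the single file that a given connection can carry within the time budget, and every prefix decodes to the best approximation that many bytes allow.

```python
# A sketch (not a published format): serve a prefix of one embedded,
# coarse-to-fine bitstream sized to each client's bandwidth.
def bytes_to_send(bitstream: bytes, bandwidth_bps: float, budget_seconds: float) -> bytes:
    budget_bytes = int(bandwidth_bps / 8 * budget_seconds)
    return bitstream[: min(budget_bytes, len(bitstream))]

embedded_stream = bytes(1_000_000)          # stand-in for a coarse-to-fine encoding
for name, bps in [("modem", 56_000), ("T1", 1_544_000), ("backbone", 45_000_000)]:
    prefix = bytes_to_send(embedded_stream, bps, budget_seconds=2.0)
    print(f"{name}: send {len(prefix)} of {len(embedded_stream)} bytes")
```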

In addition, the compression scheme should be scalable with respect to end-user devices. In this regard, a given single format should be equally suited to devices with little compute and graphics power and those with powerful CPUs and graphics hardware.

"A content provider should not have to worry at the encoding stage what the ultimate destination de vice capabilities are," says Sweldens. For example, in networked game play, "content should be displayed optimally on systems ranging from PS2 style machines to low-end PCs or even portable networked gaming devices."

While it's conceivable that both of these scalability issues can be addressed using progressive compression, a more difficult challenge is scalability in terms of model complexity. "There are a number of algorithms that can deal more or less well with individual objects or a few objects, but there are few that can deal well with complex scenes containing many individual objects," says Schroder.

Another area ripe for attention is the compression of textured models. "At the moment, the models and texture coordinates are compressed using 3D compression techniques, and the images are compressed independently using [2D] image-compression techniques. It is not unusual to see models where the compressed geometry is less than one tenth the size of the compressed image," says Gabriel Taubin of IBM Research, co-developer of the MPEG-4 standard and associated 3D compression techniques, which include both single-resolution and progressive algorithms. Ideally, he suggests, there should be a greater balance between the mesh compression and the compression of texture coordinates, colors, and other attribute data.

Among the most technically intimidating compression challenges is the thought of handling dynamic scenes. "Most compression techniques proposed so far deal with static objects," says Taubin. "One challenge is to develop efficient compression techniques for animations, and in particular those of complex scenes where objects appear and disappear and may deform over time, changing not only their geometry, but also their topology." In fact, he contends, one of the applications that stands to gain the most from 3D compression is its use as a video compression technique that would allow the continuous streaming of high-volume 3D data for Internet and television applications.

Finally, in order to satisfy the compression needs of the scientific visualization community, says Taubin, "techniques must be developed that can handle the compression, transmission, and visualization of very large models, such as those generated in supercomputers as the result of large-scale physical or biological simulations." In such cases, the models exceed the typical memory capacity of client terminals, so a two-way communication protocol is needed to keep the data in the terminal and the server synchronized.
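
A hedged sketch of the kind of two-way protocol this implies (the class names and block layout here are invented for illustration): the full model stays on the server, split into spatial blocks, while the client keeps only a bounded, least-recently-used working set and requests blocks as the view moves.

```python
# Sketch of an on-demand block protocol for models too large for client memory.
from collections import OrderedDict

class BlockServer:
    def __init__(self, blocks):
        self.blocks = blocks                 # block_id -> compressed bytes

    def fetch(self, block_id):
        return self.blocks[block_id]

class ThinClient:
    def __init__(self, server, capacity=4):
        self.server = server
        self.capacity = capacity             # how many blocks fit in memory
        self.cache = OrderedDict()           # block_id -> bytes, in LRU order

    def need(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)          # mark as recently used
        else:
            self.cache[block_id] = self.server.fetch(block_id)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)        # evict least recently used
        return self.cache[block_id]

server = BlockServer({i: bytes(1000) for i in range(100)})
client = ThinClient(server)
for visible_block in [3, 4, 5, 4, 6, 7]:              # blocks near the camera
    client.need(visible_block)
print(list(client.cache))                             # the working set in memory
```
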
Mesh-based compression methods start with a single triangle and spiral around it to traverse the entire mesh. This view of the bunny shows the beginning of a typical encoding process using Edgebreaker, a compression algorithm developed at Georgia Tech.

As solutions to these problems begin to emerge, the sky's the limit for application potential. "Currently, the main use [of 3D compression] is to allow existing applications such as e-commerce, collaborative CAD/CAM, video games, and medical visualization to use larger and more complex models over the Internet than they can use without compression," says King. "The goal of the field, however, is much larger: to allow new forms of art, business, and entertainment to emerge on the 3D Internet." Among the possibilities are 3D photographs, telepresence in virtual worlds, 3D product showrooms, and real-time scientific simulations.

And once the compression needs of such ambitious applications are met, perhaps researchers could try to figure out how to apply their digital magic to the real world so we could all get bigger suitcases into those cramped overhead compartments.

Diana Phillips Mahoney is chief technology editor of Computer Graphics World.