Volume 24, Issue 1 (January 2001)

Pictures, in Parallel



DIANA PHILLIPS MAHONEY

The major advances in computational processing over the past several years have enabled scientists to run simulations that produce extremely large, time-varying datasets in a matter of hours, versus the days and weeks required in years past. Today, researchers are resolving models of real-world phenomena ranging from the Earth's climate and oceanic activity to accelerator-physics dynamics by running simulations in parallel on clusters of high-bandwidth supercomputers and PCs. Because of these capabilities, as well as significant algorithmic advances, the simulation models being crunched can achieve an unprecedented level of detail.

The solutions are so large and complex, however, that similarly powerful computational resources are needed to visualize and analyze the resulting datasets. Unfortunately, general-purpose parallel visualization programs are neither easy to come by nor easy to implement. Typically, such programs require highly specialized programming knowledge beyond the ken of the scientists running the simulations. In addition, when they are written, parallel visualization systems tend to be "one-offs," meaning they are developed to visualize a specific simulation but cannot be generalized to other applications.

In an effort to make parallel visualization accessible to a broader range of users, researchers at Los Alamos National Laboratory, Kitware Inc., and Argonne National Laboratory have built a scalable, full-function system on top of an existing, open-source toolkit. The system allows users to quickly write custom parallel visualization programs and then port them across platforms without rewriting the code. Called the Parallel Visualization Toolkit, the new system uses existing software, including the open-source Visualization Toolkit (VTK) developed by researchers at Kitware and the GE Corporate R&D Center, as well as the OpenGL and Mesa APIs.
The visualization of complex dynamic datasets, such as this sequence of ocean salinity measurements (colored by temperature), was achieved with the Parallel Visualization Toolkit. Low-level parallel programming algorithms were abstracted into higher-level modules.




The researchers chose VTK as their starting point because the toolkit contains a broad array of visualization, graphics, and imaging algorithms and is portable to a variety of hardware platforms and operating systems. Their goal was to support most of the VTK functionality by offering parallel versions of the visualization algorithms. "Supporting a full range of parallel-visualization algorithms is critical to effectively processing large datasets, since the alternative, interspersing serial algorithms with parallel algorithms, can significantly degrade performance," says LANL researcher James Ahrens.

VTK's multi-platform support was also a deciding factor. "Portability is critical to users with access to heterogeneous platforms, because platform availability can change due to crashes, maintenance, purchases, and removal. If the user's visualization system is portable, then he or she can choose the best available platform, instead of being constrained only to the availability of a specific platform," says Ahrens. Thus, a fundamental objective was to create a system that would be portable between platforms with different operating systems and underlying hardware, and between shared- and distributed-memory multiprocessors. This was achieved with a programming mechanism that abstracts away whether the processes involved communicate through shared or distributed memory. Once this environment was established, the researchers abstracted the complex parallel-computing details to simplify the creation of parallel visualization programs.
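In rough outline, that mechanism can be sketched in a few lines of C++. The example below is illustrative rather than the toolkit's actual source: it assumes VTK's vtkMultiProcessController interface and its MPI-based subclass vtkMPIController, with the understanding that a shared-memory controller subclass could be substituted without changing the program body.

    #include <iostream>
    #include "vtkMPIController.h"
    #include "vtkMultiProcessController.h"

    // The same function body runs on every process; it sees only the
    // abstract controller, never the underlying communication layer.
    void Run(vtkMultiProcessController* controller, void* /*userData*/)
    {
      std::cout << "process " << controller->GetLocalProcessId()
                << " of "     << controller->GetNumberOfProcesses() << "\n";
      // ... build and execute this process's share of the pipeline ...
    }

    int main(int argc, char* argv[])
    {
      // A shared-memory controller could be instantiated here instead;
      // Run() itself would not change.
      vtkMPIController* controller = vtkMPIController::New();
      controller->Initialize(&argc, &argv);
      controller->SetSingleMethod(Run, 0);
      controller->Execute();    // invokes Run() on every process
      controller->Finalize();
      controller->Delete();
      return 0;
    }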
Isosurfaces of a fluid-dynamics visualization are clipped to reveal internal structures. Because of the complexity of the simulation data, serial visualization techniques preclude easy manipulation of the visual information. The framework of the new parallel toolkit distributes that computational load, making such manipulation practical.




The resulting framework provides a reusable infrastructure for parallel and distributed visualization to solve terascale visualization problems. It does so by encapsulating low-level parallel programming algorithms into higher-level modules that can be configured to fully exploit available computational resources. The system is designed to enlist only as many of those resources as are needed to process large datasets efficiently. In addition, it incorporates modular support for various types of parallelism, depending on the nature of the task at hand. For example, to get optimal performance in processing huge datasets, it supports data parallelism, in which the data is partitioned into independent subsets that are processed in parallel. To achieve performance gains on a long time-varying series, a pipeline approach can be employed, whereby a sequence of algorithms executes in parallel on different data elements. And, to most efficiently solve problems with many independent branches, such as those involving numerous parametric analyses, task parallelism can be used to partition the overall problem into separate tasks and automatically allocate the tasks to specific processors.
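To make the data-parallel case concrete, the sketch below assumes VTK's piece-based streaming interface, in which a mapper can request a single partition (a "piece") of its input. The reader class, file name, and isosurface value are hypothetical stand-ins; the function is meant to run on every process under a controller like the one sketched earlier.

    #include "vtkUnstructuredGridReader.h"
    #include "vtkContourFilter.h"
    #include "vtkPolyDataMapper.h"
    #include "vtkMultiProcessController.h"

    // Data parallelism: every process runs the same pipeline but pulls
    // only its own piece of the output through it.
    void ContourMyPiece(vtkMultiProcessController* controller, void* /*userData*/)
    {
      vtkUnstructuredGridReader* reader = vtkUnstructuredGridReader::New();
      reader->SetFileName("salinity.vtk");     // hypothetical input file

      vtkContourFilter* contour = vtkContourFilter::New();
      contour->SetInput(reader->GetOutput());
      contour->SetValue(0, 34.5);              // illustrative isovalue

      vtkPolyDataMapper* mapper = vtkPolyDataMapper::New();
      mapper->SetInput(contour->GetOutput());
      mapper->SetNumberOfPieces(controller->GetNumberOfProcesses());
      mapper->SetPiece(controller->GetLocalProcessId());
      mapper->Update();    // pulls only this process's partition upstream

      mapper->Delete(); contour->Delete(); reader->Delete();
    }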

The critical feature distinguishing the LANL system from other systems that attempt to support parallel visualization is that the latter almost exclusively rely on what's called a centralized executive, the form of administration that controls when and from where commands originate. With a centralized executive, a single processing unit dictates the execution commands to all of the others in the configuration. "Designing an efficient mechanism for controlling a large number of processes from a single, centralized executive is difficult," says Ahrens. In contrast, the LANL system's framework lets the individual processors operate autonomously or together, depending on the needs of the application or the computational resources.
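The difference can be sketched with the point-to-point calls that vtkMultiProcessController provides. In the fragment below, which assumes that interface (the message tag, values, and ring pattern are illustrative), each process computes on its own and trades results directly with its peers; no process plays the role of executive.

    #include "vtkMultiProcessController.h"

    // No central executive: each process computes independently and
    // exchanges a result directly with its ring neighbors when needed.
    // (Assumes more than one process.)
    void ExchangeWithNeighbors(vtkMultiProcessController* controller)
    {
      int myId     = controller->GetLocalProcessId();
      int numProcs = controller->GetNumberOfProcesses();
      int next = (myId + 1) % numProcs;
      int prev = (myId + numProcs - 1) % numProcs;

      float localResult    = 0.0f;   // stand-in for a locally computed value
      float neighborResult = 0.0f;
      const int TAG = 99;            // illustrative message tag

      // Even-ranked processes send first and odd-ranked ones receive
      // first, so the blocking calls cannot deadlock.
      if (myId % 2 == 0)
      {
        controller->Send(&localResult, 1, next, TAG);
        controller->Receive(&neighborResult, 1, prev, TAG);
      }
      else
      {
        controller->Receive(&neighborResult, 1, prev, TAG);
        controller->Send(&localResult, 1, next, TAG);
      }
    }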

The system also maintains VTK's support for a number of automatic services, including the ability to share data between modules that reside in different processes, the ability to run a program persistently (as can be done using an event loop in a serial program), and the ability to interactively modify the module parameters of an executing program. The latter, says Ahrens, while commonplace in a serial approach, is particularly difficult to achieve in a parallel program. For example, he says, "the user may want to interactively change an isosurface value. In a serial program, this is accomplished by invoking a change within a module. In a parallel program, modifying module parameters is more complex because modules reside in different processes." To achieve it, the system's process object provides a service through which each process registers methods that are assigned unique tags. When a program is in a persistent state, a remote process that holds these tags can invoke the respective methods.
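That registration service maps naturally onto remote method invocation (RMI) calls of the kind vtkMultiProcessController exposes, and the sketch below assumes that interface; the tag value and the isosurface handler are hypothetical.

    #include "vtkMultiProcessController.h"
    #include "vtkContourFilter.h"

    static const int SET_ISO_TAG = 300;   // hypothetical unique method tag

    // Handler registered on each satellite process; the new isovalue
    // arrives as the remote argument.
    void SetIsoValue(void* localArg, void* remoteArg,
                     int /*remoteArgLength*/, int /*remoteProcessId*/)
    {
      vtkContourFilter* contour = static_cast<vtkContourFilter*>(localArg);
      double value = *static_cast<double*>(remoteArg);
      contour->SetValue(0, value);   // takes effect on the next execution
    }

    // Satellites register the method under its tag, then sit in a
    // persistent loop servicing remote invocations.
    void RunSatellite(vtkMultiProcessController* controller,
                      vtkContourFilter* contour)
    {
      controller->AddRMI(SetIsoValue, contour, SET_ISO_TAG);
      controller->ProcessRMIs();     // returns when a break RMI arrives
    }

    // Any process that knows the tag can invoke the method remotely.
    void ChangeIsoValueEverywhere(vtkMultiProcessController* controller,
                                  double value)
    {
      for (int id = 1; id < controller->GetNumberOfProcesses(); ++id)
      {
        controller->TriggerRMI(id, &value, sizeof(double), SET_ISO_TAG);
      }
    }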

Also critical to the success of the parallel system is its visualization functionality. The researchers have developed a series of oceanic and galactic simulation visualizations, using task-, pipeline-, and data-parallel methods running on both PC and SGI Origin 2000 clusters, that demonstrate the ability to visualize complex phenomena using a range of flow techniques, particles, and isosurfaces. In addition, the system implements a number of parallel visualization algorithms for such tasks as cutting, clipping, probing, smoothing, thresholding, segmentation, and morphology.
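As one illustration, the clipped isosurfaces in the fluid-dynamics figure earlier correspond to a short pipeline like the one below, written here against VTK's standard serial classes (the function name and plane placement are illustrative, and reference counting is elided for brevity); in the parallel toolkit, each process would apply the same stages to its own piece of the data.

    #include "vtkDataSet.h"
    #include "vtkContourFilter.h"
    #include "vtkClipPolyData.h"
    #include "vtkPlane.h"
    #include "vtkPolyData.h"

    // Extract an isosurface, then clip it against a plane to expose
    // the structures behind it.
    vtkPolyData* ClippedIsosurface(vtkDataSet* dataset, double isoValue)
    {
      vtkContourFilter* contour = vtkContourFilter::New();
      contour->SetInput(dataset);
      contour->SetValue(0, isoValue);

      vtkPlane* plane = vtkPlane::New();   // half-space defining the cut
      plane->SetOrigin(0.0, 0.0, 0.0);
      plane->SetNormal(1.0, 0.0, 0.0);

      vtkClipPolyData* clipper = vtkClipPolyData::New();
      clipper->SetInput(contour->GetOutput());
      clipper->SetClipFunction(plane);     // keep the side the normal points to
      clipper->Update();
      return clipper->GetOutput();
    }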
A galactic dynamics simulation results in a hugely complex dataset, the visualization of which is best handled by distributing the computational load using the Parallel Visualization Toolkit. In this image, star positions are colored by energy, using a color map.




On the agenda for the parallel toolkit are a number of enhancements, including such efficiency-boosting techniques as load balancing and automatic program construction. More information can be found at www.acl.lanl.gov/viz/frameworks.html.

Diana Phillips Mahoney is chief technology editor of Computer Graphics World.