Diana Phillips Mahoney
The rapid, substantial advances in computational and graphics performance over the past several years have allowed researchers in countless application areas to gain unique perspectives on their numerical and statistical data through the use of specialized information-visualization techniques.
Recently, the driving force behind these techniques, high-performance computing, has begun to reap the technology's benefits as well. Numerous researchers are exploring the use of visualization capabilities to manage and manipulate the massive datasets representative of various computer systems' operations. An innovative effort in this regard is a powerful, general-purpose computer-systems visualization and analysis framework under development at Stanford University.
Called Rivet, the unique visualization environment was born out of a collaboration between the university's computer graphics and computer systems groups, whose driving objective has been to develop tools for understanding complex computer systems while also furthering the state of the art in information visualization. Succeeding on both fronts, Rivet enables the rapid development of interactive visualizations for studying a range of data-intensive computer systems components, including operating systems, processor and memory systems, compilers, multiprocessing architectures, and network technologies.
A Rivet visualization analyzing mobile-network usage in the San Francisco Bay area relies on visual metaphors to show both user-mobility patterns and usage patterns over time (inset). The mobility view uses four scatter plots of the same dataset, each with ...
The need for such technology is becoming particularly acute as today's increasingly complex computer systems outgrow traditional data-analysis methods. "Computer-systems data has typically either been analyzed using statistics, which can obscure interesting behavior by aggregating large amounts of data into a single measure, or by manually searching through large text files, which can easily leave analysts lost in the details," says Rivet principal researcher Robert Bosch.
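Bosch's point about aggregation can be made concrete with a small sketch. The utilization figures below are invented for illustration and are not Rivet data: two hypothetical processors report the same mean utilization even though one runs steadily and the other alternates between idle and saturated.

```python
from statistics import mean

# Hypothetical per-time-step utilization samples (percent), invented for illustration.
steady = [50, 50, 50, 50, 50, 50, 50, 50]
bursty = [0, 100, 0, 100, 0, 100, 0, 100]


def summarize(samples):
    """Collapse a whole series into a single aggregate measure."""
    return mean(samples)


def saturated_steps(samples, threshold=90):
    """Return the time steps at which utilization meets or exceeds the threshold."""
    return [step for step, value in enumerate(samples) if value >= threshold]


# The single measure cannot tell the two processors apart...
print(summarize(steady), summarize(bursty))
# ...while the per-step detail shows that only one of them ever saturates.
print(saturated_steps(steady), saturated_steps(bursty))
```

The per-step detail is the kind of behavior that a plotted timeline would make visible at a glance and that a lone summary statistic hides.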
In contrast, by combining elements of scientific visualization, human-computer interaction, data mining, imaging, and graphics, information visualization is able to transform abstract, voluminous data that often has no obvious physical representation into understandable pictures. "Visualization exploits the high perceptual bandwidth and pattern-recognition capabilities of the human visual system, enabling analysts to explore large data sets and find the information that is of particular interest to them," Bosch notes.
The computer-systems challenge, however, can bring existing information-visualization techniques to their knees because, says Bosch, many of these tools have been developed to handle very specific problems and are not easily adaptable to broader applications. Rivet is intended to serve as a single, cohesive, general-purpose computer-systems visualization environment. Bosch and colleagues Chris Stolte and Diane Tang, under the direction of professors Pat Hanrahan, Mendel Rosenblum, and Mary Baker, are developing the system in the context of a range of diverse real-world problems, such as the study of application behavior on superscalar processors and the development of a performance analysis system for evaluating parallel applications.
Rivet is built on the premise that a single data set and visualization is often not sufficient for answering some of the complex questions that computer-systems analyses invite. Instead, says Bosch, "a given data visualization often suggests a new set of data to be collected and incorporated into the visualization." Thus, he says, "one of our goals for Rivet was to support an iterative and integrated analysis and visualization process. Consequently, we have focused on supporting rapid prototyping and incremental development of visualizations."
To facilitate this approach, the researchers employ a "compositional" or modular architecture made up of individual visualization building blocks and interfaces that users can define and assemble in whichever way best meets the needs of their specific applications. Because in most cases it's impossible to visually represent both the huge number of entities involved in a specific analysis and the events to which those entities are subjected (hundreds of processors engaged in millions or billions of transactions, for example), Rivet relies on a system of visual "metaphors" to capture the essence of the data.
In the San Francisco Bay mobile-network visualization, a different visual metaphor provides a window into overall network statistics. The inset detail focuses on a particular region of interest. Using the control panel, a user can dynamically select the ...
At Rivet's core are data elements called tuples, which are unordered collections of data attributes conceptually similar to a row of data in a spreadsheet table. Each tuple contains information about the entity being analyzed. If it's a processor, for example, the tuple might contain such information as the nature of the transaction it's engaged in and the degree to which it's being utilized at a given time step. Tuples with a common format are grouped into tables, accompanied by metadata describing the tuple contents. This organization allows the same dataset to be visualized in many different ways. For example, an analyst might want to evaluate a processor's activity level at a given point in time or its performance relative to other processors.
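The tuple-and-table organization described above can be sketched roughly as follows. This is a minimal illustration, not Rivet's actual interface (its building blocks are written in C++); the `Table` class, its methods, the attribute names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Tuples that share a common format, plus metadata describing their contents."""
    metadata: dict  # attribute name -> human-readable description (hypothetical schema)
    tuples: list = field(default_factory=list)

    def add(self, **attributes):
        # A tuple is an unordered collection of named attributes, so a dict models it.
        if set(attributes) != set(self.metadata):
            raise ValueError("tuple does not match this table's format")
        self.tuples.append(attributes)


def activity_over_time(table, processor):
    """View 1: one processor's utilization at each time step."""
    return [(t["time"], t["utilization"]) for t in table.tuples
            if t["processor"] == processor]


def compare_at(table, time):
    """View 2: every processor's utilization at a single time step."""
    return {t["processor"]: t["utilization"] for t in table.tuples
            if t["time"] == time}


# Invented sample data: two processors observed over two time steps.
cpu = Table(metadata={
    "processor": "processor id",
    "time": "time step",
    "transaction": "kind of transaction in progress",
    "utilization": "fraction of the time step spent busy (0 to 1)",
})
cpu.add(processor=0, time=0, transaction="load", utilization=0.25)
cpu.add(processor=0, time=1, transaction="store", utilization=0.75)
cpu.add(processor=1, time=0, transaction="load", utilization=1.0)
cpu.add(processor=1, time=1, transaction="idle", utilization=0.0)

print(activity_over_time(cpu, processor=0))
print(compare_at(cpu, time=0))
```

Both views read the same table; only the question asked of it changes, which is the property the article attributes to this organization.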
Once organized into tables, the data can be passed through a transformation network to perform such standard relational-database operations as sorting, filtering, aggregation, grouping, and merging. Rivet also lets users incorporate their own operations, such as clustering and data-mining algorithms. The transformed data is then mapped to a visual representation using graphical metaphors that depict the data tables and the individual tuples using primitive shapes.
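As a rough illustration of this flow, the sketch below chains a filter, a grouped aggregation, and a sort, then maps each resulting tuple to a rectangle. It assumes a simple linear chain of operations rather than a general network, and every name in it (`compose`, `filter_rows`, `aggregate_mean`, `sort_by`, `Rect`, `to_bars`) is hypothetical rather than part of Rivet.

```python
from collections import defaultdict
from dataclasses import dataclass

# Invented sample tuples: utilization of three processors at two time steps.
rows = [
    {"processor": 0, "time": 0, "utilization": 0.25},
    {"processor": 0, "time": 1, "utilization": 0.75},
    {"processor": 1, "time": 0, "utilization": 1.0},
    {"processor": 1, "time": 1, "utilization": 0.5},
    {"processor": 2, "time": 0, "utilization": 0.0},
    {"processor": 2, "time": 1, "utilization": 0.0},
]


def filter_rows(predicate):
    """Build an operation that keeps only the tuples satisfying the predicate."""
    return lambda table: [r for r in table if predicate(r)]


def aggregate_mean(key, value):
    """Build an operation that groups tuples by `key` and averages `value`."""
    def operation(table):
        groups = defaultdict(list)
        for r in table:
            groups[r[key]].append(r[value])
        return [{key: k, value: sum(v) / len(v)} for k, v in groups.items()]
    return operation


def sort_by(key, reverse=False):
    """Build an operation that orders tuples by one attribute."""
    return lambda table: sorted(table, key=lambda r: r[key], reverse=reverse)


def compose(*operations):
    """Chain operations so that each one's output table feeds the next."""
    def run(table):
        for operation in operations:
            table = operation(table)
        return table
    return run


@dataclass
class Rect:
    """A primitive shape standing in for one tuple."""
    x: int
    y: int
    width: int
    height: int
    luminance: float


def to_bars(table, value, bar_width=10, max_height=100):
    """Map each tuple to a bar whose height and luminance encode `value`."""
    return [Rect(x=i * bar_width, y=0, width=bar_width,
                 height=round(r[value] * max_height), luminance=r[value])
            for i, r in enumerate(table)]


network = compose(
    filter_rows(lambda r: r["utilization"] > 0),   # drop idle samples
    aggregate_mean("processor", "utilization"),    # one tuple per processor
    sort_by("utilization", reverse=True),          # busiest processor first
)
result = network(rows)
bars = to_bars(result, "utilization")
for bar in bars:
    print(bar)
```

A user-supplied operation such as a clustering step would slot into `compose` the same way, as any function that takes a table and returns a table.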
The metaphors rely on numerous visualization attributes, ranging from simple luminance variations to full-color animations, and support multiple levels of detail and interactivity to optimize the data representation. The animation capabilities are particularly useful, notes Bosch, because they provide "a relatively natural means of representing the evolution of systems over time." For example, in the visualization environment designed to study application behavior on superscalar processors, a "pipeline" view animates instructions as they traverse the stages of pipeline utilization. Additionally, says Bosch, "during interactive navigation of the data, animation helps preserve the user's context and prevent disorientation." In the superscalar application, the animated pipeline behavior is correlated to specific regions of interest identified in the timeline view, which displays pipeline utilization and occupancy statistics for the entire period of study.
The key to Rivet's success is its ability to provide the flexibility necessary to enable rapid prototyping of visualizations for exploring complex, real-world problems without sacrificing high-performance graphics and support for large datasets. This is achieved through its reliance on both compiled and interpreted programming languages. "All of our basic building blocks are written in C++ and OpenGL. The interfaces to these objects are exported to Tcl or Perl [standard scripting languages], enabling visualizations to be rapidly assembled and modified using scripts," says Bosch. "Once the visualizations are assembled, the interpreter is out of the main loop, providing a good mix of performance and flexibility."
One of the ongoing challenges the researchers face is finding the optimal decomposition of visualizations into simple building blocks. "Our collection of visualization building blocks has evolved as we have had more experience with the system," says Bosch. "Early on, we discovered that it was critical that the visual components be distinct from the data components. Recently, we have further decomposed the visual and data components to an even finer level to provide more flexibility in how we compose and create visualizations."
The Stanford researchers consider Rivet a work in progress. Upcoming research efforts include expanding the environment's data-management capabilities. "We want to support even larger datasets," says Bosch. "We can currently handle hundreds of megabytes of data, but many interesting datasets are much larger."
Much of the Rivet researchers' attention of late has been focused on validating the system through its use in individual case studies, such as the superscalar processor study. The system has also been successfully applied in studies of parallel applications, memory systems, and wireless networks. In addition, the researchers are using Rivet to develop two more general visualization frameworks. One, called Polaris, is an environment for the visualization of high-dimensional relational data. The second, called DataGrove, is an interface for hierarchically structured data.
The application opportunities for Rivet are vast. And as long as computer systems continue to grow in complexity, so will the application space of Rivet.
Diana Phillips Mahoney is chief technology editor of Computer Graphics World.