As GPUs begin to rival CPUs, they are assuming a much larger role in the compute process.
Graphics processors have come a long way during the past few decades. Over the years, these cards have added greater and greater processing power to handle ever more sophisticated graphics. It has now gotten to the point where the graphics card can actually possess more computing power than the system’s CPU. This increase in power is causing many people to rethink how the graphics card is actually used. The graphics processor has now become a co-processor in the system, allowing for a much wider range of applications.
At first, graphics cards were nothing more than display adapters, leaving all the number crunching to the CPU. As processing power was added to these cards and the clock speeds of chips became faster, so, too, did the speed of the workstation’s graphics. There came a time, however, when faster clock speeds simply weren’t enough, and companies started adding more processors to distribute the load. Graphics are particularly well suited for parallel processing, since different parts of a display can be handled with discrete processors.
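To make the idea of handling different parts of a display separately more concrete, here is a minimal Python sketch. The display size, the shade function, and the strip layout are all assumptions made for illustration, not anything a particular card does. Python threads stand in for the separate processors, so the sketch shows how the work divides rather than any real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 64, 48  # hypothetical display size


def shade(x, y):
    """Hypothetical per-pixel function: a diagonal gradient from 0 to 255."""
    return (x + y) * 255 // (WIDTH + HEIGHT - 2)


def render_rows(row_range):
    """Render one horizontal strip; it needs no data from any other strip."""
    start, stop = row_range
    return [[shade(x, y) for x in range(WIDTH)] for y in range(start, stop)]


def render_parallel(workers=4):
    """Split the display into strips and hand each one to a separate worker."""
    step = -(-HEIGHT // workers)  # ceiling division
    strips = [(y, min(y + step, HEIGHT)) for y in range(0, HEIGHT, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the strips in submission order, top to bottom.
        rendered = list(pool.map(render_rows, strips))
    return [row for strip in rendered for row in strip]


def render_serial():
    """Reference version: one worker renders the whole display."""
    return render_rows((0, HEIGHT))
```

Because no strip depends on another, the parallel and serial versions produce the same image regardless of how many workers share the job.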
Parallel processing has been so successful that more recent graphics cards can now sport GPUs with hundreds of processors each. While these processors are primarily tuned for graphics applications, they are still processors, and like every processor, they can crunch numbers. In fact, the processors on graphics cards are especially good at floating-point math, which makes them very suitable for tasks like simulation and other math-intensive applications.
Graphics processors have come a long way over the past few years, taking on more processing tasks.
More Power, More Work
The first to use this processing power for work other than graphics were game developers. They moved complex calculations, such as particle systems and physics, to the graphics card, which allowed for faster games with richer content. The fact that this processing happened on the graphics card saved the CPU from getting bogged down and also saved bandwidth on the graphics interface.
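A particle system suits this kind of offloading because each particle can be advanced without reference to any other. The Python sketch below is only an illustration of that property; the gravity constant, the bounce rule, and the function names are assumptions, and a real game would run the per-particle step on the graphics card rather than in a Python loop.

```python
GRAVITY = -9.8   # assumed acceleration along y, in units per second squared
DAMPING = 0.5    # assumed fraction of speed kept after hitting the floor


def step_particle(particle, dt):
    """Advance one particle (x, y, vx, vy) by dt seconds.

    Uses a semi-implicit Euler step: velocity is updated first, then position.
    """
    x, y, vx, vy = particle
    vy += GRAVITY * dt
    x += vx * dt
    y += vy * dt
    if y < 0.0:  # hypothetical floor at y = 0: reflect and lose some speed
        y = -y
        vy = -vy * DAMPING
    return (x, y, vx, vy)


def step_all(particles, dt):
    """Each particle is independent, so this loop could be spread across
    as many processors as are available."""
    return [step_particle(p, dt) for p in particles]
```

Since the result for one particle never depends on another, the order in which particles are processed does not matter, which is the property that lets hundreds of processors work on them at once.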
Soon, a number of industries were using the power of graphics cards to augment their applications, and graphics card vendors were quick to offer tools to help these developers tap into the power of their cards. These days, graphics cards can be used as parallel computing platforms, performing such tasks as physical simulation, finance, fluid dynamics, and particle-system generation. The line where the computer stops and the graphics card begins has become very blurred, indeed. In fact, a whole new term has been coined for this: general-purpose graphics processing units, or GPGPUs.
Nvidia has jumped headfirst into the GPGPU pool, and actually sells a general-purpose supercomputer called Tesla. This computer is essentially one of Nvidia’s top-of-the-line graphics cards stripped of the display connectors and configured as either an upgrade card or an external box. The high-end version, dubbed the S1070, is a rack-mount system that offers its users four teraflops of computing power—several hundred times that of a standard PC. Many of these can be connected together to create a very powerful supercomputer.
In fact, just a few years ago, Nvidia claimed that four of its S1070s connected together would have enough computing power to rank among the top 100 supercomputers. It is hard to say how such a setup would stack up today, but it still represents an enormous amount of computing power for the desktop.
With graphics cards encroaching on territory formerly the domain of systems vendors, there is also movement in the other direction. Intel, famously known for its x86 CPU platform, which drives the vast majority of desktops, is no stranger to graphics. In the consumer space, Intel does very well with its integrated graphics chips. These chips are great for the average home user, but those who need higher-end graphics, such as gamers and content creators, will usually add a third-party card from Nvidia or ATI to boost graphics performance.
A New Game?
Developers in markets such as medicine, bioscience, finance, and oil and gas can build programs in standard C that run on Nvidia’s graphics processors via the company’s CUDA development environment.
Intel is trying to take on the major graphics players not by competing directly with them, but by subtly changing the rules of the game. That change is expected to arrive with Intel’s new Larrabee chipset, which is scheduled to ship at the end of this year. The chipset is designed primarily as a high-end graphics card, but with a twist. The Larrabee chip takes a completely different approach from traditional graphics cards: it doesn’t use a dedicated graphics processor, but instead uses a massively parallel version of Intel’s x86 chip geared toward graphics.
Larrabee will include very little specialized graphics hardware, instead performing tasks like z-buffering, clipping, and blending in software—using a tile-based rendering approach. This simple method could change the rules as to how high-end graphics are developed, because such processes as rendering will be left to software rather than the hardware traditionally used on graphics cards. This means developers would be free to customize the way an application renders its images, allowing for a much wider palette in the way the graphics look.
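As a rough illustration of what doing this work in software can look like, the Python sketch below bins shapes into tiles and then renders each tile with its own small depth buffer. It is not Intel’s implementation. The tile size, the frame size, and the use of flat, axis-aligned rectangles with a single depth value are all assumptions chosen to keep the example short, and blending is left out.

```python
TILE = 8                 # assumed tile edge, in pixels
WIDTH, HEIGHT = 32, 16   # assumed frame size, a multiple of TILE
FAR = float("inf")       # depth of an empty pixel

# A rectangle is (x0, y0, x1, y1, depth, color); x1 and y1 are exclusive,
# and the rectangle is assumed to lie inside the frame.


def bin_rects(rects):
    """Binning pass: list, for each tile, the rectangles that overlap it."""
    bins = {}
    for rect in rects:
        x0, y0, x1, y1, _depth, _color = rect
        for ty in range(y0 // TILE, (y1 - 1) // TILE + 1):
            for tx in range(x0 // TILE, (x1 - 1) // TILE + 1):
                bins.setdefault((tx, ty), []).append(rect)
    return bins


def render_tile(tx, ty, rects):
    """Render one tile using small depth and color buffers of its own."""
    depth_buf = [[FAR] * TILE for _ in range(TILE)]
    color_buf = [[0] * TILE for _ in range(TILE)]
    left, top = tx * TILE, ty * TILE
    for x0, y0, x1, y1, depth, color in rects:
        # Clipping: visit only the part of the rectangle inside this tile.
        for y in range(max(y0, top), min(y1, top + TILE)):
            for x in range(max(x0, left), min(x1, left + TILE)):
                # Z-buffering: keep the pixel only if it is nearer.
                if depth < depth_buf[y - top][x - left]:
                    depth_buf[y - top][x - left] = depth
                    color_buf[y - top][x - left] = color
    return color_buf


def render(rects):
    """Render every non-empty tile and copy it into the full frame."""
    frame = [[0] * WIDTH for _ in range(HEIGHT)]
    for (tx, ty), tile_rects in bin_rects(rects).items():
        tile = render_tile(tx, ty, tile_rects)
        for ly, row in enumerate(tile):
            frame[ty * TILE + ly][tx * TILE:(tx + 1) * TILE] = row
    return frame
```

Because the depth test lives in ordinary code, a developer could swap in a different rule, such as a blend or a custom visibility test, by changing a single comparison. That is the kind of flexibility a software pipeline offers.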
Having x86-based processors on the graphics card also hits directly at the desktop supercomputer paradigm put forth by Nvidia. The problem with a dedicated graphics processor is that it requires a different set of programming tools. Learning the skills to master these tools takes an extra degree of time and effort, something that can be in short supply within a software development environment.
By using the very common x86 instruction set, Intel opens up sophisticated graphics to far more developers. It also could open up the graphics chip as a direct extension of the system’s CPU. Since they both speak the same language, the same code could conceivably be shared between the CPU and the GPU, thus speeding up just about any application in the computer. Of course, all of this will depend on the actual chips that Intel produces. The performance of a Larrabee chip has yet to be formally measured, so it remains to be seen how well it will stack up against next year’s top-of-the-line chips from AMD and Nvidia.
To open up its graphics chips to a wider range of developers, Nvidia launched the CUDA development environment. This allows developers to build programs that run on Nvidia’s graphics processors using standard C-language tools instead of a more esoteric graphics language. While this can address some developer concerns, it still might not offer the degree of integration that Larrabee would provide.
Not to be left out of this burgeoning battle is AMD, the other major x86 chip vendor. The company’s recent acquisition of ATI was part of its plan to integrate the CPU with graphics on the same chip. This could boost throughput significantly, since it eliminates one of the biggest bottlenecks in a graphics computer: the graphics interface bus. As graphics get faster and faster, the standard bus has to be continually updated.
From the original ISA bus to PCI, AGP, and now PCI Express, every few years the graphics bus changes. An integrated chip would sidestep this issue nicely. Integrated chips also will reduce power requirements, making such a chip ideal for the growing market focused on mobile devices, such as netbooks and tablets.
At this point, however, AMD has not shipped any integrated chips. It is, though, integrating its graphics chips more tightly with its CPUs on the motherboard, allowing for increased performance. For developers, AMD offers its Stream computing platform, which provides a development environment that is friendly to the average programmer.
ArcSoft TotalMedia Theatre, a consumer video app, gets a quality boost (right) when accelerated with AMD’s ATI Stream.
The CPU/GPU Marriage
All these developments represent a convergence of the GPU and CPU that could easily change the way graphics are managed within a computer system. Right now, it seems as though the two CPU manufacturers are lining up their graphics solutions, which could leave Nvidia as the only high-end graphics vendor without an x86 chip of its own.
This may suit Nvidia just fine, as the company seems to be publicly pushing the GPGPU concept as a replacement for the CPU. In response to an ongoing court battle with Intel over a 2004 cross-licensing deal, Nvidia’s CEO Jen-Hsun Huang stated, “At the heart of this issue is that the CPU has run its course and the soul of the PC is shifting quickly to the GPU. This is clearly an attempt to stifle innovation to protect a decaying CPU business.”
While this comment carries a big dose of CEO hubris, there is still some technology to back it up. One of the most important pieces is software-based. Last year, Apple, Nvidia, and AMD helped create a new standard called Open Computing Language (OpenCL), which is a framework for writing programs that execute across a wide array of processing devices, including GPUs. OpenCL offers parallel programming tools and could become a very common standard like its cousin, OpenGL.
An open programming standard is nice, but it does not entirely replace the standard x86 CPU in a computer system that most likely runs an operating system written for the x86. This brings up the ongoing rumors that Nvidia also may be looking to enter the x86 market. The company has already dipped its toe into the integrated CPU/GPU chip market with the Tegra chips for mobile devices, which integrate Nvidia graphics with an ARM processor. Integrating an x86 CPU may be the next step, most likely for the low-power netbook market. Where Nvidia gets that technology, however, is up for debate.
This leaves the graphics industry at a turning point. There will be increasing pressure to integrate the CPU with the GPU, as well as to open up the processing power of graphics cards to more general applications. How this will all play out is open to speculation, but it will most certainly result not only in faster graphics, but also in better applications and software that can tap into the power contained in these chips.
George Maestri is a contributing editor for Computer Graphics World and president/CEO of RubberBug animation studio. He can be reached at firstname.lastname@example.org.