Power Management in Scientific Computing

If you have not yet experienced power and cooling problems, chances are you soon will

Power management is rapidly becoming a critical issue in scientific computing. From the desktop to the cluster, the computing power available to solve scientific problems will soon be limited primarily by the amount of power the system needs and the cost of that energy. Indeed, as energy costs continue to rise, the running costs of a system over its lifetime will come to contribute more to its total cost of ownership than the equipment's initial purchase price. This article investigates technology trends that have led to this power problem, explores the impact of power consumption on the scientific community and reviews some proposed solutions.

Figure 1: Advance boards are designed to accelerate floating point BLAS, LAPACK and FFT libraries, much as graphics cards accelerate the Direct3D and OpenGL graphics libraries.
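
As a rough illustration of the running-cost argument, the sketch below estimates the electricity bill for a system that runs continuously. Every number in it is an assumption chosen for illustration (a hypothetical 500 W node, a four-year life, a price of 0.10 per kWh and a 50 percent cooling overhead); none of them comes from measured data, and the function name is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def lifetime_energy_cost(power_watts, years, price_per_kwh, cooling_overhead=0.0):
    """Return the electricity cost of running a system continuously.

    cooling_overhead is the extra energy spent on cooling, expressed as a
    fraction of the system's own draw (0.5 means 50% extra).
    All inputs are illustrative assumptions, not measured values.
    """
    kwh = power_watts / 1000.0 * HOURS_PER_YEAR * years
    return kwh * (1.0 + cooling_overhead) * price_per_kwh


# Hypothetical node: 500 W, run flat out for 4 years at an assumed
# price of 0.10 per kWh, with an assumed 50% cooling overhead.
cost = lifetime_energy_cost(500, years=4, price_per_kwh=0.10, cooling_overhead=0.5)
print(f"Lifetime energy cost: {cost:.2f}")
```

Under these assumptions the energy bill comes to roughly 2,600 currency units per node. Whether that exceeds the purchase price depends entirely on the hardware and the local tariff, so the figures should be replaced with real ones before drawing any conclusion.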
Trends for computing in science
Scientists rely on all sorts of systems to compute the solutions to the problems they are investigating. From laptops through PCs and workstations, onto the now ubiquitous Linux cluster, and finishing at the supercomputer — today's scientist often has access to multiple compute resources.

One significant trend that has affected the performance available to scientists has been the rise of commodity processors and systems. These have been driven by the consumer space, enabled by the vendors' abilities to exploit improvements in silicon process technology to rapidly increase performance while reducing cost. The scientific community has benefited enormously from this trend, with available computing performance doubling approximately every 18 months to two years — a trend commonly referred to as Moore's Law.

However, another trend has accompanied the exponential performance growth afforded by Moore's Law — increasing power consumption. As processors and systems have become faster, they also have consumed more energy. Until recently, this has caused problems only at the highest end, with large clusters or supercomputers requiring megawatts of power. But certain thresholds have been crossed recently that are now causing problems for everyone.

PCs and workstations: For a PC or workstation to be able to live in a convenient place, such as under a desk, it must plug into a regular wall socket, be reasonably quiet and not emit too much waste heat. Modern PCs and workstations are starting to push these limits, as they require more powerful internal fans, and their CPUs and other components generate more waste heat. A fast system can consume anywhere from 250 to 500 watts, all in a small space.

Clusters: Modern clusters have delivered enormous leaps in performance-per-square-foot and performance-per-dollar over recent years, and the rapid adoption of clusters throughout academia and industry is a testament to their success. Even at the high end, clusters have come to dominate in a very short time. According to the Top500, Linux clusters accounted for 352 of the 500 fastest systems in the world in their June 2006 ranking. Five years earlier, there were just 28 clusters in the Top500 systems (

Clusters have been so successful and have become so affordable that users are now hitting the limits of their infrastructure more often than their dollar limit. Machine rooms are filling with racks of equipment. The power supply to the machine room is maxed out. The heat generated by the equipment has increased over time, requiring more sophisticated and expensive cooling solutions. As a result, clusters are now becoming victims of their own success. If you haven't hit these problems yet, chances are you soon will.

At some point, rather than compute performance, compute density becomes the critical factor. The compute density metric factors performance, power consumption and size into the equation — all three are critical for your future cluster requirements. If a new server node is twice as fast as one of your current cluster server nodes, but uses twice as much power, then there is no improvement in compute density. You would gain no more computing performance within a given power constraint once your limit has been reached.
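
The point about a fixed power limit can be checked with a few lines of arithmetic. The sketch below is a minimal model, not a standard definition of compute density: it considers power only (physical size is left out), and the node figures and the 30 kW budget are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    gflops: float  # sustained performance per node (hypothetical figure)
    watts: float   # power draw per node (hypothetical figure)

    @property
    def gflops_per_watt(self) -> float:
        return self.gflops / self.watts


def max_performance(node: Node, power_budget_watts: float) -> float:
    """Total performance from as many whole nodes as the power budget allows."""
    node_count = int(power_budget_watts // node.watts)
    return node_count * node.gflops


current = Node("current node", gflops=10.0, watts=300.0)
faster = Node("twice as fast, twice the power", gflops=20.0, watts=600.0)
budget = 30_000.0  # watts available to the machine room (assumed)

for node in (current, faster):
    print(node.name, node.gflops_per_watt, max_performance(node, budget))
```

Both node types deliver the same total performance within the budget, because their performance per watt is identical. The faster node only halves the number of boxes needed to reach the limit.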

Supercomputers: Supercomputers have always been the scientific computing community's heavyweight number crunchers and have always pushed the infrastructure envelope. Often covering areas several times the size of a football field and consuming megawatts of power, supercomputers have often had to employ exotic power management techniques, such as water cooling. However, the power management issue is now prevalent across all computing platforms, not just supercomputers:
• Some high-end CPUs in PCs now are using heat pipes to keep them cool.
• One recent trend in clusters is to use water-cooled doors on each cabinet to more efficiently remove the waste heat from the data center.
• Even the supercomputer industry has had to adapt. IBM's BlueGene/L system had power consumption as a primary design goal. The solution was to design a system with many more processors than ever before, each one of which ran at a modest clock speed, and so used significantly lower power. Overall, this yielded a lower power solution, with corresponding gains in performance per watt for the system.
Fortunately, the industry is not standing still on the power consumption problem. Indeed, vendors now recognize that the key metric for the future is "performance per watt." Technologies are being adopted at all levels to address the fundamental issue of power consumption, and its limiting effect on increasing compute density.

Processor level: An important trend in microprocessors is the adoption of multiple cores within a single chip. Previously the preserve of the high end, dual-core CPUs are now starting to appear even in laptops. Processor vendors are moving to multi-core solutions largely because of the power consumption issue: raising the clock speed of a single core tends to make a processor use too much power to cool practically. A multi-core CPU can provide the same level of performance as a chip with a faster single core, but can do it at lower power. The important implication is that the application needs to have a sufficient level of parallelism to exploit multiple cores. This is usually not a problem, as core-level parallelism is essentially what is being exploited on clusters.
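
Why several slower cores can match one fast core at lower power can be seen from a simple, idealized model. The sketch below uses the usual CMOS approximation that dynamic power is proportional to C * V^2 * f, and additionally assumes that supply voltage can be scaled in proportion to clock frequency, so per-core power goes as the cube of the clock. That voltage assumption is optimistic, leakage power is ignored, and the function names are invented for this example. Throughput is estimated with Amdahl's law.

```python
def relative_dynamic_power(cores, freq_scale):
    """Dynamic power relative to one core running at full clock.

    Assumes P ~ C * V^2 * f with voltage scaled in proportion to
    frequency, so each core draws freq_scale ** 3 of the baseline.
    Leakage (static) power is ignored in this idealized model.
    """
    return cores * freq_scale ** 3


def relative_throughput(cores, freq_scale, parallel_fraction=1.0):
    """Throughput relative to one core at full clock, using Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    speedup = 1.0 / (serial_fraction + parallel_fraction / cores)
    return freq_scale * speedup


# Two cores at half the clock, running a perfectly parallel workload.
print(relative_throughput(2, 0.5), relative_dynamic_power(2, 0.5))

# The same chip running a workload that is only 50% parallel.
print(relative_throughput(2, 0.5, parallel_fraction=0.5))
```

In this model two half-speed cores match the single fast core on a fully parallel workload while drawing a quarter of the dynamic power. When only half the work can run in parallel, the dual-core chip falls to about two thirds of the single core's throughput, which is why sufficient application parallelism matters.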

Modern CPUs are adopting a whole slew of different techniques to control their power consumption. Often individual components within a CPU can be powered independently of one another, allowing parts of the CPU to be powered down while others are still active. Other techniques allow some CPUs to dynamically adjust their clock speed and even the voltage of their power supplies, according to their dynamic workloads, and so consume less power when not at full load.

System level: If power consumption is becoming the limiting factor for compute density, then what is being done about this at the system level? Servers are becoming more dense, with two or even four multi-core CPUs being packed into a single 1U server just 1.75 inches high. Blades are pushing server density even higher, optimizing away components in order to pack more CPUs into less space. More focus is being placed on the thermal properties of the system, from the air flow inside the server, to the air conditioning of the machine room, even to the water-cooled rack doors mentioned earlier. System management software may even control "hot spots," areas in the server where high levels of activity are causing increased thermal output. The management software could choose to move the processes causing the hot spot to processors farther apart in order to prevent overloading the cooling system in that area.
Future trends
Multi-core CPUs will play a major and long-term part in increasing compute density. These will cause the levels of parallelism exposed to the application to rise dramatically. Processors with tens or even hundreds of cores are possible. However, software must be able to exploit this parallelism.

Application accelerators are another important technology trend. These are devices designed and optimized for specific tasks, and so can perform those tasks at much higher speeds than a general-purpose CPU. Perhaps more importantly, accelerators are also much more power-efficient than a CPU for their specialized tasks. This last feature is what is likely to cause accelerators to become an important part of future systems.

Figure 2: A water-cooled system. Water cooling is one example of the exotic power management techniques that vendors have often had to employ.

Some accelerators are already ubiquitous today. The modern graphics card is one example; others include networking cards and encryption engines. Cray and SGI are exploring systems that can take advantage of field programmable gate arrays (FPGAs). These can be used to accelerate some compute-intensive kernels, particularly integer-based codes. However, other classes of accelerator are arriving that are directly aimed at the type of double precision floating point calculations found in scientific applications. One new class is targeting the standard math libraries that underpin many of these applications. For example, ClearSpeed's energy-efficient Advance board is designed to accelerate floating point BLAS, LAPACK, and FFT libraries, much as graphics cards accelerate the Direct3D and OpenGL graphics libraries.
Power consumption will be a primary factor in determining the performance of scientific computing solutions in the future. All systems will be affected by this issue: the desktop, the cluster and the supercomputer. Techniques to manage this issue are being adopted throughout the industry, from microprocessor design to system architecture, and even at the software level. As energy prices continue to rise, the running costs of a system may become even more important than its initial purchase price. Also, as installations grow to their maximum size, compute density will become more important than outright performance. This challenge is giving rise to new solutions, including accelerators that can significantly increase a system's performance per watt. One thing is certain: the need for ever more computing performance is not going away. The computer industry has excelled at overcoming difficult challenges in the past. Increasing performance at historical rates while tackling power consumption will be the most difficult challenge it has faced yet.

Simon McIntosh-Smith is director of architecture and applications at ClearSpeed. He may be contacted at