Scientific Computing

HPC Balance and Common Sense

Maintain ratios that work and improve on those that don’t

Rob Farber

With the advent of new processors and technologies, the current state of flux in the HPC community is a boon for scientists. Ever more capable supercomputers are being procured, and the performance increases promised by multi-core processors, increasing memory bandwidths, new communications fabrics and exotic computer architectures can significantly reduce the time-to-solution and enable new science. The trade-off is that allocating time and money to acquire and move to a new hardware platform can be a substantial commitment, and one that can have painful consequences if the wrong platform is selected. So, if the proof of the pudding is in the tasting, how can we tell which new technologies can digest our computational workloads without leaving a sour taste in our mouths?

Figure 1: Kiviat diagram showing the distortion characteristic of out-of-balance systems

A common sense approach is to keep what works and improve on what doesn’t. In other words, measure the performance characteristics of your current system(s) and keep those characteristics that support your workloads and improve on any that might limit performance.

These measurements, in essence, define a system balance that is quantifiable and, with the right choice of benchmarks, provides some assurance that an existing computational workload will run well on a new computer. This concept is not new and has been used in the HPC world for quite a while. The HPC Challenge Web site (icl.cs.utk.edu/hpcc) generates Kiviat diagrams (similar to the radar plots in Excel) to compare systems based on its standard set of benchmarks. A well-balanced system looks symmetrical on these plots because it performs well on all tests. High-performance balanced systems stand out visually because they occupy the outermost rings. Out-of-balance systems are distorted, as can be seen in Figure 1.
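To make the idea of a symmetrical plot concrete, here is a minimal Python sketch of how raw benchmark scores could be turned into the vertices of a Kiviat polygon, plus a crude imbalance number. The function names, the axis names and the example values are all hypothetical, and the sketch assumes every axis is "higher is better" and is normalized against a reference system. It is not a description of how the HPC Challenge site builds its diagrams.

```python
import math

def kiviat_vertices(scores, reference):
    """Return (x, y) polygon vertices for a Kiviat (radar) plot.

    scores and reference map the same axis names to raw values.
    Each axis is normalized as score / reference (assumption:
    higher is better on every axis).
    """
    names = sorted(scores)
    n = len(names)
    vertices = []
    for i, name in enumerate(names):
        radius = scores[name] / reference[name]
        angle = 2 * math.pi * i / n  # axes spaced evenly around the circle
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices

def imbalance(scores, reference):
    """Best normalized axis divided by the worst; 1.0 means a symmetric plot."""
    normalized = [scores[name] / reference[name] for name in scores]
    return max(normalized) / min(normalized)

# Hypothetical system, already expressed relative to a reference of 1.0 per axis.
reference = {"flops": 1.0, "memory_bandwidth": 1.0, "network": 1.0}
lopsided = {"flops": 1.0, "memory_bandwidth": 0.25, "network": 0.5}

print(kiviat_vertices(lopsided, reference))
print(imbalance(lopsided, reference))
```

A large imbalance value corresponds to the distorted polygon of an out-of-balance system; a value near 1.0 corresponds to a symmetrical one.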

So, the question then becomes what characteristics to measure, and how good they are at estimating performance on a new system. The answer, of course, depends on the workload your research places on the hardware and how it stresses the system.

Memory bandwidth is one of my favorite metrics to consider when evaluating the current crop of new multi-core processors. Why? Multi-core processors cannot provide an increase in floating point performance when the cores are stalled waiting for data due to insufficient bandwidth to memory. I will agree that wonderful performance numbers can result when everything fits into the on-chip caches. However, in reality, most applications require more than a few megabytes of memory. Without sufficient memory bandwidth, that spiffy new quad-core processor could take 4x longer to run your job, which can be very disappointing!
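One common way to reason about this limit, which is my choice of model rather than something taken from a specific benchmark, is a roofline-style bound: the sustained floating point rate can be no higher than the peak rate, and no higher than the memory bandwidth multiplied by the number of floating point operations the code performs per byte moved. The Python sketch below uses made-up hardware numbers and a hypothetical helper name purely for illustration.

```python
def attainable_flops(peak_flops, mem_bandwidth, flops_per_byte):
    """Roofline-style upper bound on the sustained floating point rate.

    peak_flops:     peak floating point rate (flop/s)
    mem_bandwidth:  sustained memory bandwidth (bytes/s)
    flops_per_byte: floating point operations per byte moved to or
                    from memory (a property of the application)
    """
    return min(peak_flops, mem_bandwidth * flops_per_byte)

# Hypothetical numbers: a quad-core part with 4x the peak rate of a
# single-core part, but the same 10 GB/s of memory bandwidth.
single_core = attainable_flops(10e9, 10e9, 0.25)
quad_core = attainable_flops(40e9, 10e9, 0.25)

print(single_core, quad_core)
```

Under these assumptions both calls return the same bound, because the workload is limited by bandwidth in each case; the extra cores only help once the data fits in cache or the code does more arithmetic per byte.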

So, the relationship between floating point performance and memory bandwidth is an important aspect of a balanced system and should be one axis of our Kiviat diagram. We can then compare different systems by examining the ratio of floating point rate (flop/second) versus the memory bandwidth (bytes/second). The January 2003 Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure (www.nsf.gov/cise/sci/reports/atkins.pdf), known as the Atkins report, specifies some desirable metrics that are tied to floating point performance as listed below.
• memory capacity of at least 1 Byte per FLOP/s
• memory bandwidth (Bytes/s per FLOP/s) greater than or equal to 1
• internal network aggregate link bandwidth (Bytes/s per FLOP/s) greater than or equal to 0.2
• internal network bi-section bandwidth (Bytes/s per FLOP/s) greater than or equal to 0.1
• system sustained productive disk I/O bandwidth (Bytes/s per FLOP/s) greater than or equal to 0.001
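The ratios above are easy to check mechanically. The following Python sketch divides each capability of a system by its floating point rate and reports any ratio that falls below the listed minimum. The threshold values come from the list above; the dictionary keys, function names and the example system are hypothetical and only serve the illustration.

```python
# Minimum ratios from the list above, per FLOP/s of floating point rate.
ATKINS_MINIMUMS = {
    "memory_bytes": 1.0,          # Bytes per FLOP/s
    "memory_bandwidth": 1.0,      # Bytes/s per FLOP/s
    "link_bandwidth": 0.2,        # Bytes/s per FLOP/s
    "bisection_bandwidth": 0.1,   # Bytes/s per FLOP/s
    "disk_bandwidth": 0.001,      # Bytes/s per FLOP/s
}

def balance_ratios(flops, system):
    """Divide each system capability by the floating point rate."""
    return {name: system[name] / flops for name in ATKINS_MINIMUMS}

def shortfalls(flops, system):
    """Return only the ratios that fall below the listed minimum."""
    ratios = balance_ratios(flops, system)
    return {name: value for name, value in ratios.items()
            if value < ATKINS_MINIMUMS[name]}

# Hypothetical 1 TFLOP/s system (all values invented for the example).
example_flops = 1e12
example_system = {
    "memory_bytes": 2e12,
    "memory_bandwidth": 0.5e12,
    "link_bandwidth": 0.25e12,
    "bisection_bandwidth": 0.15e12,
    "disk_bandwidth": 2e9,
}

print(shortfalls(example_flops, example_system))
```

For this invented system only the memory bandwidth ratio (0.5) misses its target, which is the kind of distortion a Kiviat diagram would show as a dent on one axis.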

I find these metrics to be useful because they provide insight into how well different machines will be able to run applications limited by numerical (floating-point) performance. Your workloads may instead be limited by integer or double-precision performance, in which case these metrics can be easily adapted to fit your needs.

Other metrics, such as interconnection link bandwidth and latency, as well as storage bandwidth and capacity, can be important. The interconnect metrics are significant for applications that use a communications method such as MPI (Message Passing Interface) to tie multiple systems together into a computational cluster. The storage metrics are essential if file system capability affects your application runtime, say for loading large data sets or performing regular checkpoint operations.
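A rough feel for when latency or bandwidth matters can be had from the simple latency-plus-bandwidth cost model often used for interconnects, applied here as an assumption of mine rather than a measured result. The Python sketch below uses hypothetical function names and invented hardware figures; real networks and file systems add contention and protocol overheads that this ignores.

```python
def message_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message: fixed latency plus transfer time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s

def checkpoint_time(checkpoint_bytes, storage_bandwidth_bytes_per_s):
    """Estimated time to write a checkpoint at a sustained storage bandwidth."""
    return checkpoint_bytes / storage_bandwidth_bytes_per_s

# Invented interconnect: 2 microseconds latency, 1 GB/s link bandwidth.
small = message_time(8, 2e-6, 1e9)     # dominated by latency
large = message_time(1e8, 2e-6, 1e9)   # dominated by bandwidth

# Invented storage: a 1 TB checkpoint written at a sustained 1 GB/s.
checkpoint = checkpoint_time(1e12, 1e9)

print(small, large, checkpoint)
```

Codes that exchange many small messages are sensitive to the latency term, while codes that move large blocks, or checkpoint often, are sensitive to the bandwidth term.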

The HPCC Web site is a good source for benchmarks and comparative data on different computational systems. For example, the STREAM benchmark on this site is a useful synthetic benchmark to measure the memory bandwidth of a system. Be aware that synthetic benchmarks are very good at stressing certain aspects of machine performance, but they offer only a narrow view into machine performance. That is why most benchmark evaluation suites include some sample production codes to complement the synthetic benchmarks and to reveal any unexpected performance change, good or bad. This can provide valuable insight into how well the processor and system component designers were able to exploit the complexities of real-world applications to increase performance in their systems. Conversely, it can also uncover issues that adversely affect performance, such as immature compilers and/or software drivers.
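To show what a STREAM-style measurement counts, here is a small Python sketch of the triad kernel, a[i] = b[i] + scalar * c[i], timed over a few repeats. It follows the STREAM convention of charging three array elements of traffic per iteration and keeping the best time, but the function name and parameters are my own, and an interpreted loop like this mostly measures interpreter overhead. Treat it as an illustration of the bookkeeping only, and use the real compiled benchmark for actual bandwidth numbers.

```python
import time
from array import array

def triad_bandwidth(n, scalar=3.0, repeats=3):
    """Time a STREAM-style triad and return (bytes per second, result array).

    Illustrative only: the pure-Python loop is far too slow to
    saturate memory, so the figure is not a hardware bandwidth.
    """
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(n):
            a[i] = b[i] + scalar * c[i]
        best = min(best, time.perf_counter() - start)
    # Triad traffic: read b, read c, write a -> 3 elements per iteration.
    bytes_moved = 3 * a.itemsize * n
    return bytes_moved / best, a

bandwidth, result = triad_bandwidth(100_000)
print(f"{bandwidth / 1e6:.1f} MB/s (interpreter-bound, not a hardware figure)")
```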

Finally, it is worth noting that numerical values presented in the Atkins Report are for desirable ratios based on their definition of a “representative” workload. These numbers are a reasonable starting point, but they need to be taken with a grain of salt, as they may or may not reflect the requirements of your research and computational workloads. For this reason, it is worthwhile to evaluate your workloads on existing systems and to specify ratios appropriate to your needs. As the title of this column implies, use common sense to maintain those balance ratios that work and improve on those that don’t.

Good luck and happy “balanced” computing!

Rob Farber is a senior research scientist at the William R. Wiley Environmental Molecular Sciences Laboratory’s Molecular Science Computing Facility at Pacific Northwest National Laboratory. He may be reached at editor@ScientificComputing.com.



© 2010 Advantage Business Media