On the Cutting Edge
Fostering new and emerging HPC 

In November 2010, the Big Easy plays host to discoverers of the future at SC10. Taking place from November 13 to 19, 2010 in New Orleans, LA, the international conference offers a complete technical education program and exhibition to showcase the many ways high performance computing, networking, storage and analysis lead to advances in scientific discovery, research, education and commerce. Sponsored by IEEE Computer Society and Association for Computing Machinery (ACM), the conference’s 23rd meeting is anticipating more than 11,000 attendees from industry, academia and government. SC10 includes a globally attended technical program, workshops, tutorials, a world-class exhibit area, demonstrations and opportunities for hands-on learning.

Each year, the SC Technical Program highlights key thrust areas that are integrated throughout the various components of the program to showcase the SC community’s impact on these new and emerging fields. For SC10, the technological thrust areas are climate simulation, heterogeneous computing and data-intensive computing.


Addressing Climate Change Uncertainties
Innovation and imagination are needed to mitigate bottlenecks
William Sawyer, HPC Application Analyst, Swiss National Supercomputing Centre, and SC10 Climate Simulation Thrust Area Co-chair 

The climate simulation area will highlight the importance of HPC-based research to help scientists understand global warming, climate change and other environmental processes. Climate change represents a great challenge, not only in terms of the potential consequences for humankind, but also in terms of our proper evaluation of the risks involved. The subject is controversial and the debate lively; although most scientists agree that the currently observed global warming is induced in large part by human activity, many questions remain open. The magnitude, feedbacks and manifestations of climate change are subject to deep uncertainty. If we can assess the impact despite these uncertainties, we have the chance to react appropriately.

Central to reducing the uncertainty are the projections generated by climate system models. Based on their cousins, the numerical weather prediction models, climate models take long-term interactions of ocean, land and atmosphere into account, as well as human and natural influences (e.g. the increase in greenhouse gases). The central limiting factor to the efficacy of a climate model is its resolution, which is ultimately determined by available computing resources. At higher resolution, more physical phenomena are explicitly simulated, increasing the information content of the result.
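To give a rough sense of why resolution is bounded by available computing resources, the sketch below estimates how simulation cost might grow as the horizontal grid spacing shrinks. It rests on a simple scaling that the article does not state: the number of horizontal grid points grows with the square of the refinement factor, and the time step shrinks in proportion to the grid spacing (a CFL-style stability constraint), with the number of vertical levels held fixed. The function name and the sample factors are illustrative only.

```python
def relative_cost(refinement: float) -> float:
    """Approximate cost multiplier when horizontal grid spacing is divided by `refinement`.

    Assumptions (illustrative, not taken from the article):
      - two horizontal dimensions, so grid points scale as refinement**2
      - the time step shrinks linearly with grid spacing, so the number
        of time steps scales as refinement
      - the number of vertical levels is unchanged
    """
    if refinement <= 0:
        raise ValueError("refinement must be positive")
    horizontal_points = refinement ** 2  # more columns in both horizontal directions
    time_steps = refinement              # shorter steps needed for stability
    return horizontal_points * time_steps


if __name__ == "__main__":
    for factor in (1, 2, 4, 10):
        print(f"grid spacing / {factor}: about {relative_cost(factor):.0f}x the compute cost")
```

Under these assumptions, halving the grid spacing costs roughly eight times as much computation, which is why each step up in resolution tends to wait on a new generation of machines. Real models may scale differently, for example if vertical resolution is also increased.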

Climate modeling is, thus, one of the most compute-intensive areas of research, and one where new and emerging high-performance computing technologies will be decisive in answering important climate questions, such as:

  • What changes are expected on a global scale?
  • How will climate change affect communities and individuals on a regional scale?
  • Who will suffer?
  • Who will benefit?

To encourage the nexus between the HPC and climate science communities, SC10 will host a climate thrust day, bringing together experts from the stakeholder communities to discuss the state of the art in climate modeling. Central to the discussion will be new numerical methods and computational meshes to improve the quality and parallel efficiency of the models, as well as parallel programming paradigms that allow models to make use of new technologies, such as general-purpose graphics processing units (GPGPUs) or application-specific processors. New data movement strategies also are critical: even while Moore’s Law continues to hold for silicon real estate, the bandwidth to memory and file storage lags behind. The I/O requirements of most climate models are, thus, unsustainable as they progress to ever-higher resolution. Innovation and imagination are needed to mitigate these performance bottlenecks.

Computer vendors are also key players. Current supercomputers are general-purpose and do not emphasize application-specific performance. Design decisions and optimizations based on generic benchmarks offer little benefit to climate modeling: the models do not make extensive use of numerical kernels that achieve near-peak performance on the underlying hardware and, thus, models usually execute with lamentably low efficiency. Climate scientists must formulate the performance metrics and identify emerging technologies that are central to their work, and communicate this information to vendors for the support of current architectures and the design of new ones.

The climate thrust area will attempt to foster an information exchange among all stakeholders through papers, panels, masterworks and plenary speakers. We look forward to a lively and productive interaction between the communities.

Taming the Complexities of Heterogeneous Computing
The battle lies in managing resources and orchestrating a plethora of interdependent components 
Robert D. Adolf, Pacific Northwest National Laboratory Research Scientist, and SC10 Heterogeneous Computing Thrust Area Co-chair, and Patrick S. McCormick, Los Alamos National Laboratory, and SC10 Heterogeneous Computing Thrust Area Co-chair

The heterogeneous computing thrust area will examine the technological and research advances in software that scientists will need in order to use accelerator-based computing, which is now occurring on large-scale machines and could propel supercomputing to the exascale level. With computation playing such a fundamental role in so many disciplines, it is no surprise that different applications pull computer architectures in many different directions at once. For many years, advances in microprocessor design and manufacturing have given chip makers the flexibility to offer something for everyone but, as clock speeds plateau, returns from superscalar features diminish, and concerns over power consumption take center stage, the way we use transistors is beginning to diverge.

The heyday of a one-size-fits-all architecture is waning, and the high-performance community is rediscovering the advantages of specialization. Tailored on-chip functional units, mixed-processor and accelerated systems, and reconfigurable hardware are all presenting themselves as attractive alternatives to reach the continued performance improvements to which the computing community has become accustomed. Already, one does not have to look far to see the benefits reaped by early adopters: three of the 10 fastest computers in the world are composed of heterogeneous compute nodes, and even larger machines loom on the horizon.

What we gain in performance and efficiency, however, comes at the price of complexity. Application programmers, system software developers, system administrators and end users are all saddled with new challenges derived not only from using novel architectural features, but also from the interaction between disparate hardware models. For researchers looking to build and use the next generation of supercomputers, the battle lies not in accumulating more raw processing power, but in managing resources and orchestrating a plethora of interdependent components.

Given the sheer scale of the problems at hand, it seems unlikely that any unilateral attempt to solve the problem from within a single layer will succeed. Taming the complexity of heterogeneous computers will require a broad, interdisciplinary effort. Applications, programming environments, libraries, system software, processors, platforms and facilities all manifest unique, unsolved problems in hybrid environments. However, without a committed push to communicate, collaborate and cooperate, the high-performance landscape could easily end up fragmented and littered with naive solutions that solve one problem only by foisting insurmountable challenges onto other aspects of the system.

Supercomputing has consistently been a nexus of cross-domain discourse, attracting luminaries from all corners of the high-performance community into one forum. This year’s heterogeneous computing thrust is a timely opportunity to produce exactly the kind of focused, collective initiative this domain demands.

Effectively Sharing Large-scale Data
Essential questions must be answered
Michelle Butler, Technical Program Manager for Storage Enabling Technologies, National Center for Supercomputing Applications, and SC10 Data-intensive Computing Thrust Area Co-chair

The third SC10 technological thrust area, data-intensive computing, will spotlight innovative research and solutions for managing data across distributed HPC systems, especially hardware and software requirements for effective data transfer. As computational resources, sensor networks and other large-scale instruments and experiments grow, the data generated from these sources also is growing. Other communities — such as libraries, industry and social sciences — suddenly find themselves in possession of vast quantities of electronic data as well. To capture this data, new ecosystems or cyberenvironments are emerging that cross many technological and social boundaries in both scientific and non-scientific fields. Significant technical challenges remain to efficiently and effectively store, serve and share the data. Perhaps the greater challenges are the significant policy issues concerned with the access, organization, curation and lifecycle of the data.

The data equation is a difficult one that seems to have changed from being an addition problem to a multiplication problem. In the last 20 years, data management has changed from simple storage disk drives with tape access to databases, multiple types of disk drives, complicated software with hierarchical storage systems, information lifecycle management, collection management, and robotic tape subsystems. In the past, applications regenerated their data files instead of retrieving them, because it was faster. Today, that is not an effective way to manage compute resources. Large compute machine procurements dedicate 10 to 15 percent of the overall budget to extreme storage environments specifically for that compute environment.

Significant investments in additional centerwide high-speed data management storage environments also have become essential over the years. Large dedicated data-intensive machines with large, extreme storage systems for all types of data techniques are beginning to populate the compute machine landscape, too.

The management of data has become the number one issue in supporting supercomputing, publishing and other non-scientific fields, as our ability to generate, accumulate and curate data has enormously outstripped our capability for secure storage. We risk negating the recent improvements in computing and other technologies by having to discard their fruits. Or worse, we may lose the confidence of the general scientific and other communities by presenting conclusions without having the original data available for those who would like to challenge assumptions implicit in the data reduction.

The data must be retained, but there are essential questions that have to be answered, including:

  • Who should keep the data and pay the costs?
  • How long should it be kept and who should have access?
  • Who owns the metadata?

The data for science is more than flat files. Science today overwhelmingly relies on storage for different levels of services. Service avenues for data can be metadata management for system, user, application or access paradigms; security environments both physical and logical to not only protect the data, but to also identify who should have access; reliability, availability and serviceability of the data itself; and lifecycle management for data and policy-driven storage decisions.

There are numerous drivers for data technologies. The scientists on current projects, the curators of past and future data sets, and even the publishing industries use data environments. Many science drivers stress not only data management requirements but also data sharing as a key component.

One of the science drivers for the data thrust area for SC10 is the Large Synoptic Survey Telescope (LSST) project. The LSST system will produce a six-band (0.3-1.1 micron) wide-field deep astronomical survey of over 20,000 square degrees of the southern sky using an 8.4-meter ground-based telescope, transferring >15 terabytes (TB) of data nightly, with additional pipeline processing bringing the nightly total to >75 TB. The extreme data requirements of LSST mean the data management team must: reliably process unprecedented data volumes, ensure consistent data quality without manual intervention, accommodate both scientific and computing technology evolution over at least a decade, and serve the LSST data products to a diverse community of users located across multiple continents.
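To put the nightly LSST figures in networking terms, the sketch below converts a data volume and a transfer window into the average sustained rate required. The 24-hour window is an assumption made here for illustration and is not a number from the LSST project; the article's volumes are lower bounds, so the computed rates are lower bounds too. The function name is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def sustained_gbit_per_s(terabytes: float, window_hours: float) -> float:
    """Average rate in gigabits per second needed to move `terabytes` within `window_hours`.

    Uses decimal units: 1 TB = 10**12 bytes, 1 Gbit = 10**9 bits.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    bits = terabytes * 1e12 * 8
    return bits / (window_hours * SECONDS_PER_HOUR) / 1e9


if __name__ == "__main__":
    # Assumed 24-hour window; volumes are the lower bounds quoted in the article.
    for label, volume_tb in (("raw transfer", 15), ("total with pipeline products", 75)):
        rate = sustained_gbit_per_s(volume_tb, 24)
        print(f"{label}: {volume_tb} TB per night needs at least {rate:.2f} Gbit/s sustained")
```

With a full day to move each night's data, 15 TB works out to about 1.4 Gbit/s and 75 TB to about 6.9 Gbit/s, sustained around the clock. A shorter window raises the required rate proportionally.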

Another large data-sharing science group is the Center for Analysis and Prediction of Storms (CAPS) project. CAPS’s mission is to develop and demonstrate techniques for the numerical analysis and prediction of high-impact local weather and environmental conditions, with emphasis on the assimilation of observations from Doppler radars and other advanced in-situ and remote sensing systems. CAPS conducts a broad-based program of basic and applied storm-scale research, and its award-winning Advanced Regional Prediction System (ARPS) is used worldwide. CAPS is producing real-time 400-m-resolution low-level wind analyses that are updated every five minutes for the Collaborative Adaptive Sensing of the Atmosphere (CASA) Spring Experiment. Available CASA and other sources of data are going into the analyses. Furthermore, up to two-hour-long Very-Short-Range NWP forecasts at a one-kilometer resolution are produced every 10 minutes when active weather exists within the CASA Oklahoma network. The environment is producing large amounts of data that need to be shared across the country.

A third group with large data sharing requirements is the Southern California Earthquake Center (SCEC) project. Headquartered at the University of Southern California, SCEC was founded in 1991 with a mission to gather data on earthquakes in Southern California and elsewhere, integrate information into a comprehensive and physics-based understanding of earthquake phenomena, and communicate understanding to society at large as useful knowledge for reducing earthquake risk. The SCEC project is a large outstanding community of over 600 scientists from 16 core institutions, 47 participating institutions, and elsewhere. SCEC also partners with a large number of other research and education/outreach organizations in many disciplines. To support this community, SCEC engages in information technology research that will revolutionize our methods of doing collaborative research and distributing research products on-line.

An emerging requirement is for curation that preserves not only past experiments, but future ones as well. The need to retain data for future scientific research presents challenges to funding agencies and service organizations. These institutions have traditionally only funded computational research in short-term timeframes: once the data has been generated, there are few plans for how to store it for preservation and access. This means that the intellectual capital being produced is not being treated with the same diligence as traditional scientific output.

The scientific publishing industry is also at a crossroads for data requirements and environments. As the majority of journal articles now produced have accompanying datasets, publishers must decide how they want to handle the digital artifacts underlying the science they are presenting. Do they become caretakers of the data, or merely provide identifiers to externally stored resources? If they are not keeping the data themselves, how can they be sure that it is being preserved in an acceptable manner?

Science teams in astronomy, earthquake research and weather, to name a few, along with the curators and those publishing the results of science, all have different requirements. However, all are drivers for data environments of some type.

At Supercomputing 2010, funding agents, vendors, servers, producers and consumers of data will congregate to discuss the technical and social issues related to the mountain of data. There are technical papers, panels, tutorials, workshops and birds-of-a-feather sessions dedicated to sharing large-scale data.