How to improve data management in the supercomputers of the future. Source: UC3M

In recent decades, many scientific discoveries have depended on the analysis of enormous volumes of data, carried out mainly through large-scale computational simulations on supercomputers. These machines are used to study climate models, develop new materials, investigate the origin of the universe, study the human genome and build new applications in bioengineering.

At present, as an ever-increasing amount of information is collected and stored, scientific data management faces a problem: the software that manages the latest generation of supercomputers was not designed for the scalability requirements that are expected in coming years. In fact, in less than a decade, these infrastructures are going to be two orders of magnitude faster than current supercomputers.

"Today, these applications are encountering big problems of performance and scalability due to the exponential increase of data as a result of better instruments, the growing ubiquity of sensors and greater connectivity between devices," explained Professor Florin Isaila of the ARCOS group in the UC3M Department of Computer Science. "These days, a radical redesigning of the computational infrastructures and management software is necessary to adapt them to the new model of science, which is based on the massive processing of data."

The objective of the CLARISSE project, whose acronym stands for "Cross-Layer Abstractions and Run-time for I/O Software Stack of Extreme-scale systems," is just that: to increase the performance, scalability, programmability and robustness of data management for scientific applications, with the goal of supporting the design of next-generation supercomputers. The project, coordinated by UC3M, is funded by the European Union's Seventh Framework Programme (FP7/2007-2013, grant agreement number 328582) and is carried out in collaboration with Argonne National Laboratory, one of the world leaders in the research and development of systems software for large-scale supercomputers.

Historically, data management software has been developed in layers with little coordination in the global management of resources. "Nowadays, this lack of coordination is one of the biggest obstacles to increasing the scalability of current systems. In this regard, in CLARISSE, we research solutions to these problems through the design of new mechanisms for coordinating the data management of the different layers," said professor Isaila.

Jesús Carretero, the project's principal investigator, UC3M full professor and head of ARCOS, explained, "At present, ARCOS is actively involved in several initiatives around the world to remodel the management software of future supercomputers, including the coordination of the CLARISSE project and the research collaboration network NESUS. The resulting synergies of these efforts are going to contribute substantially to accelerating scientific discoveries in the coming decades."
