Irvine, Calif. — Faculty at the University of California, Irvine's Donald Bren School of Information and Computer Sciences have created a new technology that helps users to better store and make sense of "Big Data," maximizing the benefits of the roughly 2.5 quintillion bytes of online information generated every day.
Dubbed "AsterixDB," the system was developed by UCI professors Michael J. Carey and Chen Li and UCI project scientist Vinayak Borkar, in collaboration with Vassilis J. Tsotras, a professor at the University of California, Riverside. AsterixDB promises to be the most versatile among platforms aimed at managing Big Data; the software is now available for free download at http://asterix.ics.uci.edu.
The AsterixDB engine operates on a "shared nothing" architecture, in which each computer node is independent and self-sufficient. Its distinct advantages come from adding management of semi-structured data (i.e., data not organized in the traditional tabular form) and from borrowing techniques from parallel databases that increase the speed and scale at which it can operate.
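To make the "shared nothing" idea concrete, here is a minimal Python sketch of hash partitioning over independent nodes that hold semi-structured records. It is an illustration only, not AsterixDB code or its API: the `Node` class, the `owner`, `load` and `count_by` functions, and the sample records are all hypothetical names invented for this example.

```python
import zlib
from collections import Counter


class Node:
    """A hypothetical node that owns its own storage and shares nothing."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.records = []  # local storage, never touched by other nodes

    def insert(self, record):
        self.records.append(record)

    def local_count_by(self, field):
        # Each node answers the query using only its own records.
        return Counter(r[field] for r in self.records if field in r)


def owner(key, num_nodes):
    """Pick the node responsible for a key with a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def load(nodes, records, key_field):
    """Route every record to exactly one node based on its key."""
    for record in records:
        nodes[owner(record[key_field], len(nodes))].insert(record)


def count_by(nodes, field):
    """Merge the partial results computed independently on each node."""
    total = Counter()
    for node in nodes:
        total.update(node.local_count_by(field))
    return total


# Semi-structured records: they share a key but not a fixed set of columns.
records = [
    {"id": 1, "kind": "tweet", "text": "hello", "hashtags": ["uci"]},
    {"id": 2, "kind": "blog", "title": "Big Data", "body": "A longer post"},
    {"id": 3, "kind": "tweet", "text": "again"},
    {"id": 4, "kind": "transaction", "amount": 19.99},
]

nodes = [Node(i) for i in range(3)]
load(nodes, records, "id")
totals = count_by(nodes, "kind")
print(dict(totals))
```

In this sketch, adding nodes spreads both storage and query work, since each node scans only its own partition and a final step merges the partial counts. That is the general pattern parallel databases rely on, shown here under simplifying assumptions.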
"We're providing a next-generation platform for storing, managing, coordinating and making use of Big Data," says Carey. He refers to the online output of blogs, tweets, transactions, status updates and other activities as "digital exhaust" in which immense value can be found. The challenges in filtering such information into usable forms come not only from the vastness of the data, but also from the speed at which it must be processed. Data arrives faster all the time, and analysis must be performed more responsively, and in more complex terms, than most existing systems can handle. Then there is the added challenge of performing data analysis while taking into account the wildly different sources from which those blogs, tweets, transactions, updates and other items originate.
The benefits of AsterixDB go far beyond allowing companies to understand their customers. Properly employed, Big Data can have immense value in fostering innovations to enhance the economy, create jobs and solve societal problems. It can aid public health agencies in predicting disease outbreaks, law enforcement in combating consumer fraud, and medical researchers in sequencing DNA ever more quickly, to name just a few examples.
"Big Data crosses a lot of domains, from government to health care to business," says Carey. "It's hard for us to imagine an area where AsterixDB can't contribute."
Government organizations and businesses already see the possibilities AsterixDB offers. In addition to sponsorship from the National Science Foundation and the state of California, the work of Carey, Li and their collaborators has garnered support from a number of major corporate sponsors, with more in the offing. Beyond funding, AsterixDB's creators are also seeking partners who will use their platform and explore its potential in different domains where Big Data problems are awaiting solutions.
"We're putting AsterixDB out in an unrestricted open-source form," explains Carey. "Users can do whatever they want with it, and we can learn from what they do and further improve our platform based on their needs."