Data Mining

The Lead

Graphic of viruses attempting to "dock" on a microbial mat, using the tips of their tails. Courtesy of Blair Paul

Strange Viruses Discovered in Deep Ocean

March 23, 2015 11:51 am | by NSF | News | Comments

The intraterrestrials, they might be called. Strange creatures live in the deep sea, but few are odder than the viruses that inhabit deep ocean methane seeps and prey on single-celled microorganisms called archaea. The least understood of life's three primary domains, archaea thrive in the most extreme environments: near hot ocean rift vents, in acid mine drainage, in the saltiest of evaporation ponds and in petroleum deposits.

iPad App Game Uses Citizen Science to Track Endangered Species

March 18, 2015 11:12 am | by Aaron Mason, Wildsense, University of Surrey | News | Comments

A new app for the iPad could change the way wildlife is monitored. Wildsense, an initiative from...

Big Data Used to Understand Major Events

March 5, 2015 9:34 am | by University of Bristol | News | Comments

Research has for the first time analyzed over 130,000 online news articles to find out how the...

Billions of Words: Visualizing Natural Language

February 27, 2015 3:14 pm | by Benjamin Recchie, University of Chicago | News | Comments

Children don’t have to be told that “cat” and “cats” are variants of the same word — they pick...

Snow and icy conditions affect human decisions about transportation. These decisions can ripple through other infrastructure systems, causing widespread disruptions. Shown here are points of connectivity. Courtesy of Paul M. Torrens and Cheng Fu, Universi

Big Data Techniques More Accurately Model People in a Winter Wonderland

February 6, 2015 2:53 pm | by Cecile J. Gonzalez, NSF | News | Comments

For Paul Torrens, wintry weather is less about sledding and more about testing out models of human behavior. Torrens, a geographer at the University of Maryland, studies how snow and icy conditions affect human decisions about transportation. He also studies how these decisions ripple through other infrastructure systems.

Arabidopsis thaliana, a model flowering plant studied by biologists, has climate-sensitive genes whose expression was found to evolve. Courtesy of Penn State

Needle in a Haystack: Finding the Right Genes in Tens of Thousands

January 28, 2015 2:45 pm | by TACC | News | Comments

Scientists using supercomputers found that genes sensitive to cold and drought help a plant survive climate change. The computational challenges were daunting, involving thousands of individual strains of the plant, hundreds of thousands of markers across the genome, and tests for a dozen environmental variables. Their findings increase basic understanding of plant adaptation and can be applied to improve crops.

The results show that, by mining Facebook Likes, the computer model was able to predict a person's personality more accurately than most of their friends and family.

AI: Computers Know the Real You Better than Friends, Family

January 13, 2015 10:01 am | by University of Cambridge | News | Comments

Researchers have found that, based on enough Facebook Likes, computers can judge your personality traits better than your friends, family and even your partner. Using a new algorithm, researchers have calculated the average number of Likes artificial intelligence (AI) needs to draw personality inferences about you as accurately as your partner or parents.

Computer Equal To or Better Than Humans at Cataloging Science

December 2, 2014 2:53 pm | by David Tenenbaum, University of Wisconsin-Madison | News | Comments

In 1997, IBM’s Deep Blue computer beat chess wizard Garry Kasparov. This year, a computer system developed at the University of Wisconsin-Madison equaled or bested scientists at the complex task of extracting data from scientific publications and placing it in a database that catalogs the results of tens of thousands of individual studies.

LLNL researcher Monte LaBute was part of a Lab team that recently published an article in PLOS ONE detailing the use of supercomputers to link proteins to drug side effects. Courtesy of Julie Russell/LLNL

Supercomputers Link Proteins to Adverse Drug Reactions

October 21, 2014 10:40 am | by Kenneth K Ma, Lawrence Livermore National Laboratory | News | Comments

The drug creation process often misses many side effects that kill at least 100,000 patients a year. LLNL researchers have found a way to use supercomputers to identify proteins that cause medications to have certain adverse drug reactions. The method processes proteins and drug compounds in an algorithm that produces reliable data for drug discovery outside of a laboratory setting.

NeuroSolutions Infinity

September 11, 2014 3:58 pm | Neurodimension, Inc. | Product Releases | Comments

NeuroSolutions Infinity predictive data analytics and modeling software is designed to streamline data mining by automatically taking care of the entire data modeling process. It includes everything from accessing, cleaning and arranging data, to intelligently trying potential inputs, preprocessing and neural network architectures, to selecting the best neural network and verifying the results.

Robo Brain Teaches Robots Everything from the Internet

August 28, 2014 11:52 am | by Cornell University | News | Comments

Robo Brain — a large-scale computational system that learns from publicly available Internet resources — is currently downloading and processing about 1 billion images, 120,000 YouTube videos, and 100 million how-to documents and appliance manuals. The information is being translated and stored in a robot-friendly format that robots will be able to draw on when they need it.

North Korea (the dark area) and South Korea at night. Courtesy of NASA

Citizen Science: Images of Earth at Night Crowdsourced for Science

August 19, 2014 2:59 pm | by NASA | News | Comments

A wealth of images of Earth at night taken by astronauts on the International Space Station (ISS) could help save energy, contribute to better human health and safety and improve our understanding of atmospheric chemistry. But scientists need your help to make that happen.

HPC Innovation Excellence Award: University of Wisconsin-Madison

June 23, 2014 4:33 pm | Award Winners

University of Wisconsin researchers used HPC resources in combination with multiple advanced protein structure prediction algorithms and deep sequence data mining to construct a highly plausible capsid model for Rhinovirus-C (~600,000 atoms). The simulation model helps researchers explain why existing pharmaceuticals don’t work on this virus.

Data Mining Software Version Histories

June 23, 2014 6:30 am | by AlphaGalileo | News | Comments

Making changes within a complex software system is often error-prone – even the smallest mistake can endanger the entire system. Ten years ago, computer scientists in Saarbrücken working with Professor Andreas Zeller developed a technique that automatically issues suggestions on how to manage changes...

Projects that allow the general public to collaborate with scientists are becoming useful sources of knowledge on a large scale. Online databases such as iNaturalist.org and DiscoverLife.org — based at UGA — rely on amateur observers to contribute photo

New Data Collection, Analysis and Sharing Tools Help Protect Threatened Species

May 29, 2014 9:41 pm | by Science Newsline | News | Comments

Athens, Ga. – New tools to collect and share information could help stem the loss of the world's threatened species, according to a paper published today in the journal Science. The study—by an international team of scientists that included John L. Gittleman, dean of the University of Georgia Odum...

Huynh Phung Huynh, Scientist and Capability Group Manager, A*STAR Institute of High Performance Computing

Huynh Phung Huynh

April 16, 2014 8:45 am | Biographies

Huynh Phung Huynh's research interests include high performance computing (HPC), including compiler optimization for GPUs, many-core processors and other accelerators; parallel computing, including frameworks for parallel programming and scheduling; and HPC for data mining and machine learning algorithms.

Data Mining Disaster

March 28, 2014 4:33 pm | News | Comments

Computer technology that can mine data from social media during natural or other disasters could provide invaluable insights for rescue workers and decision makers. Advances in information technology have had a profound impact on disaster management.

Mathematics for Safer Medicine: Calculating Uncertainties within Technical Systems

January 7, 2014 6:20 am | by Heidelberg Institute for Theoretical Studies | News | Comments

The new HITS research group “Data Mining and Uncertainty Quantification” analyzes large amounts of data and calculates uncertainties in technical systems. With Prof. Vincent Heuveline as their group leader, the group of mathematicians and computer scientists especially focuses on increasing the security of technology in operating rooms.

Text Mining: The Next Data Frontier

January 6, 2014 2:04 pm | by Mark A. Anawis | Blogs | Comments

Josiah Stamp said: “The individual source of the statistics may easily be the weakest link.” Nowhere is this more true than in the new field of text mining, given the wide variety of textual information. By some estimates, 80 percent of the information available occurs as free-form text which, prior to the development of text mining, needed to be read in its entirety in order for information to be obtained from it.

'Approximate Computing' Improves Efficiency, Saves Energy

December 18, 2013 4:03 pm | by Emil Venere, Purdue University | News | Comments

Researchers are developing computers capable of "approximate computing" to perform calculations good enough for certain tasks that don't require perfect accuracy, potentially doubling efficiency and reducing energy consumption.

Meet HPC Innovator Taghrid Samak

December 3, 2013 4:03 pm | by Jon Bashor, Berkeley Lab Computational Research Division | Articles | Comments

Everything leading up to the actual coding, figuring out how to make it work, is what Samak enjoys most. One of the problems she is working on with the Department of Energy’s Joint Genome Institute (JGI) is a data mining method to automatically identify errors in genome assembly, replacing the current approach of manually inspecting the assembly.

Harnessing Collective Wisdom from Social Networks

November 7, 2013 12:49 pm | by National Science Foundation | News | Comments

In his 1937 book, "Think and Grow Rich," author Napoleon Hill identified 13 steps to success, one of which was the power of the mastermind. "No two minds ever come together without thereby creating a third, invisible, intangible force, which may be likened to a third mind," Hill wrote.

Hardware for Big Data, Graphs and Large-scale Computation

September 9, 2013 9:58 am | by Rob Farber | Articles | Comments

Recent announcements by Intel and NVIDIA indicate that massively parallel computing with GPUs and Intel Xeon Phi will no longer require passing data via the PCIe bus. The bad news is that these standalone devices are still in the design phase and are not yet available for purchase.

IBM Narrows Big Data Skills Gap, Partnering with More than 1,000 Global Universities

August 15, 2013 10:49 am | by IBM | News | Comments

IBM announced on August 24, 2013, that it has added nine new academic collaborations to its more than 1,000 partnerships with universities across the globe, focusing on Big Data and analytics - all of which are designed to prepare students for the 4.4 million jobs that will be created worldwide to support Big Data by 2015. The company also announced more than $100,000 in awards for Big Data curricula.

HPC Architectures Begin Long-Term Shift Away from Compute Centrism

August 15, 2013 8:43 am | by Steve Conway, IDC | Articles | Comments

The HPC market is entering a kind of perfect storm. For years, HPC architectures have tilted farther and farther away from optimal balance between processor speed, memory access and I/O speed. As successive generations of HPC systems have upped peak processor performance without corresponding advances in per-core memory capacity and speed, the systems have become increasingly compute centric

StatSoft Receives Top Ratings in KDnuggets Poll

June 11, 2013 2:36 pm | by StatSoft | News | Comments

The 14th annual KDnuggets Software Poll, conducted in May 2013, attracted record participation of 1,880 internet voters, more than doubling the previous year's numbers. KDnuggets.com is a data mining portal and newsletter publisher for the data mining community with more than 12,000 subscribers.

New Algorithm Cluster Improves Health Record Data Mining

May 14, 2013 9:18 pm | by New Jersey Institute of Technology | News | Comments

The time may be fast approaching for researchers to take better advantage of the vast amount of valuable patient information available from U.S. electronic health records. Lian Duan, an NJIT computer scientist with expertise in data mining, has done just that with the recent publication of "Adverse Drug Effect Detection" in the IEEE Journal of Biomedical and Health Informatics (March 2013).

Pathway Studio for Web

April 5, 2013 10:38 am | Elsevier, Inc. | Product Releases | Comments

Pathway Studio, a research solution for biologists, is now available in a Web-based version. The integrated data mining and visualization software features comprehensive knowledge bases produced by applying MedScan, Elsevier’s proprietary text-mining technology, to a large corpus of biological literature.

NSF-Funded Superhero Supercomputer Helps Battle Autism

March 26, 2013 7:45 pm | News | Comments

When it officially came online at the San Diego Supercomputer Center (SDSC) in early January 2012, Gordon was instantly impressive. In one demonstration, it sustained more than 35 million input/output operations per second, a world record at the time.
