Institutions Announce Collaboration toward Sharing Neuroscience Data
OXNARD, CA — The Allen Institute for Brain Science, California Institute of Technology, New York University School of Medicine, the Howard Hughes Medical Institute (HHMI) and the University of California, Berkeley (UC Berkeley) are collaborating on a project aimed at making databases about the brain more useable and accessible for neuroscientists — a step seen as critical to accelerating the pace of discoveries about the brain in health and disease. With funding from GE, The Kavli Foundation, the Allen Institute for Brain Science, the HHMI, and the International Neuroinformatics Coordinating Facility (INCF), the year-long project will focus on standardizing a subset of neuroscience data, making this research simpler for scientists to share.
This is the first collaboration launched by "Neurodata Without Borders," a broader initiative with the goal of standardizing neuroscience data on an international scale, making it more easily sharable by researchers worldwide. This first project is called "Neurodata Without Borders: Neurophysiology."
Image file formats such as JPEG or TIFF store the digital information captured when we take a photo with our mobile phones, and they allow us to share that photo with anyone who has a computer. No comparable data standard exists in neuroscience. Developing such a standard, or unified data format, would enhance the ability of brain researchers worldwide to share and combine their research results. This would not only drive progress in neuroscience but also encourage the validation of existing results and create vital new collaborations with other fields.
"Neuroscientists aren't limited by memory storage anymore; we're limited by our ingenuity, the availability of data and our ability to talk to each other," says Christof Koch, Chief Scientific Officer at the Allen Institute. "This pilot program is an effort to help us speak the same language."
Today, researchers can simultaneously record the electrical or optical activity of a thousand neurons in a mouse's brain while the animal is navigating a maze, for example, and that number may soon be in the millions. These recent, rapid technical advances mean that neuroscientists are generating data that are quantitatively and qualitatively different from before. But the languages, or formats, they use to capture those data (as well as the software tools they use to access and analyze them) vary from laboratory to laboratory, and sometimes even within a laboratory. This lack of uniformity makes it challenging to share and integrate experimental data, the raw material of science, and to mine and extract the most value from them.
The need for a common data format in neuroscience is made more urgent by the rise of large-scale collaborative projects, such as the Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative in the United States.
"These new initiatives are going to produce masses of data, but if it isn't interchangeable and comparable, it's just not going to be useful," says Koch.
On a practical level, scientific publishers and granting agencies, such as the U.S. National Institutes of Health, are moving toward mandating data sharing as a requirement for funding.
"This is following on other efforts at openness in science," says Markus Meister, a professor of biology at Caltech whose research group is supplying experimental data to the project. "The idea is that the material or resources that were developed with government funding or published in a journal have to be made available. But for neurophysiology data, there is no organized mechanism for doing that at the moment."
The initial one-year program focuses on a subset of neuroscience data: cell-based neurophysiology data, which are sought after by theorists who are building models of how the brain works. The partners will work with software developers and vendors to establish an open format that can store electrical and optical recordings of neural activity, and, importantly, the conditions under which an experiment was performed, such as how brain activity was recorded, how the animal was behaving at certain time points, and its species, sex and age. These "metadata" are often lost, and yet without them the research results are meaningless.
This "metadata problem" poses an enormous challenge, says Friedrich Sommer, a theoretical neuroscientist at UC Berkeley who oversees an existing repository, CRCNS.org, where the neurophysiology datasets of Neurodata Without Borders will be stored and shared. UC Berkeley is coordinating Neurodata Without Borders with staff from the Allen Institute.
As Sommer explains, once a data format has been selected and extended, the neurophysiology datasets will be translated into the new common language and shared with the broader neuroscience community through the repository. Lastly, "application programming interfaces" (APIs) will be developed to allow researchers to use the common format for their own data with ease.
To get to that point, Neurodata Without Borders is calling on the neuroscience community to get involved. "We want to solicit the best ideas for the data format, so we are inviting researchers to look at the datasets which are now shared in their current format at CRCNS.org. Our hope is to engage the community to contribute ideas or propose their own data format for consideration," says Sommer.
The most promising approaches to a common data format will be discussed, tested and extended at Neurodata Without Borders Hackathons, the first of which will be held in late November, to drive the rapid development of innovative software tools.
"The project has an aggressive timeline, but in a year's time, the goal is to come up with a standard for neurophysiology data that we can agree on. We may not get it 100 percent right for 100 percent of researchers, but we'll make a very good attempt," says neuroscientist Karel Svoboda, group leader at HHMI's Janelia Research Campus and a data-provider to Neurodata Without Borders. "Then, by buying into the data format ourselves — by explicitly moving our data into the format and making them available, we'll set an example of how it could be done, and hopefully have others in the neuroscience community follow in our footsteps."
There have been smaller efforts in the past to develop a common language for neuroscience data, but they have fallen short of the goals of the new project.
"This new effort is not very different in spirit from past ones, but it's at a much larger scale," says NYU School of Medicine's Gyorgy Buzsaki, a pioneer of data-sharing in neuroscience and another data-provider. "We're trying to make Neurodata Without Borders as attractive as possible by improving the quality of the datasets and how they are documented, and by thinking hard about how best to help researchers navigate around in them. Ultimately, if researchers can find and access the data they're interested in in a few hours, they will choose it. But that is a very difficult thing to do."
"With the emergence of large-scale brain initiatives around the world, data reuse and sharing becomes more important than ever. This project will facilitate neuroscience collaboration at a global scale," says Sean Hill, Scientific Director, INCF.
"Standardizing a subset of neuroscience data is vital to accelerate the pace of research and innovation in brain health. We are proud to be working with these best-in-class organizations on such an important and needed study," says Robert Wells, executive director, healthymagination strategy, GE.
Miyoung Chun, Executive Vice President of Science Programs, The Kavli Foundation, agrees with these assessments. "In neuroscience, as in many scientific fields, there are massive amounts of 'Big Data' but no coherent way to retrieve and use this information. Our hope is Neurodata Without Borders: Neurophysiology is a major step toward changing this and speeding breakthroughs in brain science."
Data: The first datasets from the participating laboratories are already publicly available at CRCNS.org, a repository of neuroscience data hosted by the Redwood Center for Theoretical Neuroscience and the Helen Wills Neuroscience Institute at UC Berkeley.
Hackathon: The first Neurodata Without Borders Hackathon will be held at Janelia Farm, in Ashburn, Virginia, from November 20 to 22, 2014.
Developers who would like to participate should contact: Fritz Sommer, firstname.lastname@example.org
Researcher Queries: Researchers and research institutions who wish to make queries about Neurodata Without Borders should contact: Chris Martin, email@example.com
ABOUT NEURODATA WITHOUT BORDERS. Neurodata Without Borders (NWB) is an initiative aimed at standardizing neuroscience data on an international scale. Established in response to the U.S. Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative, the initiative aims to break the geographic, institutional, technological and policy barriers that impede the flow of neuroscience data to the broad scientific community. This is seen as key to accelerating the pace and success of brain research worldwide.
ABOUT THE INSTITUTIONS
Allen Institute for Brain Science. The Allen Institute for Brain Science is an independent, 501(c)(3) nonprofit medical research organization dedicated to accelerating the understanding of how the human brain works in health and disease. Using a big science approach, the Allen Institute generates useful public resources used by researchers and organizations around the globe, drives technological and analytical advances, and discovers fundamental brain properties through integration of experiments, modeling and theory. Launched in 2003 with a seed contribution from founder and philanthropist Paul G. Allen, the Allen Institute is supported by a diversity of government, foundation and private funds to enable its projects. Given the Institute's achievements, Mr. Allen committed an additional $300 million in 2012 for the first four years of a ten-year plan to further propel and expand the Institute's scientific programs, bringing his total commitment to date to $500 million. The Allen Institute's data and tools are publicly available online at www.brain-map.org.
California Institute of Technology. Caltech is a research and education institution focused on science and engineering, where faculty and students pursue new knowledge about our world and search for the kinds of bold and innovative advances that will transform our future. Caltech's neuroscience research spans a vast range of subjects and the integration of approaches from many disciplines.
GE. GE (NYSE: GE) works on things that matter: finding solutions in energy, health and home, transportation and finance. Building, powering, moving and helping to cure the world. Not just imagining. Doing. GE works.
Howard Hughes Medical Institute. The Howard Hughes Medical Institute plays a powerful role in advancing scientific research and education in the United States. Its scientists, located across the country and around the world, have made important discoveries that advance both human health and our fundamental understanding of biology. The Institute also aims to transform science education into a creative, interdisciplinary endeavor that reflects the excitement of real research.
International Neuroinformatics Coordinating Facility. The International Neuroinformatics Coordinating Facility (INCF) is an international organization launched in 2005, following a proposal from the Global Science Forum of the Organisation for Economic Co-operation and Development (OECD) to establish international coordination and collaborative informatics infrastructure for neuroscience. It currently has 17 member countries across North America, Europe, Australia and Asia. INCF establishes and operates scientific programs to develop standards for neuroscience data sharing, analysis, modeling and simulation while coordinating an informatics infrastructure designed to enable the integration of neuroscience data and knowledge worldwide and catalyze insights into brain function in health and disease.
The Kavli Foundation. The Kavli Foundation advances science for the benefit of humanity, promotes public understanding of scientific research, and supports scientists and their work. Based in Southern California, the Foundation's mission is implemented through an international program of research institutes in the fields of astrophysics, nanoscience, neuroscience and theoretical physics, and through the support of conferences, symposia, endowed professorships and other activities. The Foundation is also a founding partner of the biennial Kavli Prizes, which recognize scientists for their seminal advances in three research areas: astrophysics, nanoscience and neuroscience. For more information, visit www.kavlifoundation.org.
New York University School of Medicine. NYU Langone Medical Center, a patient-centered, integrated academic medical center, is one of the nation's centers for excellence in clinical care, biomedical research, and medical education. In 2011, NYU Langone Medical Center established a new, state-of-the-art Neuroscience Institute to leverage NYU's excellence in both basic science and clinical medicine. The Neuroscience Institute will play a unifying role to enhance communication and collaboration among clinical, translational, and basic neuroscientists.
University of California, Berkeley. UC Berkeley is a public university with a mission to excel in teaching, research and public service. This longstanding mission has led to the university's distinguished record of Nobel-level scholarship, constant innovation, a concern for the betterment of our world, and consistently high rankings of its schools and departments.