New Data Archive Aims to Amplify Impact of Ecosystem Research
Three-year Project Being Led by CRD's Deb Agarwal
July 21, 2017
Contact: Kristine Wong, 510-486-5202, firstname.lastname@example.org
As environmental scientists move toward understanding Earth systems at greater resolution than ever before, it’s critical that they have access to the data sets they need. Yet many of these data sets are not archived, publicly available, or collected in a standardized format, because coordinating efforts across independent research groups and institutions worldwide is difficult.
Now researchers at Berkeley Lab are taking action to address these challenges. Thanks to $3.6 million in funding from the U.S. Department of Energy’s (DOE) Office of Science, the Lab’s Computing Sciences Area and Earth & Environmental Sciences Area (EESA) are partnering on a three-year project to develop an archive that will serve as a repository for hundreds of DOE-funded research projects under the agency’s Environmental System Science (ESS) umbrella. The ESS domain includes both large-scale and smaller studies of Subsurface Biogeochemical Research and Terrestrial Ecosystem Science around the world.
“Our basic mission is to enable all of DOE’s ESS projects to archive their data with us so that it’s available, and won’t get lost,” said Deborah Agarwal, a senior scientist at Computing Sciences who is leading the effort. “Just as important is to make the data available to the public, as well as to DOE researchers.”
Dubbed ESS-DIVE (Environmental System Science Data Infrastructure for a Virtual Ecosystem), the Lab-hosted archive will make a significant difference for researchers and the public, says Margaret Torn, EESA senior scientist. Torn leads EESA’s Biosphere-Atmosphere Interactions program domain, which encompasses large ESS projects such as AmeriFlux, Next Generation Ecosystem Experiment (NGEE)-Arctic, and NGEE-Tropics.
In addition to providing an archive for her team’s data, Torn says that ESS-DIVE will let scientists studying similar topics discover that other relevant data exist. And by helping the community establish protocols and standards for the archived data, such as using the same variable names and units, it will allow scientists to integrate data across teams and projects for broader analyses.
“People who aren’t researchers will also benefit from these data,” Torn said, “such as water utilities, farmers, and stewards of environmental remediation.”
The ESS-DIVE team will build user-facing capabilities into the archive, such as advanced data search and data visualization. The team also plans to conduct a user needs assessment to ensure a high-quality user experience.
“The preservation and appropriate curation of data—as well as being able to reuse it—is a key component of good science,” said Jay Hnilo, DOE Program Manager for Data Informatics. ESS-DIVE will create an integrated data environment and help to accelerate DOE’s science going forward, he added.
“We all want to extend our understanding from the sites that we are studying to as much of the Earth as possible, and connect our research with similar research at other sites,” Torn said. “This will allow us to speak a common language and have a broader impact.”
The ESS-DIVE team is composed of an interdisciplinary group of data scientists, digital librarians, and environmental scientists, as well as the National Center for Ecological Analysis and Synthesis, a research center based at the University of California, Santa Barbara. Key Berkeley Lab personnel working on the project include Charuleka Varadharajan, Shreyas Cholia, Cory Snavely, Valerie Hendrix, Dan Gunter, and William Riley.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are Department of Energy Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab's scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.