
Workflow and Data Management

Experimental, observational, and theory-based scientific data presents complex challenges in quality assurance and quality control, data integration, data reduction, analysis and visualization, real-time decision-making, and human-in-the-loop monitoring and steering. To address these challenges, we create integrated systems to manage the data coming from instruments and computations. These systems perform streaming data analytics over high-speed wide-area networks such as the Energy Sciences Network (ESnet), processing data in real time at high-performance computing (HPC) facilities such as the National Energy Research Scientific Computing Center (NERSC). We employ statistical methods and surrogate models to process the data concurrently and close the loop by communicating results back to the data source. To store and archive the data for offline analysis, we create intelligent, automated data- and resource-management architectures.
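The closed-loop pattern described above, streaming batches from an instrument, filtering them with QA/QC checks, evaluating a fast model, and sending steering feedback to the source, can be sketched in miniature. This is an illustrative sketch only: the function names (`instrument_stream`, `quality_ok`, `surrogate_score`, `steer`) and thresholds are hypothetical, and a simulated random stream stands in for real detector data arriving over the network.

```python
import random
import statistics

def instrument_stream(n_batches=20, batch_size=50):
    """Simulated detector batches; stands in for data streamed from an instrument."""
    for _ in range(n_batches):
        yield [random.gauss(1.0, 0.3) for _ in range(batch_size)]

def quality_ok(batch, max_stdev=0.5):
    """A toy QA/QC gate: reject batches with excessive spread."""
    return statistics.stdev(batch) <= max_stdev

def surrogate_score(batch):
    """Stand-in for a statistical or surrogate model evaluated at the HPC center."""
    return statistics.mean(batch)

def steer(score, target=1.0, tolerance=0.2):
    """Close the loop: derive a steering command to send back to the data source."""
    if abs(score - target) <= tolerance:
        return "hold"
    return "increase_exposure" if score < target else "decrease_exposure"

def run_pipeline():
    """Process each batch as it arrives and accumulate the steering decisions."""
    commands = []
    for batch in instrument_stream():
        if not quality_ok(batch):
            continue  # drop low-quality batches before analysis
        commands.append(steer(surrogate_score(batch)))
    return commands
```

In a production system each stage would be a distributed service (the stream arriving over ESnet, the model running on HPC nodes), but the control flow, analyze concurrently and feed decisions back upstream, is the same.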


The Superfacility Project

The Superfacility concept is a framework for integrating experimental and observational instruments with computational and data facilities. Data produced by light sources, microscopes, telescopes, and other devices can stream in real time to large computing facilities where it can be analyzed, archived, curated, combined with simulation data, and served to the science user community via powerful computing, storage, and networking systems. The NERSC Superfacility Project is designed to identify the technical and policy challenges in this concept for an HPC center. Contact: Debbie Bard

Toward Self-Guiding Field Laboratories to Support Water Management

In Earth and environmental sciences, applications such as resilient infrastructure, prediction of ecosystem responses to climate change and disturbance, and monitoring of energy resources need to collect heterogeneous data from field environments and infer scientific insights or make decisions based on model predictions. This project develops a foundation for self-guiding field laboratories (SGFLs) that enable adaptive measurements, driven in near real time by synthesizing lab and field observations with models. Contact: Yuxin Wu
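An adaptive measurement campaign of the kind described above can be illustrated with a minimal active-sampling loop: at each round, the next field measurement is taken where the model is least certain. This is a hypothetical sketch, not the project's actual method; `model_uncertainty` is a toy stand-in for a real predictive model, and sites are reduced to points on a line.

```python
def model_uncertainty(site, observations):
    """Toy predictive uncertainty that decays with the number of nearby
    observations; a hypothetical stand-in for a real environmental model."""
    nearby = sum(1 for s in observations if abs(s - site) < 1.0)
    return 1.0 / (1.0 + nearby)

def next_measurement(candidate_sites, observations):
    """Adaptive step: choose the site where the model is least certain."""
    return max(candidate_sites, key=lambda s: model_uncertainty(s, observations))

def campaign(candidate_sites, n_rounds=5):
    """Run a short campaign, recording which sites were measured in order."""
    observations = []
    for _ in range(n_rounds):
        site = next_measurement(candidate_sites, observations)
        observations.append(site)  # field sensors would report data here
    return observations
```

With three well-separated candidate sites, the loop first visits each site once before revisiting any of them, since every new observation lowers the uncertainty in its neighborhood.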


X-Ray Crystallography Goes Even Tinier

March 11, 2022

Supported by high-performance computing resources at NERSC, scientists at Berkeley Lab have debuted a new form of X-ray crystallography for small molecules not previously conducive to investigation with crystallography.

Collaboration Charts New Course for Future of Field Research Data

February 2, 2022

Spearheaded by DOE's ESnet, new investments in local 5G connectivity and remote satellite data backhaul are providing a high-tech road map for the future of data in field research.

Superfacility Model Brings COVID Research Into Real Time

February 8, 2021

Researchers at NERSC and the Linac Coherent Light Source at SLAC are collaborating to leverage the superfacility model for real-time data analysis in the worldwide quest to decipher the SARS-CoV-2 virus.

Superfacility Framework Advances Photosynthesis Research

May 2, 2019

For more than a decade, a team of international researchers led by Berkeley Lab bioscientists has been studying Photosystem II, a protein complex in green plants, algae, and cyanobacteria that plays a crucial role in photosynthesis. They’re now moving more quickly toward an understanding of this three-billion-year-old biological system, thanks to an integrated superfacility framework established between LCLS, ESnet, and NERSC.