Workflow and Data Management
Experimental, observational, and theory-based scientific data presents complex challenges in quality assurance, quality control, data integration, data reduction, analysis and visualization, real-time decision-making, human-in-the-loop, monitoring, and steering. To address these challenges, we create integrated systems to manage the data coming from instruments and computations. These systems perform streaming data analytics over high-speed wide-area networks such as the Energy Sciences Network (ESnet), processing data in real time at high-performance computing (HPC) facilities such as the National Energy Research Scientific Computing Center (NERSC). We employ statistical methods and/or surrogate models to process the data concurrently and close the loop to communicate back to the data source. To store and archive the data for offline analysis, we create intelligent and automated data- and resource-management architectures.
The Superfacility concept is a framework for integrating experimental and observational instruments with computational and data facilities. Data produced by light sources, microscopes, telescopes, and other devices can stream in real time to large computing facilities where it can be analyzed, archived, curated, combined with simulation data, and served to the science user community via powerful computing, storage, and networking systems. The NERSC Superfacility Project is designed to identify the technical and policy challenges in this concept for an HPC center. Contact: Debbie Bard
In Earth and environmental sciences, applications such as resilient infrastructure, predictions of ecosystem responses to climate change and disturbance, and monitoring of energy resources need to collect heterogeneous data from field environments and infer scientific insights or make decisions based on predictions generated by models. This project develops a solid foundation for self-guiding field laboratories (SGFL) to facilitate adaptive measurements driven in near-real-time by synthesizing lab and field observations with models. Contact: Yuxin Wu
Spearheaded by DOE's ESnet, new investments in local 5G connectivity and remote satellite data backhaul are providing a high-tech road map for the future of data in field research.
Researchers at NERSC and the Linac Coherent Light Source at SLAC are collaborating to leverage the superfacility model for real-time data analysis in the worldwide quest to decipher the SARS-CoV-2 virus.
For more than a decade, a team of international researchers led by Berkeley Lab bioscientists has been studying Photosystem II, a protein complex in green plants, algae, and cyanobacteria that plays a crucial role in photosynthesis. They’re now moving more quickly toward an understanding of this three-billion-year-old biological system, thanks to an integrated superfacility framework established between LCLS, ESnet, and NERSC.