HPDF is a partnership between Jefferson Lab in Newport News, Virginia, and Berkeley Lab in Berkeley, California.
The High Performance Data Facility (HPDF), a first-of-its-kind distributed initiative, will be a state-of-the-art user facility and resource for scientific research. The HPDF Project is sponsored and supported by the Advanced Scientific Computing Research (ASCR) program in the Department of Energy’s Office of Science (DOE SC). HPDF is a partnership between Thomas Jefferson National Accelerator Facility (Jefferson Lab) and Lawrence Berkeley National Laboratory (Berkeley Lab). The HPDF Hub infrastructure will be sited at each lab’s campus, providing resilient data infrastructure to the SC community. HPDF will be a DOE SC user facility that adds world-class capabilities to the ASCR and SC computing infrastructure ecosystem.
The mission of HPDF is to enable and accelerate scientific discovery by delivering expert data management infrastructure, capabilities, and tools. As part of the ASCR Facilities Ecosystem, HPDF will provide essential data management capabilities, including hardware, software, and experts, to manage the large-scale scientific data produced at DOE user facilities and beyond. HPDF will be a critical resource for enabling the frontier of AI-enabled scientific discovery.
The HPDF Project is an Infrastructure Partner (IP) in the American Science Cloud.
How HPDF Will Impact Scientific Communities
The HPDF Hub, a part of the HPDF Project, will be a cutting-edge high-performance data platform that provides advanced storage, processing, and analytical services, empowering data-intensive research and AI-driven discoveries.
HPDF will offer innovative production data services and software tools to support the entire data lifecycle. A key goal is facilitating data management and interoperability, i.e., making data available to a broad scientific community, providing for new technologies and usage patterns, and preserving data for future use. HPDF will incorporate core capabilities to spearhead data-driven scientific discovery and provide access to data through various services (e.g., data catalogs, search tools) and interfaces (e.g., AI agents, web, and application programming interfaces). The HPDF team will be able to support users with a variety of data lifecycle needs, including:
- Data storage, access, and discovery
- Data lifecycle tools and services
- Data preservation
- Seamless data and compute infrastructure
The HPDF Berkeley Lab Team
The HPDF team at Berkeley Lab is located in Shyh Wang Hall. Berkeley Lab’s HPDF team is committed to supporting scientific research patterns and data stewardship within the nation’s research communities.
The Berkeley HPDF team includes:
Leadership
- Ana Kupresanin, Deputy Project Director
- Becci Totzke, Deputy Project Manager
- Shane Canon, Deputy Technical Director
Support
- Jason Salinas, Administrative Support
- Ingrid Ockert, Communications
Technical Team
- Karlo Berket
- Shreyas Cholia
- Hannah Cohoon
- Dan Gunter
- Valerie Hendrix
- Patrick Huck
- Ben Maxwell
- Drew Paine
HPDF Project Status & Plan
This overview of the HPDF project explores the origins of the project and its mission. It also provides information about its intended physical design, infrastructure design, and scope.
Initial User Study Insights
The HPDF User Experience (UX) team conducted eight semi-structured interviews with twelve individuals leading data and computing infrastructure work at DOE Office of Science (SC) user facilities and projects. This report presents key takeaways synthesizing community perspectives and needs, along with recommendations for the project.
Community Requirements Meta-Analysis
This meta-analysis examines the needs of the broad SC community, as captured in publicly available community reports and mission documents. It identifies and provides an initial characterization of fifteen core requirements for the HPDF Project team to consider during the conceptual design phase.
