As scientific discovery becomes increasingly data-driven, the ability to manage, integrate, and effectively use scientific data has become a central challenge across research disciplines. On February 11, Berkeley Lab’s Scientific Data Division (SciData) brought together more than 70 researchers and data experts for a full day of panels and discussions on the infrastructure, practices, and collaborations needed to support the full life cycle of scientific data. Speakers came from across Berkeley Lab and beyond, including colleagues from UC Berkeley, Oak Ridge National Laboratory (ORNL), Sandia National Laboratories (SNL), Princeton Plasma Physics Laboratory (PPPL), and industry partners such as VAST Data. The workshop’s five panels spanned the scientific data life cycle, from building infrastructure to designing tools and experiences that better support scientific users.

SciData Director Ana Kupresanin opened the day with a welcome address that invited attendees to consider how advances in scientific data management, user experience, and AI tools can collectively accelerate scientific discovery. Throughout the day, the value of partnership and teamwork emerged as a recurring theme. SciData Deputy Director Lavanya Ramakrishnan moderated the opening panel, which focused on the practices and tools needed to build strong interdisciplinary partnerships. Panelists emphasized that successful partnerships rely not only on technical expertise, but also on mutual respect, shared vision, and close collaboration across roles.
A separate session focused on user experience (UX) in scientific software and data tools. Panelists described the importance of designing systems that scientists can use effectively and intuitively; they emphasized that strong UX work depends on close collaboration between designers, developers, and domain scientists. UX researcher Drew Paine reflected on the collaborative nature of the STRUDEL project, noting that UX work creates meaningful connections with scientists across disciplines.
Additional sessions explored data management, infrastructure, and “self-driving” science workflows, with panelists representing domains ranging from biology and networking to plasma physics. Shantenu Jha, head of PPPL’s Computing Sciences Department, highlighted the promise of AI-enabled workflows for accelerating complex simulations. “Plasma physics involves some of the most computationally intensive simulations,” he said, noting that even major increases in computing capacity may still fall short of the time and resolution scales required. AI-enabled HPC workflows, he explained, could help accelerate these calculations and expand the scope of fusion research.
Kjiersten Fagnan, Chief Information Officer at the Joint Genome Institute, underscored both the opportunities and the challenges that accompany the growing adoption of these new approaches across the national labs. “There is a proliferation of platforms, agents, and data sources all targeting scientists,” she observed. “Our challenge is to either create robust partnerships or outcompete these contenders by leveraging the collective creative intelligence of the national laboratories.”

The workshop also served as an informal send-off for Ramakrishnan, who is departing Berkeley Lab after more than 15 years. She joined the Lab in 2009 as an Alvarez Fellow and went on to lead numerous projects including STRUDEL, an open-source initiative that champions the integration of User Experience (UX) practices and tools into scientific software development. She also served as Deputy Project Director of the High Performance Data Facility. Throughout the day, colleagues reflected warmly on their time working alongside her. Former SciData Division Director Deb Agarwal praised Ramakrishnan as a foundational leader.
To learn more about the partnerships and projects in the SciData Division, please visit our Research Areas webpage.
About Computing Sciences at Berkeley Lab
High performance computing plays a critical role in scientific discovery. Researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab's Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.