The volume, veracity, and velocity of data generated by scientific tools have grown exponentially in the last decade. This boom has fundamentally changed the scientific workflows that run on high performance computing (HPC) systems.
Lavanya Ramakrishnan, senior scientist and deputy director of Berkeley Lab's Scientific Data Division, thinks a lot about workflows and the data lifecycle challenges on HPC systems, including using the storage hierarchy effectively, managing complex scientific data processing, and enabling search on large-scale scientific data.
In a June 3 keynote at the 25th annual Workshop on Job Scheduling Strategies for Parallel Processing, Ramakrishnan explained that new workflow characteristics arising from data affect the design of next-generation infrastructure. On the same day, Ramakrishnan also presented on data lifecycle challenges in HPC in an invited talk at ESSA 2022, the 3rd Workshop on Extreme-Scale Storage and Analysis.
About Computing Sciences at Berkeley Lab
High performance computing plays a critical role in scientific discovery. Researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab's Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.