
A Comprehensive Look at High Performance Parallel I/O

Book Signing @ SC14! Nov. 18, 5 p.m. in Booth 1939

October 31, 2014

Contact: Linda Vu, +1 510 495 2402, lvu@lbl.gov


In the 1990s, high performance computing (HPC) made a dramatic transition to massively parallel processors. As this model solidified over the next 20 years, supercomputing performance increased from gigaflops—billions of calculations per second—to petaflops—quadrillions of calculations per second—generating a tsunami of data along the way.

In this era of “big data,” high performance parallel I/O—the way disk drives efficiently read and write information on HPC systems—is extremely important. Yet, the last book to summarize best practices in this area was written more than 10 years ago. To fill the void, Prabhat of the Lawrence Berkeley National Laboratory (Berkeley Lab) and Quincey Koziol of the HDF Group brought together leading practitioners, researchers, software architects, developers and scientists to contribute their insights for a new book called “High Performance Parallel I/O.”

The book was published on October 24. The editors will sign copies and answer questions in the Department of Energy’s booth (#1939) at SC14 (Supercomputing Conference, 2014) on Tuesday, November 18, from 5:15 to 6:00 p.m.

“This book provides a survey of the significant accomplishments in file systems, libraries and tools that have been developed for about a decade. These developments have now reached a state of relative maturity and are ripe for a treatment in book format,” writes Horst Simon, Deputy Director of Berkeley Lab, in the book’s foreword. “I highly recommend this timely book for computational scientists and engineers.”

According to Prabhat, the book is divided into six parts. The first part explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O hardware, middleware, and applications. The book then moves up the I/O software stack: the second part covers the file system layer, and the third part discusses middleware and user-facing libraries.

He notes that the fourth part delves into real-world scientific applications that use the parallel I/O infrastructure, presenting case studies from particle-in-cell, stochastic, finite volume, and direct numerical simulations. The fifth part gives an overview of various profiling and benchmarking tools used by practitioners. The final part addresses the implications of current trends in HPC on parallel I/O in the exascale world.

“This book is a soup to nuts collection of everything you need to know about high performance parallel I/O, from hardware all the way up into the software layers and future work,” says Koziol.

For more information about the book: http://www.crcpress.com/product/isbn/9781466582347

About the Editors:

Prabhat leads the NERSC Data and Analytics Services Group at Berkeley Lab. His main research interests include Big Data analytics, scientific data management, parallel I/O, HPC, and scientific visualization. He is also interested in atmospheric science and climate change. 


Quincey Koziol is the director of core software development and HPC at The HDF Group, where he leads the HDF5 software project. His research interests include HPC, scientific data storage, and software engineering and management.

About Computing Sciences at Berkeley Lab

The Computing Sciences Area at Lawrence Berkeley National Laboratory (Berkeley Lab) provides the computing and networking resources and expertise critical to advancing Department of Energy Office of Science (DOE-SC) research missions: developing new energy sources, improving energy efficiency, developing new materials, and increasing our understanding of ourselves, our world, and our universe. ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities. NERSC and ESnet are both Department of Energy Office of Science National User Facilities. The Computational Research Division (CRD) conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation.

Berkeley Lab addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has had its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science. The DOE Office of Science is the United States' single largest supporter of basic research in the physical sciences and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.