Computing Sciences Summer Students: 2018 Talks & Events
Who: Steve Leak and Brandon Cook
When: June 07, 10:00 a.m. - 11:30 a.m.
Abstract: What is High-Performance Computing (and storage!), or HPC? Who uses it and why? We'll talk about these questions as well as what makes a Supercomputer so super and what's so big about scientific Big Data. Finally we'll discuss the challenges facing system designers and application scientists as we move into the many-core era of HPC. After the presentation, we'll take a tour of the NERSC machine room. (Closed-toe shoes are REQUIRED for the machine room tour.)
Bios: Steve Leak joined the NERSC User Engagement team in 2016 and focuses on application performance and the user experience of the HPC systems, including support, documentation, and the batch system. His history includes batch system and user support for HPC at New York University, scientific application debugging and performance tuning at Australia's CSIRO, and pre-sales benchmarking and tuning for NEC vector supercomputers. Brandon Cook is an HPC consultant in the Application Performance Group at NERSC. He serves as the liaison to several teams preparing for the new Cori machine architecture at NERSC, and is also active in analyzing jobs and engaging NERSC users through consulting. He earned his PhD in physics from Vanderbilt University in 2012, where he studied ab initio methods for quantum transport in nanomaterials. Before joining NERSC he was a postdoc at Oak Ridge National Laboratory, where he developed and applied electronic structure methods to problems in materials science.
Who: Helen He
When: June 11, 2:00 p.m. - 4:00 p.m.
Where: Bldg. 54, Room 130 (Perseverance Hall)
Abstract: This class will provide an informative overview to acquaint students with the basics of the NERSC computational systems and their programming environment. Topics include: systems overview, connecting to NERSC, software environment, file systems and data management/transfer, and available data analytics software and services. Details on how to compile applications and run jobs on NERSC's Cori and Edison systems will be presented, including hands-on exercises. The class will also showcase various online resources available on the NERSC web pages. (Students should bring their laptops.)
Bio: Helen is a High Performance Computing consultant in the User Engagement Group at NERSC. For the past 10 years she has been the main point of contact among users, systems staff, and vendors for the Cray XT4 (Franklin), XE6 (Hopper), and XC40 (Cori) systems at NERSC. Helen has investigated how large-scale scientific applications can be run effectively and efficiently on massively parallel supercomputers, designing parallel algorithms and developing and implementing computing technologies for science applications. She provides support for climate users, and her experience includes software programming environments, parallel programming paradigms such as MPI and OpenMP, porting and benchmarking scientific applications, distributed component coupling libraries, and climate models.
Who: Julian Borrill
When: June 14, 11 a.m. - 12 p.m.
Abstract: The Cosmic Microwave Background (CMB) is the last echo of the Big Bang, and carries within it the imprint of the entire history of the Universe. Decoding this preposterously faint signal requires us to gather ever-increasing volumes of data and reduce them on the most powerful high performance computing (HPC) resources available to us at any epoch. In this talk I will describe the challenge of CMB analysis in an evolving HPC landscape.
Bio: Julian Borrill is the Group Leader of the Computational Cosmology Center at LBNL. His work is focused on developing and deploying the high performance computing tools needed to simulate and analyse the huge data sets being gathered by current Cosmic Microwave Background (CMB) polarization experiments, and extending these to coming generations of both experiment and supercomputer. For the last 15 years he has also managed the CMB community and Planck-specific HPC resources at the DOE's National Energy Research Scientific Computing Center.
Who: Rebecca Hartman-Baker and Zhengji Zhao
When: June 15, 9:00 a.m. - 12:00 p.m. & 2:00 p.m. - 5:00 p.m.
Abstract: In this course, students will learn to write parallel programs that can be run on a supercomputer. We begin by discussing the concepts of parallelization before introducing MPI and OpenMP, the two leading parallel programming libraries. Finally, the students will put together all the concepts from the class by programming, compiling, and running a parallel code on one of the NERSC supercomputers. (Students should bring their laptops.)
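For a flavor of the style of programming the course teaches, here is a small illustrative sketch. It uses Python's standard multiprocessing module rather than MPI or OpenMP themselves (so it runs on a laptop with no extra libraries), but the core idea is the same one the class builds on: split the work into chunks, compute the chunks in parallel, and combine the partial results.

```python
from multiprocessing import Pool

def partial_sum(bounds):
    # Each worker computes the sum over its own chunk of the range
    lo, hi = bounds
    return sum(range(lo, hi))

def parallel_sum(n, workers=4):
    # Split [0, n) into roughly equal chunks, one per worker
    step = (n + workers - 1) // workers
    chunks = [(i, min(i + step, n)) for i in range(0, n, step)]
    # Farm the chunks out to worker processes, then combine partial results
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum(1000))  # same answer as sum(range(1000)), i.e. 499500
```

In MPI the chunks would be distributed across ranks (possibly on different nodes) and combined with a reduction such as MPI_Reduce; in OpenMP the loop would be shared among threads within a node.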
Bios: Rebecca Hartman-Baker is the acting leader of the User Engagement Group at NERSC. She got her start in HPC as a graduate student at the University of Illinois, where she worked as a graduate research assistant at NCSA. After earning her PhD in Computer Science, she worked at Oak Ridge National Laboratory in Tennessee and the Pawsey Supercomputing Centre in Australia before joining NERSC in 2015. Rebecca's expertise lies in the development of scalable parallel algorithms for the petascale and beyond. Zhengji Zhao joined the NERSC Division after working for three years as a postdoc in the Computational Research Division, where she developed two new methods for computational nanoscience: the linear scaling 3D fragment (LS3DF) method for large-scale electronic structure calculations, and a new and fast motif-based Hessian matrix method to estimate a preconditioner for nanostructures. Zhengji received her Ph.D. in computational physics from New York University for developing the reduced density matrix (RDM) method, a highly accurate alternative to wavefunction-based computational chemistry methods.
Who: Jonathan Carter
When: June 22, 11:00 a.m. - 12:00 p.m.
Abstract: During the poster session on August 3rd, members of our summer visitor program will get the opportunity to showcase the work and research they have been doing this summer. Perhaps some of you have presented posters before, perhaps not. This talk will cover the basics of poster presentation: designing an attractive format; how to present your information clearly; what to include and what not to include. Presenting a poster is different from writing a report or giving a presentation. This talk will cover the differences, and suggest ways to avoid common pitfalls and make poster sessions work more effectively for you.
Bio: Before assuming the Deputy role for CS, Dr. Carter was leader of the NERSC User Services Group (USG). He joined NERSC as a consultant in USG at the end of 1996, helping users learn to use the computing systems effectively, and became leader of USG at the end of 2005. Carter maintains an active interest in algorithms and architectures for high-end computing, and participates in benchmarking and procurement activities to deploy new systems for NERSC. In collaboration with the Future Technologies Group in CRD and the NERSC Advanced Technology Group, he has published several architecture evaluation studies and examined what it takes to move common simulation algorithms to exotic architectures. His applications work on the Japanese Earth Simulator earned him a nomination as a Gordon Bell Prize finalist in 2005. Prior to LBNL, Dr. Carter worked at the IBM Almaden Research Center as a developer of computational chemistry methods and software, and as a researcher of chemical problems of interest to IBM. He holds a Ph.D. and B.S. in chemistry from the University of Sheffield, UK, and performed postdoctoral work at the University of British Columbia, Canada.
Who: Prabhat
When: June 26, 11:00 a.m. - 12:00 p.m.
Abstract: This talk will review progress in Artificial Intelligence (AI) and Deep Learning (DL) systems in recent decades. We will cover successful applications of DL in the commercial world. Closer to home, we will review NERSC’s efforts in deploying DL tools on HPC resources, and success stories across a range of scientific domains. We will touch upon the frontier of open research/production challenges and conjecture about the role of humans (vis-a-vis AI) in the future of scientific discovery.
Bio: Prabhat leads the fantastic Data and Analytics Services team at NERSC.
Who: Mariam Kiran
When: June 28, 11 a.m. - 12 p.m.
Abstract: Machine learning is bringing innovative solutions to self-driving cars, drones, and more. Largely through industry-led efforts, products such as Amazon's Alexa and Apple's Siri use artificial intelligence to tailor themselves to the needs of their users. Wide area networks, on the other hand, are very complex systems involving many actors, users, and instruments. We have been exploring what machine learning is and how its algorithms can be applied to wide area networks.
Bio: Mariam’s research focuses on learning and decentralized optimization of system architectures and algorithms for high performance computing, underlying networks, and Cloud infrastructures. She has been exploring various platforms such as HPC grids, GPUs, Cloud, and SDN-related technologies. Her work involves optimizing QoS and performance using parallelization algorithms and software engineering principles to solve complex data-intensive problems such as large-scale simulations. Over the years, she has worked with biologists, economists, and social scientists, building tools and optimizing architectures for problems in their domains.
Who: Danielle Christianson
When: July 10, 11:00 a.m. - 12:00 p.m.
Abstract: Will tropical forests alleviate or exacerbate global environmental change? Will California’s Central Valley be able to support agricultural production into the future? How will the changing environment affect fresh water supplies for US cities? To study such earth science questions, researchers collect a wide variety of observations, from satellite imagery of vegetation change across the globe to CO2 concentrations in almond orchards to soil microbe composition in a mountain meadow. Furthermore, such complex questions require that teams of scientists from different disciplines collaborate and share data. Organizing these diverse data into comparable and sharable formats is both a technical and a human challenge. In this talk, examples of data management frameworks and tools from the NGEE Tropics and AmeriFlux projects are used to illustrate the advances needed in computer science to facilitate better earth science research.
Bio: Danielle is a postdoctoral scholar in the Computational Research Division at Berkeley Lab. She develops data management tools for earth system science research. Danielle completed her PhD in the Energy and Resources Group and Berkeley Center for New Media at UC Berkeley where she studied the variability and effects of microclimate on forests in mountain systems. She also studies the influences of digital technologies on earth science research practice. Before studying at Berkeley, Danielle worked at NASA’s Jet Propulsion Laboratory designing miniature chemical sensors for extraterrestrial life detection.
Who: Suren Byna
When: July 12, 11:00 a.m. - 12:00 p.m.
Abstract: Science is driven by massive amounts of data. This talk will review data management techniques used on large-scale supercomputing systems. The topics include: I/O software stack for storing and loading data to and from parallel file systems, querying data using array abstractions, and ongoing research in object storage for supercomputing systems.
Bio: Suren Byna is a Staff Scientist in the Scientific Data Management (SDM) Group in the Computational Research Division at LBNL. His research interests are in scalable scientific data management; more specifically, he works on optimizing parallel I/O and on developing systems for managing scientific data. He is the PI of the ECP-funded ExaHDF5 project and of the ASCR-funded object-centric data management systems (Proactive Data Containers - PDC) and experimental and observational data management (EOD-HDF5) projects.
Who: Juliane Mueller
When: July 19, 11 a.m. - 12 p.m.
Abstract: Simulation models are used by scientists in various disciplines, e.g., climate, cosmology, engineering, materials science, to approximate physical phenomena that could otherwise not be studied. Every simulation model contains parameters that must be optimized with some objective in mind. For example, the goal (the objective function) might be to tune the parameters such that the error between the simulation model’s output and the observation data is minimized or such that the resulting design of an airfoil maximizes its lift. However, the difficulty is that simulation models take a long time to run and we generally do not have an algebraic description of the objective function. This makes the parameter optimization task extremely challenging. In this talk, I will give a high level overview of how these problems can be efficiently and effectively solved using surrogate models and, if time allows, I will briefly talk about some application problems.
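The loop described above can be sketched in a few lines. This toy Python example (the quadratic `expensive_sim` is a hypothetical stand-in for a real long-running simulation) fits a cheap quadratic surrogate to the points sampled so far, minimizes the surrogate instead of the simulation, and spends each expensive evaluation only at the surrogate's predicted best point:

```python
import numpy as np

def expensive_sim(x):
    # Hypothetical stand-in for a long-running simulation model;
    # its true minimum is at x = 1.7
    return (x - 1.7) ** 2 + 0.5

def surrogate_optimize(n_iters=5):
    # 1. Evaluate the expensive model at a few initial points
    X = np.array([0.0, 1.0, 3.0])
    y = np.array([expensive_sim(xi) for xi in X])
    grid = np.linspace(0.0, 3.0, 1001)
    for _ in range(n_iters):
        # 2. Fit a cheap quadratic surrogate to all samples so far
        coeffs = np.polyfit(X, y, 2)
        # 3. Minimize the surrogate on a dense grid (cheap: no simulation runs)
        x_next = grid[np.argmin(np.polyval(coeffs, grid))]
        # 4. Spend one expensive evaluation at the surrogate's minimizer
        X = np.append(X, x_next)
        y = np.append(y, expensive_sim(x_next))
    # Return the best point actually evaluated by the expensive model
    return X[np.argmin(y)]
```

Real surrogate-based optimizers use richer models (e.g., radial basis functions or Gaussian processes) and balance exploiting the surrogate's minimum against exploring uncertain regions, but the budget-saving structure is the same.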
Bio: Juliane Mueller has been a research scientist in the Computational Research Division at Berkeley Lab since 2017. Her work focuses on the development of algorithms for difficult optimization problems. Juliane received her Ph.D. in Applied Mathematics from Tampere University of Technology in 2012, after which she did a postdoc at Cornell University. Juliane joined Berkeley Lab as an Alvarez Fellow in 2014.
Who: Hari Krishnan
When: July 26, 11 a.m. - 12 p.m.
Abstract: Large Department of Energy (DOE) research projects often impose complex and stringent constraints on resources, whether human or computational. Handling data acquired from experimental, observational, or simulated sources and running analysis on heterogeneous architectures requires a comprehensive understanding of the analysis and data-movement process. This talk will highlight examples from two DOE research programs: Environmental Management and the Advanced Light Source. These two domains illustrate the diverse challenges science communities face, which require very different engineering frameworks to address effectively. The presentation will cover solutions to several technical challenges encountered while understanding and developing end-to-end pipelines, including issues with data movement, types of analysis, reconstruction, visualization, and data access, while remaining focused on delivering usable scientific results.
Bio: Hari Krishnan has a Ph.D. in Computer Science and works as a computer systems engineer in the Visualization and Graphics Group at Lawrence Berkeley National Laboratory. His research focuses on scientific visualization on HPC platforms and many-core architectures. He leads the development effort on several HPC-related projects, including research on new visualization methods, optimizing scaling and performance on NERSC machines, working on data-model-optimized I/O libraries, and enabling remote workflow services. As a member of the Center for Advanced Mathematics for Energy Research Applications (CAMERA), he supports development of its software infrastructure and works on accelerating image analysis algorithms and reconstruction techniques. He is also an active developer of several major open source projects, including VisIt, NiCE, and H5hut, and has developed plugins that support performance-centric, scalable image filtering and analysis routines in Fiji/ImageJ.
Who: Laurie Chong
When: July 19, 3:00 p.m. - 4:00 p.m.
Where: Molecular Foundry (MF) Facility, building 67
The MF is a nanoscience research facility that provides users access to cutting-edge expertise and sophisticated instrumentation in a collaborative and multidisciplinary environment. Research done at the MF ranges from the mapping of battery materials with atomic precision to the structure of proteins that help injured lungs breathe.
Who: Michael Banda, Simon Leeman and Jonah Weber
When: July 25, 3:00 p.m. - 4:00 p.m.
Where: Advanced Light Source (ALS) Facility, building 6
The ALS is a third-generation synchrotron facility that hosts experiments in a wide variety of fields. Research done at the ALS ranges from the determination of molecular structures of human antibodies bound to a respiratory virus protein to the chemistry of the durable concrete used by Romans 2000 years ago.