Special Feature: Five Questions for Sudip Dosanjh
September 27, 2013
Sudip Dosanjh is Director of the National Energy Research Scientific Computing (NERSC) Center at Lawrence Berkeley National Laboratory. NERSC's mission is to accelerate scientific discovery at the U.S. Department of Energy's Office of Science through high performance computing and extreme data analysis. NERSC deploys leading-edge computational and data resources for over 4,700 users from a broad range of disciplines. As a result of this support for science, NERSC users published 1,900 research papers in 2012. To continue its mission, NERSC will be partnering with computer companies to develop and deploy pre-exascale and exascale systems during the next decade.
Previously, Dosanjh headed extreme-scale computing at Sandia National Laboratories. He was co-director of the Los Alamos/Sandia Alliance for Computing at the Extreme-Scale from 2008 to 2012. He also served on the U.S. Department of Energy's Exascale Initiative Steering Committee for several years. Dosanjh had a key role in establishing co-design, a process in which hardware architects and developers of scientific applications work together, as a methodology for reaching exascale computing. He has numerous publications on exascale computing, co-design, computer architectures, massively parallel computing and computational science. He earned his bachelor’s degree in engineering physics in 1982, his master’s degree (1984) and Ph.D. (1986) in mechanical engineering, all from the University of California, Berkeley.
In conjunction with the U.S. Department of Energy’s focus on supercomputing in the month of September, Dosanjh answers five questions about supercomputers, the rewards and challenges of running NERSC, and the return on DOE's investment in supercomputing.
Question: So what exactly is a supercomputer?
Sudip Dosanjh: When most people think of a supercomputer, they probably envision a very large system consisting of racks of processors and the attendant cabling, data storage, etc. And that is one definition. At NERSC, we look at supercomputers as tools for discovery. In this light, as with most tools, you really need skilled people to wield the tool so that it works at the highest efficiency and productivity. In our case, we have a very talented staff that can deploy and manage these very large machines, as well as provide the support, software and services to our 4,700 users. And while we do this day in and day out, we are also continuously looking ahead to provide the next generation of equipment and expertise. So, I guess I would say a supercomputer is a system of hardware, software and support all working at peak efficiency to advance science.
Q: Why should people be interested in supercomputing?
SD: There are two sides to this answer. The first is that many of the things we rely on in our everyday lives have been developed and made safer and better through the use of supercomputers. For example, automobile manufacturers have long used supercomputers to improve the safety of vehicles in a crash. The data collected by crash test dummies and other sensors is fed into models to improve the safety design of cars. And those designs can be “crashed” and tweaked, then crashed and tweaked again on supercomputers, which is much cheaper than building and crashing real cars. Supercomputers are helping design new airplanes, develop new materials for energy conversion and water desalination, and create better consumer products. Supercomputers are also helping us make longer-range, more accurate weather forecasts.
From the Department of Energy perspective, we produce insight into complex problems facing society. We advance combustion research to improve the efficiency and reduce the level of pollution – and these results are made public so carmakers and aircraft firms can use that knowledge to make vehicles that go farther on less fuel, and are cleaner and quieter. We are also helping scientists design cleaner, renewable energy sources for the future. And better batteries that are lighter, but can store more energy longer. Our support for genomics research can help scientists learn why some people are more likely to get a disease than others. Researchers are studying global climate change – how is our climate changing and what can be done about it. The bottom line is that since NERSC is a government-funded research facility, all of the knowledge gained by our users benefits the public in one way or another.
Q: As head of the Department of Energy’s most scientifically productive supercomputing center, what do you find most rewarding? Most challenging?
SD: The biggest reward is seeing the results of our staff’s support of users, and those come in the form of scientific breakthroughs. This is both a challenge and a reward in that we have the largest user base of any DOE supercomputing center – 4,700 users running 700 different applications to conduct research on 600 different projects. The reward is seeing their work published in peer-reviewed journals – 1,900 papers and 17 journal cover stories in 2012. And those papers put the knowledge and insight gained by our users into the public discussion so that others may also learn from it.
I would say that one of the biggest challenges is meeting the demand for time on our supercomputers. As more and more scientists realize the value of computing, whether through modeling complex processes or analyzing massive datasets from experiments, there are more requests for more resources. We just can’t accommodate them all, and I think the situation is similar at other centers. What we can do, however, is work with our users to help them make the most efficient use of their allocations so they can get the most science from them.
We hold regular requirements reviews with our user community so we can better plan how we will meet their supercomputing needs. Later this decade, the aggregate needs of our users will surpass exascale (that is, a billion billion floating point operations per second). It will be an extreme challenge for us to meet their needs.
Q: You have led a number of efforts to plot a roadmap to the future of supercomputing. What are the drivers for this? Where do you think we will end up in 10 years?
SD: The last major disruption in supercomputing occurred in the early 1990s when massively parallel computers with thousands of processors replaced vector supercomputers. There are similarities and differences between what happened then and the challenges facing us now. The fundamental building blocks we use, microprocessors and memory, are rapidly changing due to energy constraints. Parallelism in supercomputers is escalating rapidly with the use of manycore microprocessors and GPUs. Microprocessors in the next decade will have as much concurrency as the first massively parallel supercomputers. Another challenge is the continuing mismatch between our ability to compute and move data, the so-called “memory wall.” Many people have said that floating point operations are “free,” relative to the cost of moving data. These challenges are making it more difficult to program many supercomputers. This is an important issue for NERSC because of our broad user base and the large number of codes that run on our systems. We need to collaborate with computer companies to produce systems that are energy-efficient and highly programmable.
Another key issue facing NERSC is the growing importance of data in scientific discovery. Over the last four years, our users have transferred more data to NERSC than they have transferred out of NERSC. This data is largely from experiments and measurements. Our users move, store, share and analyze genomic, climate and cosmology data as well as information from large experimental facilities like accelerators and light sources. Many of the challenges facing computational simulation and extreme data are very similar. However, I do expect us to deploy supercomputers specifically designed for extreme data science.
Q: What does the Department of Energy get as a return on its investment in supercomputing?
SD: As I mentioned earlier, it’s the nation as a whole (and the world, by extension) that reaps the benefit of this investment. We provide unique resources to a very broad user community – half of them are at universities that typically cannot afford to deploy such computing resources. And we are supporting advances in many areas of science, though I only gave a few examples. But we are also advancing the state of supercomputing itself. Many of the approaches we have pioneered at NERSC have been adopted and adapted by other supercomputing centers to benefit their user communities. What I think the nation gets in return for investing in supercomputing is high quality, innovative research that plays a critical role in maintaining our country’s scientific and economic leadership.
About Computing Sciences at Berkeley Lab
High performance computing plays a critical role in scientific discovery. Researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab’s Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.