InTheLoop | 10.29.2012
October 29, 2012
ESnet Revving Up to Unleash 100 Gbps National Science Network
In late 2012, ESnet will take the next step, upgrading its network to 100 gigabits per second (Gbps), replacing the multiple 10 Gbps links now making up the network’s backbone. Under the current schedule, the switchover to ESnet5 will be done in November 2012, with different regions being migrated in sequence. Upon its completion, ESnet5 will be the world's fastest science network. Read more.
Architects of the Exascale Enlist Proxy Apps to Mimic Big Codes
Achieving exascale computing will depend on many interacting parts, according to an article in ASCR Discovery. That requirement spawned the idea of co-design, in which developers creating exascale architectures consider both the scientific problems to be solved and how those architectures affect software design. To proceed with this approach, the Department of Energy (DOE) created three co-design centers, one each for materials, nuclear energy and combustion.
A key part of co-design resides in what are known as proxy applications, or proxy apps. A proxy app generally stands for something bigger than itself. It’s a piece of code stripped of as many nonessential lines as possible that mimics some feature of a large program. Alice Koniges of NERSC is one of the researchers quoted in the article. Read more.
Proxy apps were also on the agenda at the ASCR Exascale Research Conference that was held October 1–3 in Arlington, VA. CS staff who contributed to that conference include John Bell, Wes Bethel, Hank Childs, Sudip Dosanjh, Steve Hofmeyr, Costin Iancu, Alice Koniges, Lenny Oliker, Prabhat, John Shalf, Arie Shoshani, Brian van Straalen, Gunther Weber, Sam Williams, John Wu, and Kathy Yelick.
CRD’s Sean Peisert Shares Cyber Security Expertise at I3P Meeting
Sean Peisert, a research scientist in Berkeley Lab's Computational Research Division, recently gave a talk at the 10th Anniversary of the Institute for Information Infrastructure Protection (I3P), held Oct. 10 at the National Press Club in Washington, D.C. Peisert discussed the impact of I3P, a consortium of leading universities, national laboratories, and nonprofit institutions dedicated to strengthening the cyber infrastructure of the United States. The video “I3P: Ten Years of Impact,” which includes Peisert’s talk, is available here.
Peisert, a computer security researcher, has been involved with the I3P since 2007, when he was awarded a $150,000 I3P Research Fellowship. Today he continues that involvement as one of LBNL's two representatives to the I3P, along with Deb Agarwal. At the anniversary event, Peisert spoke about his views of the role of the I3P, as well as how the I3P has impacted his own research at LBNL over time. In addition to Peisert, other CRD researchers who have been I3P Fellows include Robin Sommer (2006) and Sean Whalen (2010–11).
CRD, NERSC Researchers Contribute to APS Plasma Physics Meeting
The 54th Annual Meeting of the American Physical Society (APS) Division of Plasma Physics is being held this week, October 29–November 2, in Providence, Rhode Island. Contributions from CRD and NERSC researchers include:
- Phillip Colella (co-author): Continuum Kinetic Plasma Modeling Using a Conservative 4th-Order Method with AMR (poster)
- Phillip Colella, Daniel Martin, Peter McCorquodale, and co-authors: Axisymmetric Modeling of a Tokamak Edge with the Continuum Gyrokinetic Code COGENT (poster)
- Alice Koniges, Xuefei Yuan, Wangyi Liu, and co-authors: Enabling Fusion Codes for Upcoming Exascale Platforms (poster)
- Harinarayan Krishnan (co-author): Finite Time Lyapunov Exponents for magnetically confined plasmas (poster)
- Wangyi Liu, Alice Koniges, and co-authors: Using a Korteweg-Type Model for Modeling Surface Tension and Its Applications (poster)
- Burlen Loring (co-author): New Breakthroughs and Challenges in Kinetic Simulations of the Magnetosphere (mini-conference)
- Burlen Loring (co-author): Quantifying Properties of Collisionless Turbulence Through Wavelet Analysis (poster)
- Burlen Loring (co-author): Electron Kelvin-Helmholtz Instability and Generation of Demagnetized Electron Rings (poster)
CRD Researchers Contribute to AIChE Annual Meeting
The American Institute of Chemical Engineers (AIChE) is holding its annual meeting this week, October 28–November 2, in Pittsburgh, PA. Several CRD researchers co-authored papers that will be presented:
- Deb Agarwal (co-author): Enforcing Elemental Mass and Energy Balances for Reduced Order Models Generated from CFD Simulations
- Richard Martin, Jihan Kim, Christopher Rycroft, Maciej Haranczyk, and co-authors: Large-Scale Computational Screening of Adsorbent Materials for Carbon Capture
- Maciej Haranczyk (co-author): Integrating the Carbon Capture Materials Database with the Process Simulation Tools of the Carbon Capture Simulation Initiative
Alice Koniges Named Associate Editor of Journal of HPC Applications
Alice Koniges of NERSC has been named an Associate Editor of the International Journal of High Performance Computing Applications. The goal of the journal is to provide a forum for the communication of original research papers and timely review articles on the use of high performance computers to solve complex modeling problems across a spectrum of disciplines. The emphasis is on experiences with the use of high performance computers. Berkeley Lab Deputy Director Horst Simon is also on the journal’s editorial board.
This Week’s Computing Sciences Seminars
Par Lab Talk: Can GPGPU Programming be Liberated from the Data-Parallel Bottleneck?
Monday, October 29, 2:00–3:30 pm, Soda Hall, Wozniak Lounge, UC Berkeley
Benedict Gaster, AMD
With the success of programming models such as Khronos' OpenCL and NVIDIA's CUDA, heterogeneous computing is going mainstream. However, these systems are low-level, even when considering them as systems programming models. They are effectively extended subsets of C99, limited to the type-unsafe procedural abstraction that C has provided for more than 30 years. Computer systems programming has for more than two decades been able to do a lot better. A further limitation of the OpenCL/CUDA programming models is that to date they have really reflected the GPU programming model of the previous decade: they have focused on fine-grained data-parallel workloads. What about other workloads, e.g. task-parallel workloads? In this talk we introduce a model of braided parallelism and an object-oriented (based on C++11) programming model for heterogeneous computing. The aim is to liberate GPGPU programming from its data-parallel bottleneck while at the same time adopting modern programming abstractions.
Chisel: Constructing Hardware in a Scala Embedded Language
Tuesday, October 30, 1:00–2:00 pm, 50F-1647
Jonathan Bachrach, University of California, Berkeley
Chisel is a new open-source hardware construction language developed at UC Berkeley that supports advanced hardware design using highly parameterized generators and layered domain-specific hardware languages. Chisel is embedded in the Scala programming language, which raises the level of hardware design abstraction by providing concepts including object orientation, functional programming, parameterized types, and type inference. Chisel can generate a high-speed C++-based cycle-accurate software simulator, or low-level Verilog designed to pass on to standard ASIC or FPGA tools for synthesis and place and route.
Model Reduction of Nonlinear Parametric Dynamical Systems Using Fast Local Reduced-Order Bases Updates
Tuesday, October 30, 3:00–4:00 pm, 540A/B Cory Hall, UC Berkeley
David Amsallem, Stanford University
A new approach for nonlinear parametric model reduction based on the concept of local reduced-order bases will be presented. This methodology is particularly suited for problems characterized by different physical regimes or moving features such as discontinuities and fronts.
Instead of searching for a reduced-order model (ROM) solution as a linear combination of global basis vectors, the proposed methodology chooses to express the ROM solution in terms of local basis vectors. This results in the use of smaller reduced bases at a time, appropriate for each physical regime.
The solution space is partitioned off-line into sub-regions using a clustering algorithm, each region having a local basis assigned to it. During the on-line ROM simulation, the choice of local basis to be used is dictated by the solution space sub-region the solution is currently in. It will be shown that determining that sub-region can be achieved very efficiently, with a cost that does not scale with the dimension of the full solution space.
In order to achieve large speedups when running a ROM for a nonlinear system, it is necessary to use a hyper-reduction technique, so that the cost of solving the ROM does not scale with the size of the underlying high-fidelity model. In this work, the concept of a *local* hyper-reduction technique is introduced.
Finally, it will be demonstrated that the accuracy of the nonlinear ROM can be improved on-line by updating the local basis when switching bases, using an efficient SVD update algorithm. In the case of hyper-reduction, this update can be efficiently computed using an approximated metric determined off-line by the solution of a semi-definite programming problem.
The proposed methodology is applied to the prediction of the in-flight behavior of a commercial aircraft and the simulation of the behavior of a nonlinear parameterized micro-electro-mechanical device, demonstrating the potential of the method for achieving large speedups and good accuracy.
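The off-line/on-line split described in this abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the function names are hypothetical, plain k-means stands in for the unspecified clustering algorithm, the local bases are ordinary truncated-SVD (POD) bases, and the on-line selection computes distances in the full space rather than using the efficient scheme mentioned above. Hyper-reduction and SVD basis updates are left out entirely.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def build_local_bases(snapshots, n_clusters, basis_size, n_iter=20):
    """Off-line stage (hypothetical helper): cluster the snapshot columns,
    then compute one truncated-SVD basis per cluster.
    Assumes every cluster keeps at least one snapshot."""
    X = snapshots.T.astype(float)          # one snapshot per row
    # Deterministic farthest-point initialisation of the centroids.
    centroids = [X[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    # Plain k-means iterations.
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    labels = assign(X, centroids)
    # One local basis per sub-region of the solution space.
    bases = []
    for k in range(n_clusters):
        members = X[labels == k].T
        U, _, _ = np.linalg.svd(members, full_matrices=False)
        bases.append(U[:, :basis_size])
    return centroids, bases

def reconstruct(state, centroids, bases):
    """On-line stage (hypothetical helper): pick the local basis of the
    nearest centroid and project the state onto it."""
    k = int(np.linalg.norm(centroids - state, axis=1).argmin())
    V = bases[k]
    return k, V @ (V.T @ state)
```

With snapshots drawn from two clearly separated regimes, `build_local_bases(S, n_clusters=2, basis_size=2)` yields two small bases, and `reconstruct` switches between them depending on which regime the current state belongs to.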
DREAM Seminar: Beyond the Hill of Multicores Lies the Valley of Accelerators
Tuesday, October 30, 4:10–5:00 pm, 540A/B Cory Hall, UC Berkeley
Aviral Shrivastava, UC Berkeley
The power wall has resulted in a sharp turn in processor designs, and they irrevocably went multi-core. Multi-cores are good because they promise higher potential throughput (and never mind the actual performance of your applications). This is because the cores can be made simpler and run at lower voltage, resulting in much more power-efficient operation. Even though the performance of a single core is much reduced, the total possible throughput of the system scales with the number of cores. However, the excitement of multi-core architectures will only last so long. This is not only because the benefits of voltage scaling will reduce with decreasing voltage, but also because after some point, making a core simpler will only be detrimental and may actually decrease power-efficiency. What next? How do we further improve power-efficiency?
Beyond the hill of multi-cores lies the valley of accelerators. Accelerators, including hardware accelerators (e.g., Intel SSE), software accelerators (e.g., VLIW accelerators), reconfigurable accelerators (e.g., FPGAs), and programmable accelerators (e.g., CGRAs), are some of the foreseeable solutions that can further improve power-efficiency of computation. Among these, we find CGRAs, or Coarse Grain Reconfigurable Arrays, a very promising technology. They are slightly reconfigurable (and therefore close to hardware), but are programmable (therefore usable as more general-purpose accelerators). As a result, they can provide power-efficiencies of up to 100 GOps/W, while being relatively general purpose. Although CGRAs are very promising, several challenges remain in compiling for them, especially because they have very little dynamism in the architecture, and almost everything (including control) is statically determined. In this talk, I will discuss our recent research in developing compiler technology to enable CGRAs as general-purpose accelerators.
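The voltage-scaling argument in this abstract can be made concrete with a small back-of-the-envelope calculation. The sketch below is an idealised textbook model, not something taken from the talk: it assumes dynamic power proportional to V² · f, clock frequency proportional to voltage, and no leakage or threshold-voltage effects (the very effects that make the benefit shrink in practice). The function name is hypothetical.

```python
def multicore_at_fixed_power(voltage_scale):
    """Idealised model (assumption): dynamic power ~ C * V^2 * f with f ~ V.
    All quantities are relative to one baseline core at full voltage."""
    per_core_power = voltage_scale ** 3   # V^2 * f
    per_core_speed = voltage_scale        # f
    # Number of slower cores that fit in the baseline power budget.
    n_cores = 1.0 / per_core_power
    # Aggregate throughput, assuming perfectly parallel work.
    throughput = n_cores * per_core_speed
    return n_cores, throughput

# Halving the voltage: 8 cores fit in the same power budget, each at half
# speed, for 4x the potential throughput of the single baseline core.
print(multicore_at_fixed_power(0.5))
```

Under these assumptions the potential throughput at fixed power grows as the inverse square of the voltage scale, which is the sense in which simpler, lower-voltage cores are more power-efficient even though each one is slower.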
Reduction of Linear Complementarity Problems to Bimatrix Games: Scientific Computing and Matrix Computations Seminar
Wednesday, October 31, 12:10–1:00 pm, 380 Soda Hall, UC Berkeley
Ilan Adler, UC Berkeley
In 1994, Christos Papadimitriou introduced the complexity class PPAD (Polynomial-time Parity Argument Directed) for problems whose solution is known to exist via a proof based on a certain directed acyclic graph. In particular, we will show in this talk that the majority of linear complementarity problems (LCP’s) which are processable by the Lemke algorithm can be shown to be in PPAD. The discovery that finding a Nash equilibrium for a bimatrix game (2-NASH) is PPAD-complete established the very surprising result that every LCP in PPAD can be reduced in polynomial time to a 2-NASH (which can, by itself, be formulated as an LCP). However, the ingeniously constructed reduction (which is designed for any PPAD problem) is very complicated and goes through several stages that involve reducing the given LCP to finding an approximate Brouwer fixed point of an appropriate function, followed by reducing the latter to 3-graphical NASH (using small polymatrix games to simulate the computation of certain simple arithmetic operations), and finally, reducing the 3-graphical NASH to 2-NASH. Thus, while of great theoretical significance, the reduction is not practical for actually solving an LCP via 2-NASH, and it does not provide a clear insight regarding the completeness of 2-NASH within the PPAD LCP’s. To address this concern, we develop a simple explicit reduction of Lemke processable LCP’s to 2-NASH problems. In particular, we show that the reduction is a bijection and discuss its implications for solving LCP’s via 2-NASH and for getting a deeper insight into these LCP’s.
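For readers unfamiliar with the term, an LCP asks, for a given matrix M and vector q, for a vector z ≥ 0 such that w = Mz + q ≥ 0 and zᵀw = 0. The Python sketch below checks these conditions and solves tiny instances by brute-force enumeration of which components of z are nonzero. This is purely illustrative: it is neither the Lemke algorithm nor the reduction to 2-NASH discussed in the talk, its cost grows exponentially with the problem size, and the function names are hypothetical.

```python
import itertools
import numpy as np

def is_lcp_solution(M, q, z, tol=1e-9):
    """Check z >= 0, w = M z + q >= 0 and complementarity z . w = 0."""
    w = M @ z + q
    return bool(np.all(z >= -tol) and np.all(w >= -tol) and abs(z @ w) <= tol)

def solve_lcp_by_enumeration(M, q, tol=1e-9):
    """Try every candidate support of z (exponential cost; tiny problems only).
    On the support, w must vanish, so z solves M[S, S] z[S] = -q[S]."""
    M = np.asarray(M, dtype=float)
    q = np.asarray(q, dtype=float)
    n = len(q)
    for r in range(n + 1):
        for support in itertools.combinations(range(n), r):
            idx = list(support)
            z = np.zeros(n)
            if idx:
                try:
                    z[idx] = np.linalg.solve(M[np.ix_(idx, idx)], -q[idx])
                except np.linalg.LinAlgError:
                    continue  # singular sub-matrix, skip this support
            if is_lcp_solution(M, q, z, tol):
                return z
    return None  # no solution found for any support
```

For example, with M = [[2, 1], [1, 2]] and q = [-5, -6], the only complementary pair has both components of z positive, and the enumeration returns z = (4/3, 7/3) with w = 0.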
Link of the Week: Women: Why Doesn’t IT Have iPad Appeal?
Over the last decade, a new generation of women have become more ambitious and career-minded than ever before. In fact, careers are now so fundamental to many young women that a Marie Claire survey published this month showed that three quarters of twenty- and thirty-something respondents cite work as either “very important” or the “single most important thing” in their lives. On top of this, Citi reported that 36% of women it polled recently didn't factor marriage into their definition of “having it all,” and 27% didn't include children. But with so many women prioritizing work, why on earth aren't more pursuing careers in IT? Kathryn Cave, editor at IDG Connect, investigates IT's poor image. Read more.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 6,000 scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are DOE Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has seen its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.