InTheLoop | 10.01.2012
October 1, 2012
Berkeley Lab to Lead, Support 14 SciDAC Projects
When the Department of Energy announced the series of projects under the latest Scientific Discovery through Advanced Computing (SciDAC) program, Berkeley Lab staff were listed as key contributors in three institutes and 11 science application partnerships. Funding for the projects is expected to total about $6 million annually over the next three to five years. Read more.
Berkeley Lab Scientists to Help Develop Software for Exascale Supercomputers
With petascale supercomputers (systems capable of performing quadrillions of operations per second) becoming the norm in high-end scientific computing, exascale systems are now on the horizon. Expected to become available by the end of this decade, exascale supercomputers will be 1,000 times faster than today’s petascale machines. To address the challenge of developing a software stack for exascale systems, the U.S. Department of Energy (DOE) is funding a number of research efforts under the X-Stack program. Computer scientists in the Computing Sciences organization at Lawrence Berkeley National Laboratory will contribute their expertise to three X-Stack projects. Read more.
End-to-End Network Tuning Sends Data Screaming from NERSC to NOAA
As an associate scientist in the Earth System Research Lab of the National Oceanic and Atmospheric Administration (NOAA), Gary Bates has transferred hundreds of thousands of files to and from NERSC, as part of a "reforecasting" weather forecasting project. Earlier this year, Bates successfully transferred 170 terabytes of data from NERSC back to NOAA Boulder at a whopping rate of 395 megabytes per second, with help from NERSC and ESnet staff. Read more.
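As a rough check on the scale of that transfer, the figures above work out to about five days of sustained throughput. The short Python sketch below assumes decimal units (1 terabyte = 10^12 bytes, 1 megabyte = 10^6 bytes) and a constant rate for the whole transfer; the article states neither, and the function name is purely illustrative.

```python
def transfer_time_seconds(total_bytes: float, rate_bytes_per_s: float) -> float:
    """Return the time needed to move total_bytes at a constant rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return total_bytes / rate_bytes_per_s


# Assumption: decimal (SI) units, not binary (TiB/MiB).
TB = 10**12
MB = 10**6

seconds = transfer_time_seconds(170 * TB, 395 * MB)
days = seconds / 86_400  # 86,400 seconds per day

print(f"{seconds:,.0f} s is about {days:.1f} days")
```

If the figures were in binary units instead, the ratio would change slightly, but the result would still be on the order of five days.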
ESnet to Hold ASCR Requirements Review This Week
ESnet is conducting an ASCR Requirements Review Thursday and Friday, October 4–5, in the DOE offices in Germantown, MD. The goal of this review is to accurately characterize the current and future networking requirements of science conducted using the ASCR supercomputer centers — ALCF, NERSC, and OLCF. In particular, we are interested in network-intensive science programs that will need to be supported by the ASCR supercomputer centers regardless of whether particular users receive allocations in a given year, or programs that have significant investments in computational resources that are likely to be continued for some time.
These requirements will serve as input to the ESnet architecture and planning processes, and will help ensure that ESnet continues to provide world-class support for scientific discovery for DOE scientists and their collaborators. The tangible outcome of the review will be a document that includes both the network requirements and a supporting narrative.
Greg Bell, Eli Dart, Lauren Rotman, and Brian Tierney will be representing ESnet. David Skinner will represent NERSC.
Workshop on Exascale Operating Systems and Runtime Software This Week
A Workshop on Exascale Operating Systems and Runtime Software is being held October 4–5 in Washington, DC. The goal of this workshop is to explore fundamental computer science research directions in operating systems and runtime (OS/R) software, engaging the research community on the exascale OS and runtime software roadmap.
The workshop will review key challenges that have been previously identified, then focus on understanding potentially disruptive solutions and research directions that could overcome those challenges. Participants will identify and discuss revolutionary approaches for exascale OS and runtime systems, articulating research priorities. They will also discuss an integration plan that involves the research community and vendor developers. The workshop discussions will inform a DOE roadmap for the research and development of exascale platform OS and runtime software, including prioritized areas of investment, timelines, and scale of investment.
John Shalf, head of the Computer and Data Sciences Department in CRD, will moderate two sessions, “Global OS and Node OS Architectures” and “Memory.” Steven Hofmeyr of CRD’s Future Technologies Group will moderate the “Fine Grain” session. NERSC staff also provided input to the meeting.
CRD’s Erich Strohmaier Writes Article on How the TOP500 List Gives Insight into HPC Industry
Erich Strohmaier, leader of the Future Technologies Group in the Computational Research Division, has written a cover story for Scientific Computing magazine on how the twice-yearly TOP500 list of the world’s top supercomputers provides a good look at the state of HPC technology. As one of the founding editors of the list, Strohmaier has compiled 39 such lists since 1993. While most of the HPC community focuses on the No. 1 spot and the top 10 systems, digging deeper reveals a more complete picture of what’s happening in high performance computing. In the latest list from June 2012, the two lists illustrate how technology is refreshed, a process that is not always uniform. The primary reason for the different refresh rates is that the TOP500 List encompasses two HPC populations.
Could Supercomputing Turn to Signal Processors (Again)?
An article in IEEE Spectrum reports that building high-performance computers used to be all about maximizing flops, or floating-point operations per second. But the engineers designing today’s high-performance systems are keeping a close eye not just on the number of flops but also on flops per watt.
Judged by that energy-efficiency metric, some digital-signal processing (DSP) chips—the sophisticated signal conditioners that run our wireless networks, among other things—might make promising building blocks for future supercomputers, recent research suggests. But John Shalf of CRD cautions that double-precision calculations and the subsystems required for a supercomputer would significantly lower the machine’s overall energy efficiency. Read more.
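The flops-per-watt metric, and the reason supporting subsystems pull it down, can be illustrated with a small Python sketch. Every number below is hypothetical and chosen only to make the arithmetic easy to follow; none comes from the IEEE Spectrum article or from any real chip, and the function name is illustrative.

```python
def flops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency: floating-point operations per second per watt."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return flops / watts


# Hypothetical figures, for illustration only.
chip_flops = 1.0e10       # sustained flops delivered by one processor
chip_watts = 10.0         # power drawn by the processor alone
subsystem_watts = 15.0    # memory, interconnect, cooling, and so on

# Efficiency of the bare chip versus the chip plus everything it needs.
chip_efficiency = flops_per_watt(chip_flops, chip_watts)
system_efficiency = flops_per_watt(chip_flops, chip_watts + subsystem_watts)

print(f"chip only:   {chip_efficiency:.2e} flops/W")
print(f"full system: {system_efficiency:.2e} flops/W")
```

Because the supporting hardware adds power without adding flops, the system-level figure is always lower than the chip-level one, which is the effect the caution above describes.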
Free Public Online Class: An Introduction to Computer Networks
Stanford University is offering a free online class, “Introduction to Computer Networks.” This is an introductory course on computer networking, specifically the Internet, targeted at anyone with a basic understanding of computer science and a familiarity with basic probability. It focuses on explaining how the Internet works, ranging from how bits are modulated on wires and in wireless to application-level protocols like BitTorrent and HTTP. It also explains the principles of how to design networks and network protocols. Students gain experience reading and understanding RFCs (Internet protocol specifications) as statements of what a system should do. The course grounds many of the concepts in current practice and recent developments, such as net neutrality and DNS security.
The course will be available starting October 8 and will run for 10 weeks. Registration is free for anyone who wants to watch the series of about thirty 15–20 minute videos, including short embedded quizzes to help students learn the material. There will be a couple of problem sets, but no programming assignments. All students who complete the course will receive a letter of completion but no official credit from Stanford. Go here to sign up.
This Week’s Computing Sciences Seminars
Monday, October 1, 4:00–5:30 pm, Banatao Auditorium, Sutardja Dai Hall, UC Berkeley
Social media has changed the landscape of American politics. Candidates are using more sophisticated social media strategies and voters are communicating more actively among themselves. By one measure, between April and August this year almost 600,000 videos mentioning Obama or Romney had been posted on YouTube, quadruple the number posted during the same period in the 2008 election. But does more information—and a Twitter-speed news cycle—contribute to more considered opinions or simply more noise? How does the model of crowd-sourced political dialogue shape campaign agendas and communication strategies? Do new technologies help us talk across party lines, or do they contribute to more polarization?
Join us for a discussion with distinguished experts in politics and social media. A reception and exhibit of election-related apps will follow the presentation. Panelists will include:
- David All, Founder, David All Group
- Daniel Kreiss, Assistant Professor of Journalism and Mass Communication, University of North Carolina, and author of Taking Our Country Back: The Crafting of Networked Politics from Howard Dean to Barack Obama
- Theo Yedinsky, President of Social Stream and Vice President of Sales for North Social
Sponsored by the Robert T. Matsui Center for Politics and Public Service. Co-sponsored by the Berkeley Center for New Media, Berkeley College Republicans, the Data and Democracy Initiative at the Center for Information Technology Research in the Interest of Society, the Center on Civility and Democratic Engagement (GSPP), and the UC Berkeley School of Information.
Maximum Likelihood for Matrices with Rank Constraints: Scientific Computing and Matrix Computations Seminar
Wednesday, October 3, 12:10–1:00 pm, 380 Soda Hall, UC Berkeley
Bernd Sturmfels, UC Berkeley
Maximum likelihood estimation is a fundamental computational task in statistics. We discuss this problem for manifolds of (positive) matrices of bounded rank. These represent mixtures of independent distributions of two discrete random variables. This non-convex optimization leads to some beautiful geometry, topology, and combinatorics. We employ tools from numerical algebraic geometry (the software Bertini) to find the global maximum of the likelihood function. This is joint work with Jon Hauenstein and Jose Rodriguez.
Bayesian Inference and Data Assimilation with Optimal Maps
Wednesday, October 3, 4:00–5:00 pm, 939 Evans Hall, UC Berkeley
Youssef Marzouk, MIT
We present a new approach to Bayesian inference that entirely avoids Markov chain simulation, by constructing a map that pushes forward the prior measure to the posterior measure. Existence and uniqueness of a suitable measure-preserving map are established by formulating the problem in the context of optimal transport theory. We discuss various means of explicitly parameterizing the map and computing it efficiently through the solution of a stochastic optimization problem. The resulting algorithm overcomes many of the computational bottlenecks associated with Markov chain Monte Carlo. Advantages include analytical expressions for posterior moments, automatic evaluation of the marginal likelihood, clear convergence criteria, and the ability to generate independent uniformly weighted posterior samples without additional model evaluations. We also discuss extensions of the map approach to hierarchical Bayesian models and to problems of sequential data assimilation, i.e., filtering and smoothing. Numerical demonstrations include parameter inference in ordinary and partial differential equations and in spatial statistical models, as well as state estimation in nonlinear dynamical systems.
LBNL Integrated Bioimaging Initiative Seminar: Science Gateways, Trends in Data-Centrism and the Web
Wednesday, October 3, 4:00–5:00 pm, 717 Potter St., room 141
David Skinner, NERSC
As data collected from simulations and experiments grows ever larger, science teams require reliable, high-performance means to analyze data at a distance. The notion of feeling secure about data only when we can hold our own entire copy in personal storage must give way to central shared data storage. This transition in scientific data management affords force-multiplying benefits in terms of collaboration. Web-based science gateways sited at large-scale computing and data facilities can provide a set of methods for remote analysis, data sub-selection, and data sharing for wide-ranging data-centric collaboration. In this talk we review the last two years of NERSC’s Science Gateways program and offer ideas for the future of computational bioimaging at LBNL.
EECS Colloquium: End-User Programming and Intelligent Tutoring Systems
Wednesday, October 3, 4:00–5:00 pm, 306 Soda Hall (HP Auditorium), UC Berkeley
Sumit Gulwani, Senior Researcher, RiSE Group, Microsoft Research
Millions of end users today have access to programmable environments such as spreadsheets and smartphones, but lack the programming expertise to write even small scripts. These users can effectively communicate their intent using examples and natural language. Our methodology involves designing a domain-specific language (DSL), developing a synthesis algorithm for learning programs in the DSL that match the user’s (often under-specified) intent, and using machine learning to rank these programs. In this talk, I will demonstrate this methodology for various domains including spreadsheet macros, database queries, smartphone scripts, and even drawing programs.
In the second half of the talk, I will present surprising applications of this synthesis methodology in the area of intelligent tutoring systems, including solution generation, problem generation, automated grading, and even structured content entry. I will demonstrate these applications for various domains including geometry, algebra, automata theory, and introductory programming. The underlying synthesizers leverage search techniques from various communities, including the use of SAT/SMT solvers (formal methods community), version space algebras (machine learning community), and A*-style goal-directed heuristics (AI community).
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 6,000 scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are DOE Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has seen its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.