InTheLoop | 11.05.2012
November 5, 2012
Visualizing Gulf of Mexico Oil Dispersion
In April 2010, an explosion at the Deepwater Horizon offshore oil rig caused about 53,000 barrels of crude oil to spew into the Gulf of Mexico every day for nearly three months. At the time, several computer models predicted that ocean currents would eventually carry this oil thousands of miles across the Atlantic seaboard—a scenario that could devastate economies and ecosystems across the Eastern United States.
But when these forecasts didn’t pan out, scientists realized that their computer models were missing some critical information. Using visualization software developed for the fusion energy research community by computer scientists in Berkeley Lab’s Computational Research Division, oceanographers found that they needed to factor in the interactions between deep, middle, and surface ocean currents to successfully track oil dispersion in the Gulf of Mexico. Read more.
DOE’s Investment Ensures AmeriFlux Data for All
Twenty years ago, researchers began installing sensors in a variety of ecosystems to study how carbon dioxide, energy, and water vapor cycle through the environment. Today, these sensors have been deployed at 120 locations across the Americas. Because the Department of Energy recognizes that these datasets could benefit a variety of scientific communities, it is funding an effort to make this data accessible to a wide range of researchers. Read more.
ESnet’s Greg Bell to Give Invited Talk at Canadian Networking Conference
Greg Bell, head of ESnet and director of Berkeley Lab’s Scientific Networking Division, has been invited to present “Network as Instrument: The View from Berkeley” at the 2012 CANARIE Users’ Forum on Wednesday, Nov. 7, in Quebec. Bell was invited to give his presentation in Canada after sharing his ideas in a keynote address at the 2012 NORDUnet conference in Oslo, Norway, in September. In the audience was Jim Roche, president and chief executive officer of CANARIE, the Canadian research network.
In his invitation to Bell, Roche wrote, “Your message about how ESnet is an instrument of science and discovery, rather than simply infrastructure, is compelling. I shared your views with my colleagues at CANARIE when I got back to Ottawa. Your presentation would light the spark for discussion on the changing role of CANARIE in Canada’s R&E ecosystem.”
ASCR Discovery: To Rid Water of Salt, MIT Group Taps Thin Carbon and Computing
Guided by advanced molecular modeling on NERSC supercomputers, Massachusetts Institute of Technology scientists are investigating ways to turn atom-thick carbon layers into membranes for a new and improved desalination method in places with inadequate fresh water, according to an article in ASCR Discovery.
“Without any actual experimental demonstration, what our calculations tell us is that the performance of the graphene membrane for water desalination would be very high,” says Jeffrey Grossman, a materials scientist who is MIT’s Carl Richard Soderberg associate professor of power engineering and leader of the investigation. Read more.
David Skinner Co-Edits Special Issue of Computing in Science and Engineering
David Skinner, leader of NERSC’s Outreach, Software, and Programming Group, and NERSC user Massimo Di Pierro of DePaul University are guest editors of the November–December 2012 issue of Computing in Science and Engineering on the topic “Concurrency in Modern Programming Languages.” You can read the guest editors’ introduction here.
This Week’s Computing Sciences Seminars
LAMA for Efficient AMG on Hybrid Clusters
Tuesday, Nov. 6, 2:00–3:00 pm, 50F-1647
Thomas Soddemann, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Germany
LAMA is an open source library which can be used to compose iterative solvers and allows leveraging the performance potential of heterogeneous compute resources. The algebraic multigrid (AMG) algorithm is a hierarchical method for solving problems based on elliptic differential equations.
During the talk I will show how we implemented the solution phase and our setup phase for optimizing use of our heterogeneous hardware setup.
On the Convergence Rate of Symmetric Subspace Iteration: Scientific Computing and Matrix Computations Seminar
Wednesday, Nov. 7, 12:10–1:00 pm, 380 Soda Hall, UC Berkeley
Chris Melgaard, UC Berkeley
Subspace iteration is a classical method for approximating the largest eigenvalues of a matrix in magnitude and the associated invariant subspace. In light of the recent success of randomized algorithms for computing low-rank approximations, we propose a new method for analyzing symmetric subspace iteration with a random start matrix (or start subspace). We will provide deterministic and probabilistic error bounds for the convergence of eigenvalues and low-rank matrix approximations. In the case of quickly decaying eigenvalues, the subspace iteration method efficiently computes a low-rank approximation to the original matrix as shown by our error bounds. We will also discuss how oversampling dramatically increases the quality of our random start matrix.
This is joint work with Ming Gu.
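As a rough illustration of the method named in this abstract, the sketch below runs subspace iteration on a small symmetric matrix from a random Gaussian start block. It is a minimal sketch, not the speakers' algorithm or analysis: the function names, the `oversample` parameter, and the test matrix are all assumptions made for this example. For brevity it reads eigenvalue estimates off the diagonal of Q^T A Q and omits the Rayleigh-Ritz step a practical implementation would normally apply, so here `oversample` only widens the start block.

```python
import random

def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def orthonormalize(M):
    """Orthonormalize the columns of M with modified Gram-Schmidt.

    Assumes the columns are linearly independent (no zero-norm check).
    """
    basis = []
    for col in transpose(M):
        v = list(col)
        for q in basis:
            proj = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - proj * qi for qi, vi in zip(q, v)]
        norm = sum(vi * vi for vi in v) ** 0.5
        basis.append([vi / norm for vi in v])
    return transpose(basis)

def subspace_iteration(A, k, oversample=1, iters=200, seed=0):
    """Estimate the k largest-magnitude eigenvalues of a symmetric matrix A.

    The start block has k + oversample random Gaussian columns (capped at n).
    """
    rng = random.Random(seed)
    n = len(A)
    block = min(n, k + oversample)
    start = [[rng.gauss(0.0, 1.0) for _ in range(block)] for _ in range(n)]
    Q = orthonormalize(start)
    for _ in range(iters):
        # One step: multiply by A, then re-orthonormalize the block.
        Q = orthonormalize(matmul(A, Q))
    # The diagonal of Q^T A Q holds the eigenvalue estimates.
    T = matmul(transpose(Q), matmul(A, Q))
    return [T[i][i] for i in range(k)]

# Hypothetical test matrix with eigenvalues 2 + sqrt(2), 2, and 2 - sqrt(2).
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
estimates = subspace_iteration(A, k=2)
```

For this matrix the two estimates should approach 2 + sqrt(2) and 2, with each column converging at a rate set by the ratio of adjacent eigenvalue magnitudes.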
Learning Patterns in Big Data from Small Data using Core-Sets
Thursday, Nov. 8, 12:00–1:00 pm, 254 Sutardja Dai Hall, UC Berkeley
Dan Feldman, Massachusetts Institute of Technology
When we need to solve an optimization problem we usually use the best available algorithm/software or try to improve it. In recent years we have started exploring a different approach: instead of improving the algorithm, reduce the input data and run the existing algorithm on the reduced data to obtain the desired output much faster on a streaming input, using a manageable amount of memory, and in parallel (say, using Hadoop, cloud service, or GPUs).
A core-set for a given problem is a semantic compression of its input, in the sense that a solution for the problem with the (small) core-set as input yields an approximate solution to the problem with the original (Big) data. In this talk I will describe the core-set approach and recent algorithmic achievements for computing core-sets with performance guarantees. I will also describe applications of this magical new paradigm in Machine Learning, Robotics, Computer Vision, and Privacy. Finally, I will describe in detail iDiary: a system that turns large sensor signals collected from smart-phones into textual descriptions of the trajectories. The system features a user interface similar to Google Search that allows users to type text queries on their activities (e.g., “Where did I buy books?”) and receive textual answers based on their GPS signals.
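To make the idea of a small summary standing in for the full input concrete, here is a toy sketch for the simplest case, the one-dimensional 1-mean problem. It is not the speaker's construction: the function names and the data are hypothetical, and this particular summary happens to be exact, whereas core-sets in general only approximate the cost. It relies on the identity that the sum of squared distances from the points to a center c equals n(mean - c)^2 plus the points' squared spread around their own mean.

```python
def summarize(points):
    """Compress 1-D points into (count, mean, spread).

    spread is the sum of squared deviations from the mean.
    """
    n = len(points)
    mean = sum(points) / n
    spread = sum((p - mean) ** 2 for p in points)
    return n, mean, spread

def merge(a, b):
    """Combine two summaries without revisiting the original points."""
    na, ma, sa = a
    nb, mb, sb = b
    n = na + nb
    mean = (na * ma + nb * mb) / n
    spread = sa + sb + na * nb / n * (ma - mb) ** 2
    return n, mean, spread

def cost_from_summary(summary, center):
    """Sum of squared distances to center, computed from the summary alone."""
    n, mean, spread = summary
    return n * (mean - center) ** 2 + spread

def cost_exact(points, center):
    """Sum of squared distances to center, computed from all the points."""
    return sum((p - center) ** 2 for p in points)

# Hypothetical data arriving in two chunks, as in a streaming or parallel setting.
chunk_a = [1.0, 2.0, 4.0]
chunk_b = [7.0, 11.0]
summary = merge(summarize(chunk_a), summarize(chunk_b))
```

Because summaries merge, each chunk can be compressed independently and the results combined, which is the property that lets this style of compression run on streams or in parallel.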
Dissertation Talk: Streaming Graph Partitioning for Large Distributed Graphs
Thursday, Nov. 8, 4:00–5:00 pm, 380 Soda Hall, UC Berkeley
Isabelle Stanton, UC Berkeley
Graph partitioning is a classic problem in Computer Science with an array of applications, from community detection to image segmentation to reducing communication on cluster systems. As the size of graphs has grown, our interest in these applications has too, yet partitioning remains a difficult problem.
In this talk, I will discuss algorithms for a streaming model for finding balanced partitions in a graph with only one pass. This matches partitioning a graph as it is being streamed onto a cluster, either from disk or from a web crawler. I will discuss some lower bounds on this problem, then experimental results evaluating the quality of cuts produced by a variety of simple heuristics, and practical results showing how the best can improve computation time for distributed implementations of PageRank. I will finish with a theoretical analysis of two of the heuristics that clearly demonstrates the difference in their performance.
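The one-pass setting described in this abstract can be sketched with a simple greedy rule. This is an illustrative assumption, since the abstract does not say which heuristics the talk evaluates: each arriving vertex goes to the partition that already holds the most of its neighbors, with that count scaled down as the partition fills up. The function names, the `capacity` parameter, and the example graph are hypothetical.

```python
def stream_partition(stream, num_parts, capacity):
    """Assign each vertex to a partition in a single pass over the stream.

    stream yields (vertex, neighbors) pairs; capacity caps each partition's size.
    """
    assignment = {}
    sizes = [0] * num_parts
    for vertex, neighbors in stream:
        best_part, best_key = None, None
        for part in range(num_parts):
            if sizes[part] >= capacity:
                continue  # a full partition cannot accept more vertices
            # Count neighbors already placed in this partition.
            placed = sum(1 for u in neighbors if assignment.get(u) == part)
            # Penalize fuller partitions; break ties toward the smaller one.
            key = (placed * (1.0 - sizes[part] / capacity), -sizes[part])
            if best_key is None or key > best_key:
                best_part, best_key = part, key
        if best_part is None:
            raise ValueError("all partitions are full")
        assignment[vertex] = best_part
        sizes[best_part] += 1
    return assignment

def cut_size(edges, assignment):
    """Count edges whose endpoints land in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

# Hypothetical graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = {v: [] for v in range(6)}
for u, v in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)

stream = [(v, adjacency[v]) for v in range(6)]
assignment = stream_partition(stream, num_parts=2, capacity=3)
```

The rule only ever consults neighbors that have already been placed, which is what lets it run while the graph is still being streamed in. The quality of the result depends on the order in which vertices arrive.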
Link of the Week: Parallella: A Supercomputer for Everyone
To make parallel computing ubiquitous, developers need access to a platform that is affordable, open, and easy to use. The goal of the Parallella project, which was recently funded on Kickstarter, is to provide such a platform. The Parallella platform will be built on three principles: open access, open source, and affordability (a 16-core system will be available for $99).
The Parallella platform is based on the Epiphany multicore chips developed by Adapteva over the last four years and field tested since May 2011, demonstrating 50 Gflops/Watt. The Epiphany chip consists of a scalable array of simple RISC processors programmable in C/C++, connected together with a fast on-chip network within a single shared memory architecture. Read more.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 6,000 scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are DOE Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has had its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.