
InTheLoop | 11.24.2014

November 24, 2014

Berkeley Algorithms Help Researchers Understand Dark Energy

To unlock the mystery of dark energy and its influence on the universe, researchers must rely on indirect observations such as measuring how fast cosmic objects, specifically Type Ia supernovae, recede from us as the universe expands. The process of identifying and tracking these objects requires scientists to scrupulously monitor the night sky for slight changes, a task that would be extremely tedious and time-consuming without some novel computational tools developed at the Department of Energy’s NERSC (National Energy Research Scientific Computing Center) and by researchers at Lawrence Berkeley National Laboratory and the University of California, Berkeley. »Read more.

CRD’s Khaled Ibrahim Wins SC14 HPC Challenge with Best FFT Performance

Khaled Ibrahim of CRD’s Future Technologies Group won the HPC Challenge for the fastest performance of a Fast Fourier Transform (FFT) application at the SC14 conference in New Orleans. Ibrahim tuned his application to achieve 226 teraflop/s running on the Mira IBM Blue Gene/Q supercomputer at Argonne National Laboratory. His result was 9.7 percent faster than the runner-up, which ran on Japan’s K computer.

The 2014 HPC Challenge Awards Birds-of-a-Feather session was the 10th edition of an award ceremony that seeks high-performance results in broad categories taken from the HPC Challenge benchmark, as well as elegance and efficiency of parallel programming and execution environments. According to Future Technologies Group Lead Erich Strohmaier, Ibrahim’s FFT performance was the biggest surprise due to its winning margin.

This was Ibrahim's first time entering the challenge. He used a lightweight runtime and algorithmic changes to boost the performance. Ibrahim was also the runner-up for best performance on the High Performance LINPACK benchmark and on the STREAM benchmark, which tests memory bandwidth. In addition, he was second runner-up for GUPS (giga-updates per second), a measurement of the speed of updating randomly generated locations in memory.

ESnet Powers NRL’s 100 Gbps Remote I/O Demo at SC14

The Naval Research Laboratory (NRL), in collaboration with the Department of Energy's Energy Sciences Network (ESnet), the International Center for Advanced Internet Research (iCAIR) at Northwestern University, the Center for Data Intensive Science (CDIS) at the University of Chicago, the Open Cloud Consortium (OCC) and significant industry support, last week conducted a 100 gigabits per second (100 Gbps) remote I/O demonstration at the SC14 supercomputing conference in New Orleans.

The remote I/O demonstration illustrated a pipelined distributed processing framework and software-defined networking (SDN) between distant operating locations. The demonstration showed the capability to dynamically deploy a production-quality 4K Ultra-High Definition Television (UHDTV) video workflow across a nationally distributed set of storage and computing resources, a capability relevant to emerging Department of Defense data processing challenges. »Read more.

NERSC wins HPCWire Editors' Choice Award

NERSC was named one of two recipients of the HPCWire "Editors' Choice Award for Best HPC Collaboration Between Government & Industry." Announced at the SuperComputing 2014 conference held in New Orleans last week, the award recognized the center's partnership with Cray and Intel to deploy a manycore supercomputer in 2016. The Cray system will be called "Cori" in honor of Gerty Cori, the first female American scientist to win a Nobel Prize. Lawrence Livermore National Laboratory was the other recipient in this category. »Read more.

Persistence, New Immigration Policy Help Jose Sierra Procure Job in Computing

After an introduction to the Lab’s high-performance computing research through a week-long program for high school students, Jose Sierra hoped to work at the Lab to gain real-world experience. But as an undocumented émigré from Guatemala, he had no work permit. With persistence and help from the Deferred Action for Childhood Arrivals (DACA) policy, Jose changed all that. He’s now a student assistant in CRD. »Read more.

Encouraging Girls of Color to Code

As the Berkeley Lab instructor volunteer with Black Girls Code, Dani Ushizima of CRD helped give girls of color some practical, hands-on experience with coding during the "Build a Web Page in a Day" event held in Oakland earlier this month. Black Girls Code works to empower young women of color to embrace the current tech marketplace as builders and creators in hopes of encouraging more to enter the fields of science, technology, engineering and math (STEM). The next San Francisco Bay Area event will be a Robotics Expo held in Berkeley on Saturday, December 13. »Learn more.

Nature News: Joint Effort Nabs Next Wave of US Supercomputers

Sudip Dosanjh, director of NERSC, and Katie Antypas, head of NERSC's Services Department, were quoted in a November 14 Nature News article that explored how Oak Ridge and Lawrence Livermore National Laboratories are teaming up to specify and purchase next generation supercomputers. The piece was reposted on November 17 by Scientific American. »Read more.

This Week's CS Seminars

»CS Seminars Calendar

Parallel In-Situ Data Processing Techniques

Monday, Nov. 24, 9:30–10:30 a.m., Bldg. 50A, Room 5132
Florin Rusu, University of California, Merced
School of Engineering

Traditional databases incur a significant data-to-query delay because data must be loaded into the system before it can be queried. Since this delay is not acceptable in many domains that generate massive amounts of raw data, such as astronomy and genomics, databases are often abandoned entirely. External tables, on the other hand, provide instant SQL querying over raw files. Their performance across a query workload is limited, though, by the speed of repeated full scans, tokenizing, and parsing of the entire file.

In this talk, we present SCANRAW, a novel database meta-operator for in-situ processing over raw files that integrates data loading and external tables seamlessly, while preserving their advantages: optimal performance across a query workload and zero time-to-query. We decompose loading and external table processing into atomic stages in order to identify common functionality. We analyze alternative implementations and discuss possible optimizations for each stage. Our major contribution is a parallel super-scalar pipeline implementation that allows SCANRAW to take advantage of the current many- and multi-core processors by overlapping the execution of independent stages. Moreover, SCANRAW overlaps query processing with loading by speculatively using the additional I/O bandwidth arising during the conversion process for storing data into the database, such that subsequent queries execute faster. As a result, SCANRAW makes optimal use of the available system resources—CPU cycles and I/O bandwidth—by switching dynamically between tasks to ensure that optimal performance is achieved.

We implement SCANRAW in GLADE, a state-of-the-art parallel data processing system, and evaluate its performance across a variety of synthetic and real-world datasets. Our results show that SCANRAW with speculative loading achieves optimal performance for a query sequence at any point in the processing. Moreover, SCANRAW maximizes resource utilization for the entire workload execution, while speculatively loading data and without interfering with normal query processing.
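The idea of decomposing raw-file processing into stages that run concurrently can be sketched as follows. This is a minimal illustration only, not SCANRAW or GLADE code: the stage functions, queue sizes, and the comma-separated integer format are all assumptions made for the example, and Python threads show the pipeline structure rather than real multi-core speedup.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream


def stage(func, inbox, outbox):
    """Apply func to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            return
        outbox.put(func(item))


def tokenize(chunk):
    # Hypothetical format: each raw line holds comma-separated fields.
    return [line.split(",") for line in chunk]


def parse(rows):
    # Convert the string fields into typed (here: integer) values.
    return [[int(field) for field in row] for row in rows]


def run_pipeline(chunks):
    """Stream chunks through tokenize and parse, each in its own thread."""
    raw_q = queue.Queue(maxsize=4)   # bounded: the reader cannot run far ahead
    tok_q = queue.Queue(maxsize=4)
    out_q = queue.Queue()            # unbounded: results are drained at the end
    workers = [
        threading.Thread(target=stage, args=(tokenize, raw_q, tok_q)),
        threading.Thread(target=stage, args=(parse, tok_q, out_q)),
    ]
    for worker in workers:
        worker.start()
    for chunk in chunks:             # the "read" stage runs in the caller
        raw_q.put(chunk)
    raw_q.put(SENTINEL)
    for worker in workers:
        worker.join()
    results = []
    while True:
        item = out_q.get()
        if item is SENTINEL:
            break
        results.append(item)
    return results
```

While one chunk is being parsed, the next can already be tokenized and a third read, which is the kind of overlap between independent stages that the abstract describes.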

A Roofline Model of Energy

Monday, Nov. 24, 2014, 10–11 a.m., Bldg. 50F, Room 1647
Jee Choi, Georgia Institute of Technology

Given a computation, how much time, energy, and power does it require? We describe an energy-based analogue of the time-based roofline model aimed at answering this question. We create this model with the intent not of making exact predictions, but rather, developing high-level analytic insights into the possible relationship among the time, energy, and power costs of an algorithm. Through a series of carefully controlled microbenchmarking studies and measurements on a variety of real-world systems that span a range of compute and power characteristics, we further refine and validate our model. We explore what our model implies about algorithmic time-energy tradeoffs, power-constrained computing, and abstract architectural "bake-offs," among others.

First, we derive our model and show how we can visualize it using the roofline. We discuss some of the implications of our model for work-communication tradeoff and its impact on performance and energy efficiency. Theoretically, we show that under certain conditions, we can improve both performance and energy efficiency through changes to the algorithm’s intensity. We then describe the process of designing highly tuned microbenchmarks for the purpose of isolating and deriving the energy costs of specific instructions. We use this experimental data to validate and refine our model in order to improve its utility and include the notion of a power cap. Finally, we conduct iso-power comparisons of hypothetical systems for an abstract architectural bake-off.
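As a rough illustration of how intensity can affect both time and energy, the sketch below pairs the standard time roofline (running time is the larger of compute time and memory-traffic time) with a simple additive energy model. The energy formula, the `Machine` fields, and all numeric values are assumptions made for this example; the abstract does not state the speaker's actual model.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    # All parameter names and values are hypothetical, chosen for illustration.
    time_per_flop: float     # seconds per arithmetic operation
    time_per_byte: float     # seconds per byte of memory traffic
    energy_per_flop: float   # joules per arithmetic operation
    energy_per_byte: float   # joules per byte of memory traffic
    constant_power: float    # watts drawn regardless of activity


def costs(flops, bytes_moved, m):
    """Return (time, energy, average power) for a computation on machine m."""
    # Time roofline: compute and memory traffic are assumed to overlap
    # perfectly, so the slower of the two determines the running time.
    time = max(flops * m.time_per_flop, bytes_moved * m.time_per_byte)
    # Assumed energy model: energy adds up rather than overlaps, and a
    # constant power draw is paid for the whole running time.
    energy = (flops * m.energy_per_flop
              + bytes_moved * m.energy_per_byte
              + m.constant_power * time)
    return time, energy, energy / time


toy = Machine(time_per_flop=1.0, time_per_byte=2.0,
              energy_per_flop=1.0, energy_per_byte=5.0, constant_power=0.5)

# Same work, half the memory traffic (double the intensity in flops per byte).
low_intensity = costs(1000, 4000, toy)
high_intensity = costs(1000, 2000, toy)
```

In this toy memory-bound case, halving the memory traffic lowers both time and energy, while average power rises slightly because the same constant draw and compute energy are spread over a shorter run.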

Matrix Computations and Scientific Computing Seminar: An Efficient Algorithm for Computing a Generalized Weaver Partition

Wednesday, Nov. 26, 12:10–1 p.m., 380 Soda Hall, UC Berkeley Campus
Ming Gu, UC Berkeley

In their seminal work in 2013, Marcus, Spielman and Srivastava showed the existence of the generalized weaver partition (GWP). Their work immediately implies that the Kadison-Singer conjecture is true, but leaves the question of computing the GWP unanswered. In this talk, we discuss a close connection between GWP and fundamental problems in numerical linear algebra, and present an efficient algorithm for computing the GWP. We show the correctness of our algorithm under mild conditions and present numerical experimental results that support our claims.

Neyman Seminar: Statistical aspects of population demographic inference from genomic variation data

Wednesday, Nov. 26, 1–2 p.m., 1011 Evans Hall, UC Berkeley Campus
Anand Bhaskar, Stanford University

Genome sequences of present-day individuals are amazingly accurate records of the demographic events that have shaped the history of modern human populations, such as the migration of humans out of Africa, the peopling of different parts of the world, and the explosive population growth in the last few hundred generations of human civilization. Such understanding of population demography, besides being of historical interest, has wide-ranging applications from medical genetics to forensic science.

Since its development in the early 1980s, coalescent theory has emerged as a powerful tool for modeling the evolution of genomic sequences drawn from a population, and much work has been done on using the coalescent to estimate changes in the effective size of a population from genomic variation data. I will describe some of these approaches and focus attention on one of the most widely used summary statistics of genomic variation data: the sample frequency spectrum (SFS). The SFS of a sample of sequences counts the number of segregating sites in a sample as a function of the mutant allele frequency, and provides a highly efficient dimensional reduction of a large number of sequences. While the expected SFS of a random sample depends strongly on the underlying population demography, it has been shown that very different population size functions can generate the same expected SFS for arbitrarily large sample sizes, a non-identifiability result that, in principle, poses a thorny challenge to statistical inference. However, these counterexamples are biologically quite unrealistic. We reexamine this problem and show that, under biologically reasonable assumptions, the expected SFS of a sufficiently large sample of sequences uniquely identifies the population demography. We also derive explicit bounds for the sample sizes that are sufficient for identifiability for model families that are commonly used in practice, such as piecewise-constant and piecewise-exponential population size functions. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform.
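The SFS definition above can be made concrete with a short sketch. It assumes, for illustration only, that each sequence is encoded as a list of 0/1 values per site (0 for the ancestral allele, 1 for the mutant allele); the function name and the toy data are hypothetical.

```python
def sample_frequency_spectrum(sequences):
    """Count segregating sites by the number of sequences carrying the mutant.

    For n sequences, returns a list sfs of length n - 1 in which sfs[k - 1]
    is the number of sites where exactly k sequences have the mutant allele.
    """
    n = len(sequences)
    sfs = [0] * (n - 1)
    for site in zip(*sequences):      # iterate over sites (columns)
        k = sum(site)                 # mutant allele count at this site
        if 0 < k < n:                 # skip sites that do not segregate
            sfs[k - 1] += 1
    return sfs


sample = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1],
]
spectrum = sample_frequency_spectrum(sample)
```

Here five sites across four sequences reduce to three counts, which is the dimensional reduction the abstract refers to: the spectrum keeps only how many sites have each mutant count, not which sequences carry them.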

Link of the Week: Airlines Know that Dirty Birds Don't Impress

Even with all the airplanes SC14 travelers were on last week (and all those planes Thanksgiving travelers will be on this week), most of us give little thought to when (or how) these birds are bathed. Airlines do. A recent article at Travelskills.com surveys how airlines get these big birds clean, including a video of a "special bath" given an Air France A380 luxury liner on its first San Francisco touchdown. »Read more.