InTheLoop | 05.09.2011
May 9, 2011
FedTech Magazine Looks at Magellan Testbed
The April 2011 issue of FedTech reports on the Magellan project in the article “Private Sky: DOE labs use a test bed to perfect cloud services for high-end research.” Author Dan Tynan describes Magellan as “an experiment in experimentation, if you will, that could ultimately change how science is performed across the globe.”
Because cloud computing environments can be provisioned quickly and accessed from virtually anywhere, this infrastructure could expand access to raw computing power to more scientists when they need it, without massive upfront investments in expensive hardware or queuing up for limited time on the big iron, says Katherine Yelick, division director of NERSC.
“We have more people who want time on our high-performance computers than can actually get it, so for the most part those facilities are reserved for large, high-end scientific problems,” Yelick says. “A centralized resource like the cloud is best suited for researchers with very spiky workloads — say, access to 500 computer nodes for 24 hours, once a month. That can be very expensive to do on your own….”
“To me, what’s important about the cloud is that it lets scientists have access to computing resources when they really need them, instead of parceling them out by a certain number of hours of use per year,” Yelick says. “Hopefully the expertise we gain from Magellan will influence what commercial providers, NERSC and others offer in terms of cloud services for their users.” Read more.
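To put Yelick's "spiky workload" example in numbers, here is a minimal sketch of the arithmetic. It assumes a 30-day (720-hour) month and a hypothetical dedicated cluster sized for the burst; the function name and parameters are illustrative and do not come from the article or from Magellan.

```python
def burst_profile(nodes, hours_per_burst, bursts_per_month, hours_in_month=720):
    """Return (node_hours_used, utilization) for a bursty monthly workload.

    Utilization is the fraction of the month that a dedicated cluster of
    `nodes` nodes would actually be busy. The 720-hour month is an assumption.
    """
    busy_hours = hours_per_burst * bursts_per_month
    node_hours_used = nodes * busy_hours
    utilization = busy_hours / hours_in_month
    return node_hours_used, utilization


# Yelick's example: 500 nodes for 24 hours, once a month.
node_hours, utilization = burst_profile(nodes=500, hours_per_burst=24, bursts_per_month=1)
print(f"{node_hours} node-hours per month, {utilization:.1%} utilization")
```

Under these assumptions the workload uses 12,000 node-hours a month but would keep a dedicated 500-node cluster busy only about 3.3% of the time, which is the sense in which owning that hardware "can be very expensive to do on your own."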
James Demmel Elected to National Academy of Sciences
James W. Demmel, a professor at the University of California, Berkeley, who has a joint appointment in Berkeley Lab’s Computational Research Division, is one of 72 new members elected to the National Academy of Sciences (NAS). Election to the NAS recognizes distinguished and continuing achievements in original research. The May 3 election at the NAS annual meeting brings the total number of active members to 2,113. Read more.
Registration Is Open for SciDAC Tutorials on July 15
The fifth SciDAC Tutorials Day will be held at the Brown Palace in Denver on July 15, the day following the main SciDAC 2011 Conference. Tutorials Day provides open and free tutorials on a wide range of subjects in scientific computing. Topics include:
- Visualization and data analysis techniques
- Optimization of parallel applications and meshes
- Hadoop for scientific applications
- Searching and indexing
Computing Sciences staff who will present tutorials include:
- John Wu: FastBit Searching and Indexing
- Iwona Sakrejda: Hadoop for Scientific Applications
- Hank Childs: Advanced Visualization and Data Analysis with the VisIt Visualization System
- Tony Drummond: DOE Advanced CompuTational Software (ACTS) Collection Tutorial
- Esmond Ng: Towards Optimal Petascale Simulations (TOPS)
Jason Hick to Address Oracle User Group Meeting
Jason Hick, NERSC’s Storage Systems Group Lead, will give a presentation on “Storage Supporting DOE Science” at the Oracle Preservation and Archiving Special Interest Group (PASIG) User Group Meeting, being held May 10-12 at the Oracle Convention Center in Redwood City, California.
This Week’s Computing Sciences Seminars
Managing Data Transfers in Cooperative Clusters with Orchestra
Tuesday, May 10, 11:00 am–12:00 pm, 50F-1647
Mosharaf Chowdhury, University of California, Berkeley
Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this impact, there has been relatively little work on optimizing the performance of these data transfers. We propose a global management architecture and a set of algorithms that improve the transfer times of common communication patterns, such as broadcast and shuffle, and allow one to prioritize a transfer over other transfers belonging to the same application or different applications. A prototype implementation of our solution improves broadcast completion times by up to 4.5X and shuffle times by up to 1.5X compared to the status quo in Hadoop. Furthermore, we show that transfer-level scheduling can reduce the completion time of high-priority transfers by 1.7X.
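As a rough illustration of why transfer-level scheduling can help a high-priority transfer, the sketch below compares equal sharing with strict priority on a single bottleneck link. This is a toy model under stated assumptions (one link, fixed capacity, hypothetical transfer sizes), not the algorithms or architecture presented in the talk, and it is not meant to reproduce the reported speedups.

```python
def fair_share_finish_times(sizes, capacity):
    """Finish times when all active transfers split the link equally."""
    finish = {}
    elapsed = 0.0
    sent_each = 0.0  # data every still-active transfer has sent so far
    pending = sorted(sizes.items(), key=lambda kv: kv[1])
    for i, (name, size) in enumerate(pending):
        active = len(pending) - i
        # Each active transfer gets capacity / active until the smallest finishes.
        elapsed += (size - sent_each) * active / capacity
        sent_each = size
        finish[name] = elapsed
    return finish


def priority_finish_times(sizes, capacity, order):
    """Finish times when transfers use the full link one at a time, in `order`."""
    finish = {}
    elapsed = 0.0
    for name in order:
        elapsed += sizes[name] / capacity
        finish[name] = elapsed
    return finish


# Hypothetical sizes (data units) and link capacity (units per second).
sizes = {"high": 4.0, "low": 12.0}
fair = fair_share_finish_times(sizes, capacity=2.0)
prio = priority_finish_times(sizes, capacity=2.0, order=["high", "low"])
print("fair share:", fair)
print("priority:  ", prio)
```

In this toy case the high-priority transfer finishes in 2 seconds instead of 4, while the low-priority transfer still finishes at 8 seconds, because the link is fully used either way.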
Building Large Scale Dynamic Computing Infrastructures over Clouds
Wednesday, May 11, 3:00–4:00 pm, 50B-2222
Christine Morin, INRIA, France
The popularity of virtualization has paved the way for the advent of the cloud computing model. In cloud computing infrastructures, providers make use of virtualization technologies to offer flexible, on-demand provisioning of resources to customers. Combining both public and private infrastructures creates so-called hybrid clouds, allowing companies and institutions to manage their computing infrastructures in flexible ways and to dynamically take advantage of externally provided resources. Considering the growing needs for large computation power and the availability of a growing number of clouds distributed over the Internet and across the globe, our work focuses on two principal objectives: 1) leveraging virtualization and multiple cloud computing infrastructures to build distributed large scale computing platforms, 2) developing mechanisms to make these infrastructures more dynamic — thereby offering new ways to exploit the inherent dynamic nature of distributed clouds. First, we present how we build large scale computing infrastructures by harnessing resources from multiple distributed clouds. Then, we describe the different mechanisms we developed to allow efficient inter-cloud live migration, which is a major building block for taking advantage of dynamicity in distributed clouds.
Link of the Week: The Argumentative Theory of Reasoning
Why are humans so amazingly bad at reasoning in some contexts, and so amazingly good in others? In the paper “Why Do Humans Reason? Arguments for an Argumentative Theory,” cognitive scientists Hugo Mercier and Dan Sperber contend that reasoning was not designed to pursue the truth; it was designed by evolution to help us win arguments. That’s why they call it the Argumentative Theory of Reasoning. As they put it:
The evidence reviewed here shows not only that reasoning falls quite short of reliably delivering rational beliefs and rational decisions. It may even be, in a variety of cases, detrimental to rationality. Reasoning can lead to poor outcomes, not because humans are bad at it, but because they systematically strive for arguments that justify their beliefs or their actions. This explains the confirmation bias, motivated reasoning, and reason-based choice, among other things.
The good news: When people are able to discuss their ideas with other people who disagree with them, then the confirmation biases of the different participants will balance each other out, and the group will be able to focus on the best solution. Thus, reasoning works much better in groups. This observation has major implications for education and politics.
Mercier discusses this theory in a conversation with Edge. Read more.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are Department of Energy Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has had its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.