InTheLoop | 04.29.2013
ESnet and Partners to Create First 100 Gbps Research Link Across Atlantic
Six of the world’s leading research and education networks — ESnet, Internet2, NORDUnet, SURFnet, CANARIE and GÉANT — have announced their intent to build the world’s first 100 gigabits-per-second (Gbps) intercontinental transmission links for research and education. The project, called the “Advanced North Atlantic 100G Pilot” or ANA-100G, is aimed at stimulating the market for 100 Gbps intercontinental networking and advancing global networks and applications to benefit research and education.
“We believe scientific progress should be unconstrained by network capacity or geography — by the location of instruments, data, or people,” said ESnet Director Greg Bell. “This exciting pilot project is an important step in making that vision a reality, especially for research collaborations that span the Atlantic.” Read more.
NERSC Managers Shed Light on Edison in Q&A
The Department of Energy's National Energy Research Scientific Computing (NERSC) Center accepted the first phase of its new Cray Cascade system, named Edison. To find out the reasoning behind the design and deployment of Edison and what it means to NERSC's 4,500 users, Jon Bashor of Berkeley Lab Computing Sciences spoke with NERSC Division Director Sudip Dosanjh, NERSC Systems Department Head Jeff Broughton and Advanced Technologies Group Leader Nick Wright. Read more.
At 14, SuperLU Solver Library Still Growing in Popularity
Since its launch in 1999, the SuperLU software library for solving sparse linear systems of equations has become the third most downloaded software at Berkeley Lab. Between Oct. 1, 2011 and Sept. 30, 2012, SuperLU was downloaded 24,303 times, nearly a 50 percent increase over the 16,876 downloads the previous year. Development of SuperLU is led by CRD’s Sherry Li. Read more.
ESnet Co-Sponsors “Enlighten Your Research Global” Competition
Due to its increasingly collaborative nature, large-scale science is rapidly becoming completely dependent on a complex ecosystem of global research and education (R&E) networks. This is because unique, geographically dispersed scientific instruments and facilities need to be accessed and used remotely by thousands of researchers worldwide. Furthermore, these facilities create massive data sets — which for some experiments can range from terabytes to hundreds of petabytes — that have to be archived, catalogued, and analyzed using large-scale computing resources. The ability to reliably manage round-the-clock data flows between these supercomputers and experimental facilities is essential to conducting science in today’s research environment.
To ensure that researchers — no matter where they are located — can meet the time-critical needs of their research in this new paradigm, five National Research and Education Networks (NRENs) — ESnet, Funet, Internet2, Janet, and SURFnet — are bringing their resources together to jumpstart a new “Enlighten Your Research Global” initiative.
This brand-new competition solicits proposals from researchers and collaborations whose scientific output could benefit immediately from significantly augmented networking and computing resources that they have not traditionally had access to.
Through a lightweight two-step proposal process, researchers are asked to share details about their experiment or collaboration, their research goals, and the networking resources they may require to improve their research. In consultation with the five participating NRENs, researchers will develop a plan for a robust solution connecting their collaborators, experiments, and other resources on an international scale. The NRENs in return will develop a plan for supporting these needs, including those that can be provided at no cost to the research collaboration. Go here for submission information.
Supercomputers Help a Catalyst Reach Its Full Potential
Proton delivery and removal determines if a well-studied catalyst takes its highly productive form or twists into a less useful structure, according to scientists at Pacific Northwest National Laboratory. Using computing resources at NERSC and other DOE centers, the scientists found that the most productive structure for a well-known nickel-based electrocatalyst has key nitrogen-hydrogen bonds close to the nickel center. In this form, called endo/endo, the reaction occurs in a fraction of a second. If the catalyst is in any other form, the reaction takes days to complete. Read more.
Open Networking Foundation Names ESnet’s Inder Monga As One of 12 Research Associates
The Open Networking Foundation (ONF), a non-profit organization dedicated to promoting Software-Defined Networking (SDN), has announced the appointment of 12 Research Associates to the organization. ESnet Chief Technologist Inder Monga is one of nine new industry thought leaders named as a Research Associate for the coming year.
“This is a great honor and also very fitting recognition of Inder’s leadership and expertise in this developing field of SDN and the use of OpenFlow,” said ESnet Director Greg Bell. “He is one of only two Research Associates not working on research sponsored by a university, which also reflects ESnet’s contributions to high-performance networking research.” Read more.
Kathy Yelick Invited to Speak at Ginormous Systems Conference
Systems are expanding — from big to ginormous. They collect, analyze, represent, and model wide swaths of data. They help us understand and decide in near real time and are being adopted across myriad organizations because they are cheap and powerful. But complex systems can break in complex ways. Modern hardware and software have made it virtually impossible to identify flaws and vulnerabilities in systems and to ensure that they are secure and trustworthy. We seem to be creating systems and applications in which no one is in charge; how should we think about and plan for this?
These are some of the issues to be addressed at the TTI/Vanguard Ginormous Systems Conference being held April 30–May 1 in Washington, DC. Associate Lab Director Kathy Yelick has been invited to speak on “HPC Meets Big Data: Everything Old Is New Again.” Her talk will discuss some of the common (and distinct) challenges between data analysis and simulation, between science and commercial applications, and between cloud and HPC systems. She will also describe how these models will combine to transform computationally driven design and engineering.
Greg Bell’s “Network As Discovery Instrument” Talk on YouTube
ESnet Director Greg Bell gave a talk on “Network As Discovery Instrument: A Quick-Start Guide” at the DOE Joint Genome Institute’s eighth annual Genomics of Energy & Environment Meeting, which was held March 26–28 in Walnut Creek, CA. A YouTube video of the talk is now available here. You can also view it on SciVee.
David Camp to Present Paper at Eurographics Symposium
David Camp of the Visualization Group in CRD will present a paper at the Eurographics Symposium on Parallel Graphics and Visualization (EGPGV 2013), being held May 4–5 in Girona, Spain. His paper is titled “GPU Acceleration of Particle Advection Workloads in a Parallel, Distributed Memory Setting.” Co-authors from LBNL are Harinarayan Krishnan, Wes Bethel, Kenneth Joy, and Hank Childs.
Symposium on Visions of the Theory of Computing May 29–31
The Simons Institute for the Theory of Computing invites you to attend their Symposium on Visions of the Theory of Computing on May 29–31 in Berdahl Auditorium, Stanley Hall, UC Berkeley. This three-day event will bring together distinguished speakers and participants from the Bay Area and all over the world to celebrate both the excitement of fundamental research on the Theory of Computing, and the accomplishments and promise of computational research in effecting progress in other sciences — the two pillars of the Institute’s research agenda.
There is no registration fee, but for planning purposes registration is required by Friday, May 17. Go here for more information and to register.
This symposium immediately precedes the 45th ACM Symposium on the Theory of Computing (STOC 2013), which runs June 1–4, 2013 in Palo Alto, California.
This Week’s Computing Sciences Seminars
Dissertation Talk: Learning from Subsampled Data: Active and Randomized Strategies
Tuesday, April 30, 2:00–3:00 pm, 380 Soda Hall, UC Berkeley
Fabian Wauthier, UC Berkeley
In modern machine learning applications, we frequently encounter situations where data is either time-consuming or expensive to acquire. In these cases we may have to be satisfied with learning from a small subsample of data. Depending on the circumstances, subsampling can be active (i.e. active learning) or randomized. In this talk I will show work on both fronts.
Active Learning: As models become more complex, active learning tends to become harder. For example, consider complex Bayesian models, which often rely on an MCMC-based method for inference. Here, active learning is commonly thought to be computationally infeasible, since a naive scoring implementation would require running many additional MCMC chains. I propose a new MCMC-based framework for tractable approximate active learning which reuses samples from an existing MCMC chain for approximate scoring. This avoids running extra MCMC chains and outperforms the naive approach.
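The sample-reuse idea can be illustrated with a stock acquisition criterion. The sketch below scores unlabeled candidates for a Bayesian logistic-regression model using the BALD mutual-information criterion, computed entirely from a fixed set of parameter samples (drawn synthetically here to stand in for an existing MCMC chain). This is only an illustration of scoring without new MCMC runs; the talk's actual framework and scoring rule may differ.

```python
import numpy as np

def bald_scores(posterior_samples, candidates):
    """Score candidate points for labeling using only existing posterior
    samples of logistic-regression weights (no new MCMC runs).

    The score is the BALD mutual information between a candidate's label
    and the parameters: H(mean prediction) - mean H(prediction).
    """
    logits = candidates @ posterior_samples.T        # (n_cand, n_samples)
    p = 1.0 / (1.0 + np.exp(-logits))                # per-sample label probabilities

    def entropy(q):
        q = np.clip(q, 1e-12, 1.0 - 1e-12)
        return -(q * np.log(q) + (1.0 - q) * np.log(1.0 - q))

    marginal = entropy(p.mean(axis=1))               # entropy of the averaged prediction
    conditional = entropy(p).mean(axis=1)            # average per-sample entropy
    return marginal - conditional                    # mutual information, >= 0

rng = np.random.default_rng(0)
chain = rng.normal(size=(200, 2))     # stand-in for samples from an existing MCMC chain
candidates = rng.normal(size=(50, 2))
scores = bald_scores(chain, candidates)
best = candidates[np.argmax(scores)]  # most informative point to label next
```

Because binary entropy is concave, the marginal term always dominates the conditional term, so the scores are nonnegative by construction.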
Randomized Subsampling: A fundamental theoretical question related to randomized subsampling looks at the sample complexity of a statistical model. I will focus on two simple randomized algorithms for ranking from pairwise comparisons and will show sample complexities that in expectation achieve a corresponding lower bound. Additionally, the algorithms possess interesting recovery properties: One algorithm recovers the rank with uniform quality across a permutation, while the other recovers the rank more accurately near the top than the bottom.
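As a toy illustration of ranking from randomly sampled pairwise comparisons (not the speaker's algorithms), the sketch below scores each item by its empirical win rate over randomly drawn pairs, a simple Borda-style count:

```python
import random

def rank_from_pairwise(items, compare, num_samples, seed=0):
    """Rank items by empirical win rate over randomly sampled pairwise
    comparisons. `compare(a, b)` returns True if a beats b."""
    rng = random.Random(seed)
    wins = {x: 0 for x in items}
    trials = {x: 0 for x in items}
    for _ in range(num_samples):
        a, b = rng.sample(items, 2)      # draw a random pair to compare
        winner = a if compare(a, b) else b
        wins[winner] += 1
        trials[a] += 1
        trials[b] += 1
    rate = {x: wins[x] / max(trials[x], 1) for x in items}
    return sorted(items, key=lambda x: rate[x], reverse=True)

# Toy run: the true order is 0 > 1 > 2 > 3 and comparisons are noiseless.
ranking = rank_from_pairwise([3, 1, 0, 2], lambda a, b: a < b, num_samples=500)
```

With noisy comparisons, the number of samples needed to recover the ranking reliably is exactly the kind of sample-complexity question the talk addresses.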
DREAM Seminar: Going beyond the FPGA with Spacetime
Tuesday, April 30, 4:10–5:00 pm, 490H Cory Hall, UC Berkeley
Steve Teig, President and Chief Technology Officer, Tabula
The idea of dynamically reconfiguring programmable devices fascinated Turing in the 1930s. In the early 1990s, DeHon pioneered dynamic reconfiguration within FPGAs, but neither his nor numerous subsequent efforts, both academic and industrial, resulted in a useful and usable product. Over the last several years, we have significantly advanced the hardware, architecture, and software for rapidly reconfiguring, programmable logic: going beyond the FPGA using a body of technology we call “Spacetime”. Spacetime represents two spatial dimensions and one time dimension as a unified 3D framework: a powerful simplification that has enabled us to deliver in production a new category of programmable devices (“3PLDs”) that are far denser, faster, and more capable than FPGAs, yet still accompanied by software that automatically maps traditional RTL onto these exotic fabrics. In developing Spacetime, we encountered and resolved many complex technical challenges that any dense, high-performance reconfigurable device must face, many of which seem never even to have been identified, much less addressed, by any prior effort. In this talk, I will identify some key limitations of FPGAs, introduce Spacetime as a means of addressing them, enumerate some of the many challenges we faced, and present solutions to a couple of them.
Scientific Computing and Matrix Computations Seminar: A Framework for Low-Communication 1-D FFT
Wednesday, May 1, 12:10–1:00 pm, 380 Soda Hall, UC Berkeley
Peter Tang, Intel Corporation
In state-of-the-art high-performance computing on distributed-memory systems, communication often represents a significant part of the overall execution time, and quite likely consumes a major share of the total energy used. For distributed 1-D FFT, every industry-standard implementation performs three all-to-all internode data exchanges which make up the bulk of communication. We present here a mathematical framework for deriving a family of easy-to-implement single-all-to-all 1-D FFT algorithms. Furthermore, our framework allows tradeoff between accuracy and performance. Depending on the problem size and the computer system used, implementations at comparable accuracy based on our new approach can outperform leading FFT libraries by as much as twofold, higher still if reduced accuracy is acceptable.
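The all-to-all exchanges arise because standard distributed 1-D FFTs factor the length-N transform via the classic four-step (Cooley–Tukey) decomposition, in which each data transposition becomes an internode exchange. A single-node NumPy sketch of that baseline decomposition (the speaker's single-all-to-all variant is not shown here):

```python
import numpy as np

def four_step_fft(x, n1, n2):
    """Length n1*n2 1-D FFT via the classic four-step (Cooley-Tukey)
    decomposition. In a distributed implementation, the data reshuffles
    around the two row-FFT phases become the all-to-all exchanges."""
    n = n1 * n2
    b = x.reshape(n2, n1).T                    # b[j1, j2] = x[j1 + n1*j2]
    c = np.fft.fft(b, axis=1)                  # step 1: n1 FFTs of length n2
    j1 = np.arange(n1).reshape(-1, 1)
    k2 = np.arange(n2).reshape(1, -1)
    d = c * np.exp(-2j * np.pi * j1 * k2 / n)  # step 2: twiddle factors
    e = np.fft.fft(d, axis=0)                  # step 3: n2 FFTs of length n1
    return e.reshape(n)                        # step 4: e[k1, k2] -> X[n2*k1 + k2]

rng = np.random.default_rng(1)
x = rng.normal(size=32) + 1j * rng.normal(size=32)
X = four_step_fft(x, 4, 8)                     # matches np.fft.fft(x)
```

When the n1-by-n2 matrix is distributed by rows, each of the two FFT phases needs the opposite axis local, and the output must be reordered as well, which accounts for the three exchanges mentioned above.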
The Materials Project: Combining Density Functional Theory Calculations with Supercomputing Centers for New Materials Discovery
Thursday, May 2, 12:00–1:30 pm, OSF 943-238
Anubhav Jain, LBNL/EETD
New materials can potentially reduce the cost and improve the efficiency of solar photovoltaics, batteries, and catalysts, leading to broad societal impact. This talk describes a computational approach to materials design in which density functional theory (DFT) calculations are performed over very large computing resources. Because DFT calculations accurately predict many properties of new materials, this approach can screen tens of thousands of potential materials in short time frames.
We present some major software development efforts that generated over 10 million CPU-hours’ worth of materials information in the span of a few months using NERSC clusters. For the effort, we designed custom workflow software using Python and MongoDB. This represents one of the largest materials data sets ever computed, and the results are compiled on a public web site (The Materials Project) with over 3,000 registered users who are designing new materials with the computed information.
Finally, we describe future efforts in which algorithms might “self-learn” which chemical spaces are the most promising for investigation based on the results of previous computations, with application to solar water splitting materials.
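The document-oriented workflow pattern described above can be sketched in a few lines. The real system stores task documents in MongoDB and launches DFT calculations; here a plain Python dict stands in for the database, and all field names and values are illustrative:

```python
import datetime

# Illustrative task documents; the real system keeps these in MongoDB,
# and the structures and fields below are made up for the sketch.
tasks = {
    1: {"structure": "LiFePO4", "state": "READY"},
    2: {"structure": "NaCoO2", "state": "READY"},
}

def claim_next_task():
    """A worker claims one READY task. With MongoDB this would be an
    atomic find-and-modify on the state field."""
    for task_id, doc in tasks.items():
        if doc["state"] == "READY":
            doc["state"] = "RUNNING"
            doc["claimed"] = datetime.datetime.now().isoformat()
            return task_id
    return None

def complete_task(task_id, result):
    """Record the (placeholder) result and mark the task done."""
    tasks[task_id].update(state="COMPLETED", result=result)

tid = claim_next_task()
complete_task(tid, {"note": "placeholder result"})
```

Tracking every calculation as a small state-tagged document is what lets many independent workers on NERSC clusters pull work concurrently without a central scheduler process.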
Link of the Week: The End of the Web, Search, and Computer As We Know It
David Gelernter, a professor of computer science at Yale University who predicted the World Wide Web, says he’s often asked what the next web will be like. His answer: there won’t be a next web. The space-based web we currently have will gradually be replaced by a time-based worldstream.
It’s already happening in the form of blog posts and RSS feeds, Twitter and other chatstreams, and Facebook walls and timelines. Its structure represents a shift beyond the “flatland known as the desktop” (where our interfaces ignored the temporal dimension) towards streams, which flow and can therefore serve as a concrete representation of time. Read more.