CRD and Energy Technologies Area Staff and Korean Collaborators Win Best Paper Award
A team of researchers from CRD’s Scientific Data Management Group and the Energy Technologies Area, along with collaborators in South Korea, has won the Best Paper Award at the 2015 International Conference on Big Data Intelligence and Computing (DataCom 2015), to be held Dec. 19-21 in Chengdu, China. The paper, "Extracting Baseline Electricity Usage Using Gradient Tree Boosting," was written by Taehoon Kim, a former Berkeley Lab summer student, and Dongeun Lee and Jaesik Choi, all of the Statistical Artificial Intelligence Lab at UNIST in Korea and research affiliates with CRD; Anna Spurlock, an environmental economist in the Energy Efficiency Standards Group and the Electricity Markets and Policy Group in ETA; Alex Sim of the Scientific Data Management Group (SDM); Annika Todd, an experimental and behavioral economist in ETA who conducts research and analysis on energy efficiency, demand response and smart grid topics; and SDM Group Leader Kesheng John Wu.
Berkeley Lab, NERSC Researchers Explore Frontiers of Deep Learning for Science
Deep learning is not a new concept in academic circles or behind the scenes at “Big Data” companies like Google and Facebook, where algorithms for automated pattern recognition are a fundamental part of the infrastructure. But when it comes to applying these same tools to the extra-large scientific datasets that pass through the supercomputers at the National Energy Research Scientific Computing Center (NERSC) on a daily basis, it’s a different story.
Now a collaborative effort at Berkeley Lab is working to change this scenario by applying deep learning software tools developed for high performance computing environments to a number of "grand challenge" science problems running computations at NERSC and other supercomputing facilities. »Read more.
ESnet's Greg Bell Talks with HPCwire about What Makes a Great Network
Intrigued by the growing role of networking as an instrument of discovery, Tiffany Trader of the online newsletter HPCwire recently interviewed ESnet Director Greg Bell about what it takes to operate DOE’s international network.
In the article, Bell talks about the range of expertise ESnet’s staff brings to the table: “The network engineers on call 24-7, a cybersecurity team, storage experts, data collection and data analysis activities, and efforts engaged in building out the network. There is a team of people who build software tools to help the network be less of a black box. Then there is another team focused just on science engagement, helping scientists make the best possible use of the network and raising expectations about the network capabilities.”
Get a Move On and Make it to the CS Moving Party on Friday, Dec. 18
Friday, Dec. 18, is moving day for CS staff who will be occupying Wang Hall. After you’ve packed up that last box, take a break and wander over to Wang Hall for a snack, some pre-holiday cheer and your last chance to see the building before it fills up. All CS staff are invited, so stop by the main conference room for refreshments from 2:30 to 4 p.m.
Link of the Week: The Perfect Holiday Gift for that Dog Lover Who's Really into the Soviet Space Program
In 1957, a dog found wandering the streets of Moscow became the first Earth-born creature in space. Although Laika died during her journey, she helped launch a program in which eight other Soviet dogs rocketed to worldwide fame. Their images adorned toys, postcards, candy packages, books and more. A new book, Soviet Space Dogs, highlights these early stars of space travel with a collection of great images.
This Week's CS Seminars
PATHA: Performance Analysis Tool for HPC Applications
Wednesday, December 16, 2015, 11 a.m.-12 p.m., Bldg. 50F, Room 1647
Wucherl (William) Yoo, Scientific Data Management Group, Berkeley Lab
Large science projects rely on complex workflows to analyze terabytes or petabytes of data. These jobs often run over thousands of CPU cores while simultaneously performing data accesses, data movements, and computation, which makes it difficult to identify bottlenecks or debug performance issues in these large workflows. To address these challenges, we have developed the Performance Analysis Tool for HPC Applications (PATHA) using state-of-the-art open source big data processing tools. Our framework can ingest system logs to extract key performance measures, and apply sophisticated statistical tools and data mining methods to the performance data. It utilizes an efficient data processing engine to allow users to interactively analyze large amounts of different types of logs and measurements. To illustrate the functionality of PATHA, we conduct a case study on the workflows from an astronomy project known as the Palomar Transient Factory (PTF). Our study processed 1.6 TB of system logs collected on the NERSC supercomputer Edison. Using PATHA, we were able to identify performance bottlenecks, which reside in three tasks of the PTF workflow and depend on the density of celestial objects.
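The core idea of the abstract, ingesting timing records from system logs and ranking workflow tasks to find bottlenecks, can be illustrated with a minimal pure-Python sketch. The log format, task names, and timings below are hypothetical, and plain Python stands in for PATHA's actual data processing engine and statistical tooling:

```python
from collections import defaultdict

# Hypothetical log lines in the form "task_name start_sec end_sec".
# PATHA's real log formats and analysis pipeline differ; this only
# illustrates extracting per-task timings and ranking by total time.
LOG_LINES = [
    "source_extraction 0.0 42.5",
    "image_subtraction 42.5 61.0",
    "source_extraction 61.0 110.2",
    "candidate_matching 110.2 118.9",
]

def rank_tasks_by_total_time(lines):
    """Sum each task's durations and return tasks sorted slowest-first."""
    durations = defaultdict(list)
    for line in lines:
        task, start, end = line.split()
        durations[task].append(float(end) - float(start))
    totals = {task: sum(d) for task, d in durations.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

ranked = rank_tasks_by_total_time(LOG_LINES)
print(ranked[0])  # the candidate bottleneck task and its total time
```

At scale, the same group-and-aggregate pattern would run inside a distributed engine over terabytes of logs rather than an in-memory list.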