InTheLoop | 08.04.2014
Former Alvarez Fellows Land Professorships
Two recipients of Berkeley Lab's prestigious Alvarez Fellowship were recently appointed to university faculty positions.
Lin Lin has been appointed assistant professor in the Department of Mathematics at the University of California, Berkeley. Lin, whose UC appointment was effective July 1, 2014, will also be a faculty scientist at Berkeley Lab.
Lin joined the Computational Research Division’s Scientific Computing Group as a Luis W. Alvarez Postdoctoral Fellow in 2011 after earning his Ph.D. in applied and computational mathematics from Princeton University. After completing his fellowship in 2013, he was appointed a research scientist at the lab. His current research focuses on applied and computational mathematics, computational quantum chemistry and materials science.
Didem Unat, of the Computational Research Division’s Future Technologies Group, has been appointed as a professor in the Computer Science and Engineering Department at Koç University in Istanbul, Turkey. She begins her new position in September.
Unat, who earned her Ph.D. from the University of California-San Diego, joined Berkeley Lab in 2012 as an Alvarez Fellow. Her research interest lies primarily in the areas of high performance computing, parallel programming models, compiler analysis and performance modeling. She recently received an Early Adopter grant of the NSF/IEEE-TCPP Curriculum Initiative on Parallel and Distributed Computing.
Aug. 15 Deadline for SC14 Visualization and Data Analytics Showcase
The SC14 conference is seeking submissions for the SC14 Visualization and Data Analytics Showcase, which provides a forum for the year's most instrumental movies in HPC. The deadline for submissions has been extended to Friday, Aug. 15. SC14 will be held Nov. 16-21 in New Orleans.
This year’s showcase has a new format: six finalists will compete for the Best Visualization Award, and each finalist will present his or her movie in a 15-minute presentation during a dedicated session at SC14. Movies will be judged on how well they illuminate science, on the quality of the movie, and on innovations in the process used to create it. The accepted submissions will appear as short papers on the SC14 Webpage and archive.
Submissions should include a movie (up to 250MB in size) and a short paper (up to 4 pages including references). The short paper should describe the scientific story conveyed by the movie, how the visualization helps scientific discovery, and the "state-of-the-practice" information behind making the movie.
Friday Deadline for Submissions to 2015 Tapia Celebration of Diversity in Computing
The 2015 ACM Richard Tapia Celebration of Diversity in Computing Conference call for participation closes at 11:59 PM Pacific time this Friday, August 8. All submissions for Birds of a Feather (BoF) sessions, Panels, Posters, Workshops and/or Doctoral Consortia are due at that time. This year's conference will be held February 18-21, 2015 in Boston.
The goal of the Tapia Conferences is to bring together undergraduate and graduate students, faculty, researchers, and professionals in computing from all backgrounds and ethnicities to
- Celebrate the diversity that exists in computing;
- Connect with others with common backgrounds, ethnicities, disabilities, and gender so as to create communities that extend beyond the conference;
- Obtain advice from and make contacts with computing leaders in academia and industry;
- Be inspired by great presentations and conversations with leaders with common backgrounds.
This year’s conference theme is "Diversity at Scale" as the Tapia Conference celebrates efforts to move diversity in all aspects of computing beyond conversation and study into full practice and implementation.
This Week's CS Seminars
Learning and inference with Statistical Relational Learning: theory and applications
Monday, August 4, 10am - 11:00am, Bldg. 50F, Room 1647
Jaesik Choi, Ulsan National Institute of Science and Technology (UNIST), South Korea
Probabilistic Graphical Models (PGMs) promise to play a prominent role in many complex real-world systems. Statistical Relational Learning (SRL) models scale the representation and learning of PGMs. Answering questions using SRL models enables many current and future applications, such as social network analysis, financial market prediction, environmental sensing, and large-scale image analysis. Scaling inference algorithms for large models is a key challenge for scaling up current applications and enabling future ones.
This talk will cover recent advances and developments in SRL models at the Probabilistic Artificial Intelligence Lab at UNIST. The topics include linear-time Kalman filtering, variational inference algorithms and state estimation in dynamic relational models.
In the second part, some state-of-the-art applications derived from the machine learning algorithms will be introduced. Examples include automatic detection of suspicious activities in complex systems, human action recognition in videos with a wearable device, face detection in distorted images and big data archiving in spatio-temporal sensor networks.
Big Data Analytics on Financial Data on a Massive Scale
Monday, August 4, 2pm - 3:00pm, Bldg. 50A, Room 5132
Kurt Stockinger, Zurich University of Applied Sciences, Switzerland
The recent financial crisis revealed the unstable nature of the financial system. Even today a comparison of risk exposures between banks is almost impossible since there is no globally agreed standard for modeling financial contracts.
The whole financial system is estimated to consist of billions of financial contracts. To simulate the future cash flows of these contracts and possible future macro-scenarios, typically stress tests and Monte Carlo methods are used. For decent statistical precision, a simulation should contain thousands of scenarios resulting in Petabytes of so-called basic analysis data. Analyzing these tremendous amounts of data requires highly-scalable Big Data analytics.
In this talk we will address the Big Data challenges that arise when designing large-scale massively parallel financial simulations. We give insights into the cloud-based Big Data architecture and sketch scenarios that enable querying the data to show potential weaknesses in the financial system.
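As a rough illustration of the Monte Carlo approach mentioned in the abstract, the sketch below values a single contract under many randomly drawn interest-rate scenarios. It is not the speaker's system: the bond's cash flows, the normally distributed rate model, and the names `present_value` and `simulate` are all assumptions made for this example.

```python
import random
import statistics


def present_value(cash_flows, rate):
    """Discount yearly cash flows (paid at the end of year 1, 2, ...) at a flat annual rate."""
    return sum(cf / (1.0 + rate) ** year
               for year, cf in enumerate(cash_flows, start=1))


def simulate(cash_flows, n_scenarios, mean_rate=0.03, rate_stdev=0.01, seed=0):
    """Value the contract under n_scenarios randomly drawn flat interest rates.

    The normal rate model is an illustrative assumption, not a calibrated one.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [present_value(cash_flows, rng.gauss(mean_rate, rate_stdev))
            for _ in range(n_scenarios)]


# Hypothetical contract: a five-year bond paying a coupon of 5 on a face value of 100.
bond = [5.0, 5.0, 5.0, 5.0, 105.0]
values = simulate(bond, n_scenarios=10_000)

mean_value = statistics.fmean(values)
# Value at the 5th percentile of the scenarios: a simple "stressed" figure.
stressed_value = sorted(values)[len(values) // 20]
```

A full simulation would repeat this for billions of contracts and keep the per-scenario results, which is where the data volumes described in the abstract come from.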
Multilevel Programming Paradigm for Exascale Computing; YML-XMP experiments as example of graphs of PGAS-written tasks on supercomputers
Tuesday, August 5, 11am - 11:45am, Bldg. 50F, Room 1647
Serge G. Petiton, Laboratoire d’Informatique Fondamentale de Lille, University of Lille, Sciences and Technologies and Maison de la Simulation, CNRS, Saclay, France
Exascale hypercomputers are expected to have highly hierarchical architectures with nodes composed of processors and accelerators. Methods have to be redesigned and new ones introduced or rehabilitated in terms of communication optimizations and data distribution.
The different programming levels (from loosely connected clusters of processors to tightly connected many-core processors and/or accelerators) will generate difficult new algorithmic issues. New languages and frameworks should be defined and evaluated with respect to the state of the art in scientific methods. We propose a framework, called YML (yml.prism.uvsq.fr), associated with a multilevel programming paradigm, to explore extreme computing and avoid costly global communications and reductions.
YML, with its high-level language, makes it possible to automate the management of dependencies between loosely coupled clusters of processors and to delegate it to a specialized tool that controls the execution of the application. In addition, the tightly coupled processors inside each cluster can be programmed in a PGAS language such as XMP. Thanks to the component-oriented software architecture of YML, it is relatively easy to integrate new components such as numerical libraries, encapsulated XMP programs for lower levels of the computer architecture, etc. Each component may also use runtime systems or tools to drive accelerators.
In this talk, we present this multilevel programming paradigm for exascale computing and propose our approach based on YML. We discuss the orchestration and scheduling strategies that need to be developed to minimize communications and I/O.
We present the Block Gauss-Jordan method to invert dense matrices, and the Multiple Explicitly Restarted Arnoldi Method (MERAM) to compute eigenvalues of sparse matrices, as case studies. We also propose experiments using components implemented in XMP and discuss projects using StarPU to address accelerators.
Experimental results were obtained on the Japanese K and T2K supercomputers, on the French Grid5000 platform, and on the “Hopper” supercomputer at LBNL.
We first draw conclusions about the correctness of this approach and then discuss the performance of these methods on the targeted multi-level parallel architectures in the context of the integrated multi-language YML/XMP framework. On the K computer, we obtained much better results using YML and XMP together than using XMP alone, which illustrates the value of our approach, even though supercomputer schedulers are not yet smart enough to exploit YML graph analysis.
Big Data, Page Ranking and Epidemic Spread Modeling
Tuesday, August 5, 2014, 11:50am - 12:30pm, Bldg. 50F, Room 1647
Nahid Emad, PRiSM laboratory & Maison de la Simulation, University of Versailles, France
The surge of medical and nutritional data in the field of health calls for research into models and methods as well as the development of data analysis tools. The spread of infectious diseases, the detection of biomarkers for prognosis and diagnosis of a disease, and the search for indicators for personalized nutrition and/or medical treatment are some typical examples of problems to solve. In this talk, we focus on the spread of contagious diseases and show how eigenvalue equations arise in models of infectious disease propagation and how they can support the vaccination campaigns carried out by health care organizations. A stochastic model based on PageRank makes it possible to simulate the epidemic spread: a PageRank-like infection vector is calculated to help establish an efficient vaccination strategy. Due to the size and the particular structure of the underlying social networks, this calculation requires considerable computational resources as well as storage for very large quantities of data, and represents a big challenge in high performance computing. The computation methods of PageRank in this context are explored. The experiments take into account very large networks of individuals, which raises the challenge of handling very large graphs with complex structure. The computational challenges of some other applications, such as identification of biomarkers for prognosis and diagnosis of a disease, will also be discussed.
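For readers unfamiliar with PageRank, the sketch below shows the standard power iteration that computes a PageRank vector, which is the dominant eigenvector of a damped link matrix. It is a textbook version applied to a small made-up contact network, not the speaker's stochastic epidemic model; the `contacts` graph, the function name `pagerank`, and the damping value are assumptions chosen for illustration.

```python
def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for a PageRank-style score vector on a directed graph.

    `adjacency` maps each node to the list of nodes it links to; every node
    must appear as a key. Returns a dict of scores that sum to 1.
    """
    nodes = sorted(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adjacency[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        # Every other node passes its damped score equally to its neighbors.
        for v in nodes:
            out = adjacency[v]
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new_rank[w] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical contact network: person "a" is in contact with everyone else.
contacts = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a"],
    "d": ["a"],
}
scores = pagerank(contacts)
most_central = max(scores, key=scores.get)
```

In this toy network the highest score goes to the person with the most contacts, which hints at how such a vector could help prioritize vaccination. On networks with millions of individuals the same iteration becomes a large sparse matrix-vector computation, the regime the abstract describes.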
Visualization of Google-Like Search over Data Warehouses
Wednesday, August 6, 11am - 12:00pm, Bldg. 50F, Room 1647
Kurt Stockinger, Zurich University of Applied Sciences, Switzerland
Data warehouses are omnipresent in almost all medium-sized to large enterprises. They integrate data from various divisions and departments and typically give a 360-degree view of a company. For Data Scientists these huge troves of data are sometimes considered to be gold mines that enable fact-based decision-making. However, for non-tech savvy business analysts it is a huge challenge to dig through this data. Often deep knowledge of database query languages such as SQL is required to retrieve the desired information.
In this talk we introduce SODA (Search over Data Warehouse), which enables users to intuitively explore enterprise-scale data warehouses via a Google-like search interface. We will present our experiments using SODA in a data warehouse of a major Swiss company and outline the recent addition of a graph-based visualization framework for SODA.
Autotuning Compilers in the Exascale Era
Friday, August 8, 11am - 12:30pm, Bldg. 50A, Room 5132
Mary Hall, School of Computing, University of Utah
Autotuning empirically evaluates a search space of possible implementations of a computation to identify the implementation that best meets its optimization criteria (e.g., performance, power, or both). Autotuning compilers generate this search space of different implementations either automatically or with programmer guidance. This talk will explore the role of compiler technology in achieving very high levels of performance, comparable to what is obtained manually by experts. It will focus on the optimizations required for specific domains: geometric multigrid, stencils, sparse matrix and tensor contraction computations. In an exascale regime where processors and memory systems are anticipated to be heterogeneous and hierarchical, exascale applications will need to partition computation across different processors and manage data placement and data transfer horizontally and vertically in the memory hierarchy. Autotuning is expected to play a crucial role in managing the unprecedented complexity of programming exascale systems.
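The empirical search at the heart of autotuning can be sketched in a few lines: run every candidate implementation, time it, and keep the fastest. The example below is a minimal illustration only, not the compiler framework discussed in the talk; the kernel `blocked_sum`, the helper `autotune`, and the candidate block sizes are hypothetical.

```python
import time


def blocked_sum(values, block):
    """Sum `values` in chunks of `block` elements; each block size is one variant."""
    total = 0.0
    for start in range(0, len(values), block):
        total += sum(values[start:start + block])
    return total


def autotune(kernel, data, candidates, repeats=3):
    """Time kernel(data, c) for every candidate c and return (fastest, timings)."""
    timings = {}
    for c in candidates:
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel(data, c)
            # Keep the best of several runs to reduce timing noise.
            best = min(best, time.perf_counter() - t0)
        timings[c] = best
    return min(timings, key=timings.get), timings


data = [float(i) for i in range(100_000)]
block_sizes = [16, 256, 4096]  # the search space of variants
best_block, timings = autotune(blocked_sum, data, block_sizes)
```

Which block size wins depends on the machine, which is the point: the choice is made by measurement rather than fixed in advance. An autotuning compiler generates a far larger space of variants and searches it with smarter strategies than exhaustive timing.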
Link of the Week: Happy Birthday DOE!
On August 4, 1977, President Jimmy Carter signed into law legislation that created the U.S. Department of Energy. One day later, President Carter swore in James Schlesinger, the first U.S. Secretary of Energy.
Celebrate the DOE's 37th birthday by testing your Energy Department knowledge.