InTheLoop | 01.26.2015
New NERSC Blog Debuts
NERSC recently debuted its first staff blog. Written by NERSC Senior Science Advisor and User Services Group Lead Richard Gerber, the blog aims to give users an inside look at the day-to-day operations at NERSC with insights directly from the center's staff. One recent entry includes a report on upgrades to Edison's memory. Another offers the outlook on NERSC systems for 2015. There's also an option to subscribe to the blog via RSS. »Read more.
SC15 Workshop Submissions Due February 7
SC15's workshop submissions opened January 1 and are due February 7. This year's SC will include nearly 30 full-day and half-day workshops that complement the overall technical program events, with the goal of expanding the knowledge base of practitioners and researchers in a particular subject area. Proposals will be peer-reviewed academically with a focus on submissions that will inspire deep and interactive dialogue in topics of interest to the HPC community. For more SC15 deadlines, visit the conference's "important dates" page. »Read more.
This Week's CS Seminars
Neyman Seminar: Methods for Quantifying Conflict Casualties in Syria
Monday, Jan. 26, 4-5 p.m., 1011 Evans Hall, UC Berkeley
Rebecca Steorts, Department of Statistics, Carnegie Mellon University
Information about social entities is often spread across multiple large databases, each degraded by noise, and without unique identifiers shared across databases. Record linkage (reconstructing the actual entities and their attributes) is essential to using big data and is challenging not only for inference but also for computation. In this talk, I motivate record linkage by the current conflict in Syria. It has been tremendously well documented; however, we still do not know how many people have been killed in conflict-related violence. We describe a novel approach to estimating death counts in Syria and the challenges that are unique to this database. We first introduce a novel approach to record linkage by discovering a bipartite graph, which links manifest records to a common set of latent entities. Our model quantifies the uncertainty in the inference and propagates this uncertainty into subsequent analyses. We then introduce computational speed-ups, based upon locality-sensitive hashing from the computer science literature, that avoid all-to-all record comparisons. Finally, we speak to the successes and challenges of solving a problem that is at the forefront of national headlines and news.
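For readers unfamiliar with locality-sensitive hashing, the following is a minimal sketch of how it can limit record comparisons. It is not the speaker's method: the abstract does not say which hashing scheme is used, so MinHash with banding is assumed here, and the function names, parameters and sample records are all hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of character k-grams of a case- and space-normalized string."""
    s = " ".join(text.lower().split())
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{token}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature: the minimum hash value of the token set per seed."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(num_hashes))


def candidate_pairs(records, num_hashes=32, bands=16):
    """Return record-id pairs that share at least one LSH band bucket.

    Only these pairs would be passed on to a detailed comparison, instead of
    comparing every record with every other record.
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for rid, text in records.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(rid)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Made-up records for illustration only (not real data).
records = {
    "a": "John Smith 1970",
    "b": "john  smith 1970",   # same as "a" after normalization
    "c": "zzz qqq 2222",       # shares no 3-grams with "a" or "b"
}
pairs = candidate_pairs(records)
```

Records whose text is similar tend to fall into a shared bucket, while dissimilar records rarely do, so the expensive comparison step runs only on the returned candidate pairs. The number of bands and rows per band trades missed matches against extra comparisons.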
Spreadsheet Composition for Personalized Data Analysis and Presentation
Tuesday, Jan. 27, 10-11 a.m., Bldg. 50B, Room 4205
Massimo Maresca, ICSI, UC Berkeley, Univ. of Genoa, Italy, and SpreadSheetSpace
The Internet is a powerful global network infrastructure that provides basic communication services to enable the development of a variety of interaction paradigms. Examples include the Web, in which users extract content from Web sites through browsers; Peer to Peer, in which users simultaneously play the roles of information users and providers; and Service Composition, in which a set of software applications combine atomic functionalities typically made available in the form of Web Services.
In this talk I will introduce a novel interaction paradigm based on Spreadsheet Composition, in which users interact over a SpreadSheet Space, i.e., a virtual space for spreadsheets. In the same way as the World Wide Web initially extended the hypertext concept over the Internet, the SpreadSheet Space extends the natural linked structure of spreadsheets over the Internet.
Spreadsheet Composition differs significantly from Spreadsheet Sharing, as offered in cloud-based spreadsheet environments (e.g., Google Docs and Office 365): it is asymmetric, private and scalable, and it enables personalized data analyses and presentations. Asymmetry refers to the source-destination relationship typical of spreadsheet cells; privacy is obtained by leaving the unencrypted data under the user's administrative domain; scalability is achieved by supporting webs of spreadsheets in the SpreadSheet Space; and personalized analyses and presentations can be obtained by linking spreadsheets to external data. Spreadsheet Composition can be seen both as an instance of Service Composition, as the interacting spreadsheets play the role of processing entities that execute programs written in the form of spreadsheet formulas, and as an instance of Data Composition, as the interacting spreadsheets can be seen as parts of a live Distributed Spreadsheet located in a heterogeneous computer/storage environment under different administrative domains.
Spreadsheet Composition can be used to link spreadsheets in a peer-to-peer relationship, to distribute, collect and combine information, to access information exposed by corporate software platforms and ERPs, and to browse Open Data and Big Data reports. In addition, it enables spreadsheet-based ecosystems in which spreadsheet users receive, process, combine and provide information.
The presentation starts from the basic Service Composition concept and moves rapidly to Spreadsheet Composition. It introduces the basic interaction primitives and functionalities and presents the architecture of a software platform that supports them. It then discusses the distinctive elements of the Spreadsheet Composition paradigm, including spreadsheet sharing vs. spreadsheet linking, fine- vs. coarse-grained access to spreadsheet content, information system integration at the desktop, direct vs. decoupled database access, local vs. remote data storage, spreadsheet ecosystems, and static vs. dynamic open data.
CITRIS Research Exchange: Trends in Materials Science and Engineering (MSE) Education
Wednesday, Jan. 28, 12-1 p.m., Sutardja Dai Hall, 310, Banatao Auditorium
Reza Abbaschian, Dean and Distinguished Professor, University of California Riverside
Materials research over the last three decades has led to significant advances in the manufacturing of new materials with tailored and unique properties for a variety of applications. These advances have come about mostly because of a highly influential National Research Council study titled “Materials Science and Engineering (MSE) for the 1990’s: Maintaining Competitiveness in the Age of Materials,” which defined and delineated MSE around four basic elements: synthesis and processing, structure, properties, and performance. In this presentation, an overview of the recent trends in incorporating these four basic elements in MSE education will be given, followed by an example of the MSE program at the University of California, Riverside. The program offers interdisciplinary BS, MS and PhD degrees in MSE. At the undergraduate level, the program integrates with all five engineering departments in the Bourns College of Engineering (BCOE). This new approach allows students to have meaningful interactions with engineers from other disciplines, who ultimately are the “end-users” of materials. At the graduate level, the UCR MSE program integrates with other disciplines such as Physics and Chemistry.
Link of the Week: How Much Does the Web Weigh?
The founder of the Internet Archive once put 26,000 pounds of the web into a shipping container. Today it is much, much heavier. »Read more.