Interested Staff Invited to Apply to Attend 5-Day Machine Learning Hackathon

April 2-6 Workshop Organized by Computing Sciences, Biosciences

March 1, 2018

Staff in the lab's Computing Sciences and Biosciences areas are invited to participate in a weeklong workshop focusing on machine learning in data science. The goal of the workshop, to be held April 2-6, is to build bridges between Computing Sciences and Biosciences through a common foundation in statistical computing.


The course is open to all staff in Biosciences and Computing Sciences, but applicants must already have training in basic Python, basic linear algebra and basic to intermediate statistics. Attendance is limited to 26 participants. All sessions will be held in Bldg. 59, room 4102 at Berkeley Lab.

Course Description

The first three days will consist of training in machine learning led by the Data Incubator, a company known for training scientists for data science careers in industry. Topics will include K-nearest neighbors; unsupervised learning; bias, variance and overfitting; the scikit-learn workflow; learning and metrics; and linear and logistic regression.

In the first portion of the course, students will develop a series of models to predict a venue’s star rating from various features. Working from 100MB of real-world data, they will start with location-based models before building models based on other attributes of the venues. Finally, an ensemble model will blend the individual models into a final prediction of the venue’s popularity.
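For readers unfamiliar with the approach, the workflow described above (a location-based model, an attribute-based model, and an ensemble that blends them) can be sketched in a few lines of Python. This is an illustration only, not the Data Incubator's course material: the venue records, the price-level attribute, the function names and the equal blend weights are all hypothetical, and plain Euclidean distance on latitude and longitude stands in for a proper geographic distance. The course itself uses scikit-learn; this sketch uses only the standard library so that it runs anywhere.

```python
import math

# Hypothetical toy records: (latitude, longitude, price_level, star_rating).
# These values are made up for illustration; they are not the course data.
VENUES = [
    (37.87, -122.27, 1, 3.5),
    (37.88, -122.26, 2, 4.0),
    (37.80, -122.41, 3, 4.5),
    (37.79, -122.40, 3, 4.0),
    (37.77, -122.42, 2, 3.0),
    (37.86, -122.25, 1, 3.0),
]


def knn_location_model(train, k):
    """Return a predictor that averages the ratings of the k nearest venues.

    Distance is plain Euclidean distance on (lat, lon), a simplification.
    """
    def predict(lat, lon):
        nearest = sorted(
            train, key=lambda v: math.hypot(v[0] - lat, v[1] - lon)
        )[:k]
        return sum(v[3] for v in nearest) / len(nearest)
    return predict


def attribute_model(train):
    """Return a predictor that uses the mean rating for each price level.

    Price levels not seen in training fall back to the overall mean rating.
    """
    totals = {}
    for _, _, price, stars in train:
        running_sum, count = totals.get(price, (0.0, 0))
        totals[price] = (running_sum + stars, count + 1)
    means = {p: s / n for p, (s, n) in totals.items()}
    overall = sum(v[3] for v in train) / len(train)

    def predict(price):
        return means.get(price, overall)
    return predict


def blend(predictions, weights):
    """Weighted average of the individual model predictions."""
    return sum(p * w for p, w in zip(predictions, weights)) / sum(weights)


location = knn_location_model(VENUES, k=2)
attributes = attribute_model(VENUES)


def ensemble_predict(lat, lon, price):
    """Blend the location and attribute models with equal (assumed) weights."""
    return blend([location(lat, lon), attributes(price)], weights=[0.5, 0.5])
```

Calling `ensemble_predict(37.80, -122.41, 3)` returns a blended rating for a venue at that location and price level. In practice the blend weights would be chosen on held-out data rather than fixed by hand, which is where the course topics of bias, variance and overfitting come in.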

The final two days of the week will be spent applying these new skills to problems faced by Berkeley Lab staff. Projects will be proposed by the participants and should be relevant to the mission of Berkeley Lab. One week prior to the hackathon, selected participants will receive a survey with a list of potential projects. Participants will be asked to rank the projects, and the organizing committee will match participants to projects.

Participants must attend the entire week, participate fully in the hackathon and share their results with the organizing committee. Participants will have the opportunity to present their results to Biosciences and Computing Sciences leadership.

How to apply

To apply, fill out a survey by Friday, March 9, including a one-to-two paragraph statement of why you would like to participate, what you expect to get out of the session, and a short project proposal including links to the data sets that will be used during the hackathon. You must also acknowledge you've met the prerequisites for the course. Successful applicants will be notified by Friday, March 16.

The course is being organized by Kjiersten Fagnan of NERSC/JGI; Andrew Wiedlea, IT Division; Hector Garcia Martin, Joint BioEnergy Institute; Mariam Kiran, ESnet; Ben Bowen, Environmental Genomics and Systems Division; and Kris Bouchard, Biological Systems and Engineering Division.

About Computing Sciences at Berkeley Lab

High performance computing plays a critical role in scientific discovery, and researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab’s Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.

Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 13 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.

DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.