Interested Staff Invited to Apply to Attend 5-Day Machine Learning Hackathon
April 2-6 Workshop Organized by Computing Sciences, Biosciences
March 1, 2018
Staff in the lab's Computing Sciences and Biosciences areas are invited to participate in a weeklong workshop focusing on machine learning in data science. The goal of the workshop, to be held April 2-6, is to build bridges between Computing Sciences and Biosciences through a common foundation in statistical computing.
The course is open to all staff in Biosciences and Computing Sciences, but prerequisite training in basic Python, basic linear algebra and basic to intermediate statistics is required. Attendance is limited to 26 participants. All sessions will be held in Bldg. 59, room 4102 at Berkeley Lab.
The first three days will consist of training in machine learning led by the Data Incubator, a company known for training scientists for data science careers in industry. Topics covered will include K-nearest neighbors; unsupervised learning; bias, variance and overfitting; the scikit-learn workflow; learning and metrics; and linear and logistic regression.
In the first portion of the course, students will develop a series of models to predict a venue’s star rating from various features. Working from 100 MB of real-world data, they will start with location-based models before building models based on other attributes of the venues. Finally, an ensemble model will blend the individual models into a final prediction of the venue’s popularity.
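For readers unfamiliar with this kind of exercise, the following is a minimal sketch of the idea, not the course material itself. It is written in plain Python so that it runs without extra libraries, whereas the course works in scikit-learn. The venue records, the `category` attribute, the function names and the 50/50 blending weight are all made up for illustration.

```python
import math
from statistics import mean


def distance(a, b):
    """Straight-line distance between two (x, y) points (a simplification for lat/lon)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def knn_predict(train, location, k=3):
    """Location-based model: average the ratings of the k nearest training venues."""
    nearest = sorted(train, key=lambda v: distance(v["loc"], location))[:k]
    return mean(v["stars"] for v in nearest)


def category_predict(train, category):
    """Attribute-based model: average rating of venues sharing the same category."""
    same = [v["stars"] for v in train if v["category"] == category]
    # Fall back to the overall average when the category was never seen.
    return mean(same) if same else mean(v["stars"] for v in train)


def blend_predict(train, venue, weight=0.5, k=3):
    """Ensemble: weighted average of the location and attribute predictions."""
    by_location = knn_predict(train, venue["loc"], k)
    by_category = category_predict(train, venue["category"])
    return weight * by_location + (1 - weight) * by_category


# Hypothetical toy data: two clusters of venues with different typical ratings.
train = [
    {"loc": (0.0, 0.0), "category": "cafe", "stars": 4.0},
    {"loc": (0.1, 0.0), "category": "cafe", "stars": 5.0},
    {"loc": (0.0, 0.1), "category": "bar", "stars": 3.0},
    {"loc": (5.0, 5.0), "category": "bar", "stars": 2.0},
    {"loc": (5.1, 5.0), "category": "cafe", "stars": 3.0},
    {"loc": (5.0, 5.1), "category": "bar", "stars": 1.0},
]

new_venue = {"loc": (0.05, 0.05), "category": "bar"}
print(knn_predict(train, new_venue["loc"]))              # location-only estimate
print(category_predict(train, new_venue["category"]))    # attribute-only estimate
print(blend_predict(train, new_venue))                   # blended estimate
```

In this toy case the location model is pulled up by well-rated neighbors while the category model is pulled down by poorly rated bars elsewhere, and the blend lands between the two. A real ensemble would typically learn the weights from held-out data rather than fix them by hand.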
The final two days of the week will be spent applying these new skills to solve problems faced by Berkeley Lab staff. Projects will be proposed by the participants and should be relevant to the mission of Berkeley Lab. One week prior to the hackathon, selected participants will receive a survey with a list of potential projects. Participants will be asked to rank each project, and the organizing committee will match participants to projects.
Participants must attend the entire week, participate fully in the hackathon and share their results with the organizing committee. Participants will have the opportunity to present their results to Biosciences and Computing Sciences leadership.
How to apply
To apply, fill out a survey by Friday, March 9, including a one- to two-paragraph statement of why you would like to participate and what you expect to get out of the session, plus a short project proposal with links to the data sets that will be used during the hackathon. You must also acknowledge that you have met the prerequisites for the course. Successful applicants will be notified by Friday, March 16.
The course is being organized by Kjiersten Fagnan, NERSC/JGI; Andrew Wiedlea, IT Division; Hector Garcia Martin, Joint BioEnergy Institute; Mariam Kiran, ESnet; Ben Bowen, Environmental Genomics and Systems Division; and Kris Bouchard, Biological Systems and Engineering Division.
About Computing Sciences at Berkeley Lab
High performance computing plays a critical role in scientific discovery. Researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab’s Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.