Machine Learning and Artificial Intelligence

Machine learning/AI plays a key role in our work in autonomous scientific discovery. Berkeley Lab’s strong machine learning/AI program helps our researchers use experimental and simulation data to recommend the next set of experiments. For example, active learning plays a large role in steering instruments at large experimental facilities around the world. As noisy data is collected sequentially, it becomes increasingly infeasible for a human to make optimal decisions about future measurements, especially in high-dimensional spaces. Autonomous experimentation often produces extremely sparse datasets, which makes robust uncertainty-quantification methods such as Gaussian processes a popular choice. Ensemble models combined with Bayesian inference are another approach used for recommendation engines. Similarly, automating target selection for molecular synthesis is a critical component of many self-driving labs in biology and chemistry.
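
The idea of letting a Gaussian process recommend the next measurement can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the method used at any particular facility: the `measure` function is a hypothetical stand-in for a noisy instrument, the kernel hyperparameters are fixed by hand, and the acquisition rule (measure where the posterior variance is largest) is only one of many possible choices.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
    """Squared-exponential kernel between two sets of 1-D points."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def measure(x, rng):
    """Hypothetical noisy instrument; replace with a real measurement."""
    return np.sin(6.0 * x) + 0.1 * rng.standard_normal(np.shape(x))

def run_active_learning(n_steps=10, seed=0):
    """Repeatedly measure at the candidate with the largest uncertainty."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 101)
    x_train = np.array([0.0, 1.0])          # two initial measurements
    y_train = measure(x_train, rng)
    for _ in range(n_steps):
        _, var = gp_posterior(x_train, y_train, candidates)
        x_next = candidates[np.argmax(var)]  # most uncertain location
        x_train = np.append(x_train, x_next)
        y_train = np.append(y_train, measure(x_next, rng))
    return x_train, y_train, candidates

x_train, y_train, candidates = run_active_learning()
```

In this sketch the posterior variance does not depend on the measured values, so the loop simply spreads points out over the domain; a realistic acquisition function would usually also weigh the posterior mean, measurement cost, or domain knowledge.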

Projects

gpCAM for Domain-Aware Autonomous Experimentation

gpCAM is an API and software tool designed to make autonomous data acquisition and analysis for experiments and simulations faster, simpler, and more widely available. At its core is a flexible and powerful Gaussian process regression engine. Its flexibility stems from gpCAM's modular design, which allows users to implement and import their own Python functions to customize and control almost every aspect of the software. That makes it possible to tune the algorithm to account for various kinds of physics and other domain knowledge and to identify interesting features and function characteristics. A specialized function optimizer in gpCAM can take advantage of HPC architectures for fast analysis and reactive autonomous data acquisition. Contact: Marcus Noack

Automated Protein Design

Goal-oriented design and optimization of proteins for biomaterials and bioproduction pathways is a grand challenge; meeting it would have a profound impact on the bioeconomy. Addressing this challenge requires the convergent development of novel AI methods and advances in protein design, integrated with computing technologies and biological experiments. A central component of the joint Computing Sciences Area and Biosciences Area strategy is to build core computational capabilities for applying AI approaches to protein design and optimization. Researchers in this emerging direction will have deep understanding of and expertise in both protein science and AI. Berkeley Lab can become a leader in this important new field by recruiting and hiring early- and mid-career researchers with these skills. Contact: Kris Bouchard

News

Berkeley Lab’s CAMERA Leads International Effort on Autonomous Scientific Discoveries

July 28, 2021

To make full use of modern instruments and facilities, researchers need new ways to decrease the amount of data required for scientific discovery and to cope with data acquisition rates that humans can no longer keep pace with.