Before the image of the coronavirus virion and its prominent protein spikes became iconic in 2020, scientists were already exploring how those spikes bind protein-to-protein, looking for ways to disrupt that binding and, ultimately, epidemic infection. Now some of those researchers and many others, including epidemiologists and computer scientists, are tapping the HPC expertise and computational might of Department of Energy (DOE) supercomputers to accelerate their research.
“Since early spring, more than 275 scientists and operations staff in Berkeley Lab’s user facilities and major programs have been engaged in COVID-19 and SARS-CoV-2 research, much of it with support from the CARES Act through DOE’s National Virtual Biotechnology Laboratory (NVBL),” said Jeffrey Neaton, Associate Lab Director for Energy Sciences and Berkeley Lab’s NVBL lead for COVID-19 research.
In April 2020, the National Energy Research Scientific Computing Center (NERSC) joined the battle against the coronavirus pandemic as a member of the COVID-19 HPC Consortium of technology companies, federal agencies, and other national labs aiming to find innovative solutions to combat COVID-19. Since then, NERSC has allotted 2.5 million node hours on its Cori supercomputer and has provided dedicated HPC staff liaisons and other resources to support COVID-19 research. Of those 2.5 million node hours, 1.4 million have been allocated to Consortium projects and the rest to other COVID-19-focused research.
“NERSC has played a major role in the pandemic response as a partner in the COVID-19 HPC Consortium, providing researchers across the country with access to leading-edge computing resources to fight Covid,” said Neaton.
“We’re pleased to have made NERSC’s high performance computing, data systems, and staff expertise available so quickly to researchers to address such an urgent societal need,” said Richard Gerber, NERSC Senior Science Advisor and High Performance Computing Department Head. “We hope that the resources NERSC provides allow scientists to gain new insights into how to prevent, treat, and control the spread of the disease.”
Currently, NERSC is hosting 20 projects covering a wide swath of COVID-19 research, including repurposing of FDA-approved drugs; molecular simulations; explorations of resilience, temperature, and humidity effects on SARS-CoV-2; epidemic simulation models of the U.S. and other populations; and text mining of COVID-19 publications.
Exploring How Protein Spike Binding Hijacks Cells
One project involves a worldwide collaboration, led by Professor Wai-Yim Ching at the University of Missouri-Kansas City, that is seeking to understand in unprecedented detail, at the molecular and atomic level, how the COVID-19 virus enters and infects human cells. Harnessing fundamental theory, advanced algorithms, and Cori’s computing power, the team reported its first results in the August edition of the journal Physical Chemistry Chemical Physics.
In addition to a fatty membrane and the genetic material that takes over a cell, the SARS-CoV-2 virus contains four kinds of protein. The virus’s protein spikes attach to the cell’s ACE2 receptor and the virus enters the cell, injects its deadly cargo, and takes over to replicate itself. The ACE2 receptor — found on the surfaces of the human lung, artery, kidney, and intestinal cells — normally performs several important functions, such as regulating blood pressure. The virus’s spike protein abuses ACE2 in order to merge with the cell’s membrane. It then hijacks the cell, destroys the cell structure, and releases millions of viruses to fuel its rapid replication.
At the tiny scale of large molecules, some hundred thousand times smaller than the diameter of a human hair, quantum mechanics is required to describe the electrons acting as glue to bind the atoms together to form molecules and larger nanostructures. Because so many atoms are involved, this is a challenge for even the most powerful supercomputers. The project team used NERSC’s Cori supercomputer and advanced algorithms to determine the precise locations of the atoms involved in protein-spike binding to ACE2, improving the precision to better than 0.1 angstrom. The calculations run at NERSC reveal that the correct shape for attaching to ACE2 appears to be determined by a cluster of amino acids (the units that make up the spike protein) that carry large positive charges. This result provides insight into the infection process at the level of the molecular machinery of cells.
“The complexity of such calculations is unprecedented,” says Professor Ching. “The world-class NERSC facility and staff support is instrumental to my atomic-scale research on the COVID-19 virus.”
This ongoing research seeks to facilitate therapeutic development for COVID-19 by answering pressing questions such as whether dietary selenium deficiency increases death rates. By combining the best experimental microscopy techniques with quantum physics at NERSC, the results so far show positive steps toward stopping viral infection for this and future pandemics.
Creating Novel Treatments and Exploring Bioavailable Approaches with AI
Another project seeks to design peptides and small molecules that will bind to the coronavirus surface proteins and inhibit their binding to human proteins. The research team, a collaboration between Northwestern University, the Beckman Research Institute at City of Hope, and Translational Genomics Research Institute (TGen), has completed initial molecular dynamics simulations of the viral spike protein bound to human ACE2 (see inset) and is using them to inform its understanding of the role of glycan molecules. Spike proteins potentially use glycans to evade the human immune system, and the glycans may play other roles as well.
The team is also conducting a computational search, using the Rosetta algorithm, for peptide ligands that bind to the spike protein. The best peptide sequences will then be synthesized for rapid testing in in vitro assays for spike binding. The top peptide candidates will also be tested on SARS-CoV-2 grown in human tissue culture cells and in COVID-19 mouse models.
Because developing novel active therapeutics against coronaviruses such as the one responsible for COVID-19 can be a long, arduous process, another collaboration is looking for existing FDA-approved drugs that could be repurposed to combat COVID-19. Researchers from Harvard University and the Massachusetts Institute of Technology (MIT) are employing 3D machine learning techniques to accelerate the discovery of an existing small molecule that has already been tested and is bioavailable. They are using highly efficient electronic structure simulations to quickly calculate molecular conformations and to train 3D message-passing neural networks on existing molecular screens against the related SARS-CoV-1, and on SARS-CoV-2 data as it becomes available.
Massive Epidemiology Simulations to Inform U.S. Government COVID-19 Policy Decisions
A U.S. Department of Health and Human Services project is performing epidemiology simulations of the entire U.S. population to inform the White House Coronavirus Task Force about the number of COVID-19 cases projected to occur in various regions and the need for medical resources, and to understand the impact of social distancing and other interventions. The team is developing county-level mobility and population movement estimates and applying global circulation models, an agent-based epidemiological model used by the Federal Emergency Management Agency.
While individual states can be modeled on standard computing nodes, a model of the entire U.S., which includes interactions between states, requires much more memory. NERSC worked with the team to run a COVID-19 simulation for the full U.S. population; this simulation will be used to compare and validate the state-by-state models that are run frequently for different scenarios. Additional full U.S. simulations are planned at NERSC as the situation evolves.
In a related effort, Berkeley Lab’s ExaLearnEpi, part of the Exascale Computing Project, is developing a deep learning surrogate model for an epidemiology code called Corvid, an individual-based model that simulates the spread of SARS-CoV-2 in populations representing communities in the U.S. Together, these research methods serve to inform decision making around the best approach for non-pharmaceutical interventions and suppression efforts.
Contributing to Much More than the Burgeoning Body of COVID-19 Research
The COVID Scholar literature search portal was created in April to help researchers find of-the-moment research results related to the COVID-19 pandemic. Developed by a group of materials science researchers at Berkeley Lab using NERSC’s “Spin” containers-as-a-service platform and expert staff assistance, COVID Scholar is powered by natural language processing models that run daily on Cori. Its impact extends beyond the search capabilities provided by the portal. COVID Scholar is helping researchers at MIT enable its open access Rapid Reviews: COVID-19 (RR:C19) journal. RR:C19 accelerates peer review of COVID-19-related research preprints to advance new and important findings and prevent the spread of false or misleading scientific news. COVID Scholar methods are also being used to develop representations of genes, proteins, and patient symptoms for integration with the KG-COVID knowledge graph project at Berkeley Lab, and researchers at Pacific Northwest National Laboratory are using the COVID Scholar data stream in their own literature analysis tool. To date, COVID Scholar has been used by more than 15,000 researchers and contributes to the work of 500 scientists who use the portal on a regular basis.
Accessing COVID-19 Research Opportunities at NERSC
These are just a handful of examples of the exciting research being supported by NERSC, a DOE Office of Science user facility located at Lawrence Berkeley National Laboratory. Several of the projects are supported by the DOE Office of Science through the National Virtual Biotechnology Laboratory, a consortium of DOE national laboratories focused on the response to COVID-19, with funding provided by the Coronavirus CARES Act. NERSC regularly updates information on COVID-19-related research projects on its website. Beyond the current COVID-19 projects running on Cori, opportunities remain to access high performance computing, data resources, and staff expertise at NERSC for related research. All interested researchers are urged to submit a project proposal to use NERSC through the HPC Consortium’s research portal.