Berkeley Lab Mathematicians Develop Framework for Nanocrystallography Analysis
January 14, 2014 Tags: NERSC
Contact: Jon Bashor, firstname.lastname@example.org, 510-486-5849
X-ray crystallography, an imaging technique used to study the atomic structure of materials using specialized facilities such as the Advanced Light Source at Lawrence Berkeley National Laboratory, has given scientists new insight into areas ranging from genomics to photosynthesis to bone disease. Researchers can infer the structure of a material by examining how it scatters or diffracts the X-rays.
But the study of some materials is beyond the reach of conventional X-ray crystallography. With improvements in light source technology, there is an emerging field of X-ray nanocrystallography, in which crystals less than a micron in size can be studied. Although the field offers great promise for studying macromolecules, such as the proteins found in cell membranes, there are a number of challenges in using the technique.
In a paper published Dec. 16, 2013 in the online edition of the Proceedings of the National Academy of Sciences, Berkeley applied mathematicians Jeffrey Donatelli and James Sethian describe their development of mathematical tools to address some of these issues in an article titled “Algorithmic framework for X-ray nanocrystallographic reconstruction in the presence of the indexing ambiguity.” The work is part of a new project, led by Sethian, known as CAMERA (The Center for Applied Mathematics in Energy Research Applications) to design and apply mathematical solutions to data and imaging problems at Berkeley Lab facilities funded by the Department of Energy’s Office of Basic Energy Sciences.
Whereas conventional crystallography can often be performed with a single crystal or an assembly of larger crystals, which can be mounted on a stage and imaged from all directions to generate a 3D image of the structure, nanocrystallography experiments typically produce two-dimensional images of several thousand individual nanocrystals.
The nanocrystals are imaged as they are ejected by a liquid jet (similar to that used in an ink jet printer). As they tumble from the jet, the nanocrystals are blasted by X-rays, which have to be extremely powerful because the crystals are so small. But X-rays that powerful destroy the nanocrystals in the process, so the pulses last just a few femtoseconds, short enough to capture an image before the crystal disintegrates. Because the resulting time window is so small, it's critical that scientists have the capability to decode the images generated by the experiment and combine them to understand the overall structure of the material being studied.
One of the challenges in determining nanocrystal structure is that a number of parameters, including crystal size and orientation, are initially unknown and must be resolved. Adding to the difficulty is the fact that the images are very “noisy,” making them hard to interpret. A technique known as autoindexing, which is used in standard crystallography, resolves some of the parameters, but in many cases it does not yield the full orientation of the nanocrystals, which can lead to incomplete data about the structure. The problem is known as “indexing ambiguity.”
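The ambiguity can be illustrated with a small toy model (a hypothetical Python sketch, not the authors' code): when a crystal's lattice has more symmetry than the molecule sitting inside it, two different crystal orientations produce Bragg spots at exactly the same positions, so indexing from spot positions alone cannot tell the orientations apart. Only the spot intensities, which depend on the molecular contents, differ.

```python
# Toy 2D example of the indexing ambiguity (hypothetical, for illustration).
# A square reciprocal lattice of Bragg peak positions is unchanged by a
# 90-degree rotation, so position-only indexing cannot distinguish the
# rotated crystal from the unrotated one.

def rot90(p):
    """Rotate a 2D lattice point by 90 degrees."""
    x, y = p
    return (-y, x)

# Bragg peak positions on a small square grid.
peaks = {(h, k) for h in range(-3, 4) for k in range(-3, 4)}
rotated = {rot90(p) for p in peaks}
print(peaks == rotated)  # True: the positions alone are indistinguishable

# An asymmetric toy intensity model (made up, not from the paper) standing
# in for the molecule's electron density: intensities DO change under the
# rotation, which is the extra information needed to resolve the ambiguity.
intensity = {p: (2 * p[0] + p[1]) ** 2 for p in peaks}
rotated_intensity = {rot90(p): v for p, v in intensity.items()}
print(intensity == rotated_intensity)  # False: intensities break the tie
```

In this toy, a reconstruction that averages intensities over the wrong orientation assignment would blur the two cases together, which is the kind of corruption the authors' framework is designed to avoid.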
Enter Donatelli and Sethian, who hold joint positions in the Berkeley Lab Mathematics Group and the UC Berkeley Department of Mathematics.
“There are several problems caused by the shot-to-shot variability in the crystals, noise and lack of complete orientation information, which makes determining the structure challenging,” Donatelli said. “We developed an algorithmic framework to reduce the variance in the data and resolve the missing information, allowing us to determine the structure.”
Understanding the structure is a multi-step process. First, autoindexing is applied to the two-dimensional diffraction patterns. These patterns show bright spots known as Bragg peaks, which correspond to the varying electron densities of the molecule and allow the researchers to relate each 2D image to a 3D model. The second step is to analyze the intensity of the signal around each Bragg peak, which helps determine the size of the crystal. Once the size is determined, scientists can use other methods to determine the orientation of the crystal, removing the indexing ambiguity.
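The size-determination step rests on a standard diffraction fact: the intensity profile around a Bragg peak follows an interference function whose width shrinks as the crystal grows, so measuring peak width estimates the number of unit cells. A rough Python sketch of this relationship in one dimension (an illustrative toy, not the authors' algorithm; the function names are made up):

```python
import math

def peak_profile(x, n_cells):
    """Normalized intensity near a Bragg peak for a 1D crystal of n_cells
    unit cells: the interference function sin^2(pi*N*x) / (N^2 sin^2(pi*x)),
    where x is the distance from the peak center in reciprocal-lattice units."""
    if abs(x) < 1e-12:
        return 1.0  # limit at the peak center
    return (math.sin(math.pi * n_cells * x) ** 2) / (
        n_cells ** 2 * math.sin(math.pi * x) ** 2)

def half_width(n_cells, step=1e-5):
    """Scan outward from the peak center until intensity drops below half."""
    x = 0.0
    while peak_profile(x, n_cells) > 0.5:
        x += step
    return x

# The half-width scales roughly as 1/N, so a 4x larger crystal gives a
# peak about 4x narrower -- inverting this recovers the crystal size.
print(half_width(10) / half_width(40))  # close to 4
```

Real data adds noise and shot-to-shot variation in crystal size, which is why the paper's framework focuses on reducing variance across the thousands of individual images before this kind of inversion is attempted.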
Because the field of X-ray nanocrystallography is quite new – with the first facilities coming online only in the last two or three years – there is a dearth of real data to work with, Sethian said. But as more facilities come online, there is a need to begin working now on analyzing the resulting data.
So, he and Donatelli used synthesized data based on the known structure of an enzyme called PuuE Allantoinase. They factored in the parameters and image noise levels found in current experiments.
“Simulating the data gave us a controlled environment to work in, allowing us insight into the fundamental problems and how to solve them,” Donatelli said.
Although the framework was developed for nanocrystals, the authors note that many of the key elements can also be applied to larger crystals.
Read the paper at: http://www.pnas.org/content/early/2013/12/13/1321790111.
About Berkeley Lab Computing Sciences
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe. ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 5,500 scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation.