On February 11 and 12, 2021, the CSA held its second annual Postdoc Symposium virtually. Twenty postdoctoral speakers working at the Lab presented their research to an audience of peers, mentors, and coworkers. View their individual presentations below.

“Data-Driven Models for Equation-Oriented Optimization”

Integrated Data Frameworks Group

Abstract: Equation-oriented (EO) optimization is a foundational technology widely used in the design and optimization of process and decision systems. However, for large multi-scale systems, the EO approach is often bottlenecked by the speed and complexity of the high-fidelity (HF) and computational fluid dynamics (CFD) sub-models present in the system. One way to address these challenges is to replace those complex HF and/or CFD models with accurate, low-fidelity data-driven models. In this talk, we highlight the challenges of using data-driven models within EO systems and present a framework for addressing them. We also introduce PySMO, a Python tool that provides a suite of user-friendly, effective methods for generating data-driven models and interfaces directly with an EO framework. The approach developed in this research allows us to solve previously intractable problems and to rapidly and effectively explore the design space of potential new technologies.
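
For readers unfamiliar with surrogate modeling, here is a minimal sketch of the general idea (not PySMO's actual API; the function and sampling choices are illustrative assumptions): sample an expensive high-fidelity model, fit a cheap data-driven surrogate to the samples, and optimize over the surrogate.

```python
# Minimal sketch (illustrative, not PySMO): replace an expensive
# high-fidelity model with a polynomial surrogate fit to sampled data,
# then optimize over the cheap surrogate.
import numpy as np
from scipy.optimize import minimize

def high_fidelity(x):
    # Stand-in for an expensive HF/CFD sub-model.
    return np.sin(3 * x) + 0.5 * x**2

# Sample the design space and fit a low-fidelity surrogate.
x_train = np.linspace(-2, 2, 25)
y_train = high_fidelity(x_train)
coeffs = np.polyfit(x_train, y_train, 6)   # least-squares polynomial fit
surrogate = np.poly1d(coeffs)

# Use the surrogate inside an optimization loop instead of the HF model.
result = minimize(lambda x: surrogate(x[0]), x0=[0.5])
print("surrogate minimum at x =", result.x[0])
```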

“Accurate Numerical Algorithm for Scientific Applications with Complex Geometries”

Applied Numerical Algorithms Group

Abstract: Many scientific and engineering applications require solving partial differential equations (PDEs) in the presence of geometries that are not aligned with a Cartesian grid. In contrast to finite element methods, we discretize the PDEs to take into account “cut cells,” where the domain boundaries lead to irregular geometries with differing cut-cell volumes and faces. The challenge for numerical algorithms on these grids is to achieve both high accuracy on irregular volumes and stability in the presence of small cut cells. We propose a method that uses small weighted least-squares linear systems to calculate consistent finite volume stencils in a stable and accurate way that handles the small-cell problem. The method is implemented in Chombo, a software package for higher-order adaptive mesh refinement (AMR) applications, and we demonstrate its stability and accuracy with a few simple tests on model problems.
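
A minimal one-dimensional sketch of the weighted least-squares idea (illustrative assumptions only, not the Chombo implementation): solve a small weighted system so the stencil reproduces low-degree polynomials exactly, with weights that de-emphasize distant cells.

```python
# Sketch: derive interpolation stencil weights from a small weighted
# least-squares system that reproduces polynomials up to degree 2.
import numpy as np

def lsq_stencil(neighbors, x0, weights):
    """Weights w such that sum_i w[i] * p(x_i) = p(x0) for quadratics p."""
    V = np.vander(neighbors, N=3, increasing=True)           # monomials at x_i
    m0 = np.vander(np.array([x0]), N=3, increasing=True)[0]  # monomials at x0
    W = np.diag(weights)
    # Minimum-norm weighted solution of the small system (W V)^T u = m0,
    # mapped back to stencil weights w = W u.
    u, *_ = np.linalg.lstsq((W @ V).T, m0, rcond=None)
    return W @ u

neighbors = np.array([-1.0, -0.3, 0.4, 1.1])       # nearby cell centroids
weights = 1.0 / (1.0 + np.abs(neighbors - 0.1))    # downweight far cells
w = lsq_stencil(neighbors, 0.1, weights)
print("weights:", w, "sum (should be 1):", w.sum())
```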

“Mapping the Universe Using Generative Adversarial Neural Networks”

Computational Cosmology Group

Abstract: To understand the properties of our universe and test theories about its origin, it is essential to produce maps of how matter is distributed in the universe. Currently, these maps are obtained by simulating interactions between matter particles. However, these first-principles simulations are computationally expensive, taking several days of computer time. Machine learning techniques called generative models have proven very successful at learning patterns in image data and then generating synthetic images that resemble them. In this work, we use generative adversarial neural networks (GANs) to generate matter distributions quickly and efficiently. After being trained on images obtained from simulations, these networks can produce new maps of the universe that resemble the simulation images while taking only a fraction of the time. This will enable exploration of a large number of theories and increase the precision of computed parameters.
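
A minimal GAN training step, sketched in PyTorch under toy assumptions (network sizes, data shapes, and the random stand-in for simulation images are all hypothetical, not the group's model):

```python
# Toy GAN sketch: a generator maps noise to fake "matter maps" and a
# discriminator learns to tell them apart from simulated maps.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 32 * 32))
D = nn.Sequential(nn.Linear(32 * 32, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(128, 32 * 32)          # stand-in for simulation images
# Discriminator step: real maps -> label 1, generated maps -> label 0.
fake = G(torch.randn(128, 64)).detach()
loss_d = bce(D(real), torch.ones(128, 1)) + bce(D(fake), torch.zeros(128, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()
# Generator step: try to fool the discriminator into labeling fakes as real.
fake = G(torch.randn(128, 64))
loss_g = bce(D(fake), torch.ones(128, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```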

“Approximate Quantum Circuit Synthesis Using Block Encodings”

Scalable Solvers Group

Abstract: One of the challenges in quantum computing is the synthesis of unitary operators into quantum circuits with polylogarithmic gate complexity. Exact synthesis of generic unitaries requires an exponential number of gates in general. We propose a novel approximate quantum circuit synthesis technique that relaxes the unitary constraints and trades them for ancilla qubits via block encodings. This approach combines smaller block encodings, which are easier to synthesize, into quantum circuits for larger operators. Due to the use of block encodings, our technique is not limited to unitary operators and can be applied to the synthesis of arbitrary operators. We show that operators which can be approximated by a canonical polyadic expression with a polylogarithmic number of terms can be synthesized with polylogarithmic gate complexity with respect to the matrix dimension.
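
To illustrate the block-encoding idea itself (the standard unitary-dilation construction, not the talk's synthesis algorithm), the sketch below embeds a small non-unitary matrix A with norm at most one into the top-left block of a larger unitary:

```python
# Standard block-encoding (unitary dilation) of a contraction A:
# U = [[A, sqrt(I - A A^†)], [sqrt(I - A^† A), -A^†]] is unitary, and
# projecting the ancilla onto |0> recovers the action of A.
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[0.4, 0.1], [0.2, 0.3]])    # arbitrary operator, ||A|| <= 1
I = np.eye(2)
U = np.block([
    [A,                         sqrtm(I - A @ A.conj().T)],
    [sqrtm(I - A.conj().T @ A), -A.conj().T],
])
assert np.allclose(U @ U.conj().T, np.eye(4))   # U is unitary
assert np.allclose(U[:2, :2], A)                # top-left block encodes A
```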

“High-Performance Multifrontal Solver with Low-Rank Compression”

Scalable Solvers Group

Abstract: In numerous scientific applications, for example linear and nonlinear elasticity problems or electromagnetic diffusion problems, the discretization of the corresponding partial differential equation leads to large sparse linear systems. In this work, we focus on the efficient solution of such systems by means of the multifrontal solver, a direct, factorization-based method. On one hand, direct methods are numerically robust, reliable, and easy to use. On the other hand, they are computationally expensive. The multifrontal approach divides the large sparse matrix into dense sub-blocks of different sizes, so-called frontal matrices. For discretizations of partial differential equations, the dense frontal matrices have been shown to possess low-rank blocks. This talk presents a multifrontal solver that leverages hierarchical and non-hierarchical matrix formats, applying low-rank compression to the dense frontal matrices to reduce the cost of sparse direct solvers without sacrificing their robustness, ease of use, and performance. Previously, the hierarchically off-diagonal butterfly (HODBF) format has been leveraged to accelerate compression and factorization in the multifrontal method, and it turns out to be very efficient for large fronts. In addition, experiments have shown a reduced flop count for the non-hierarchical block low-rank (BLR) compression of medium-sized fronts in comparison with hierarchical rank-structured techniques. In this talk, we discuss a hybrid version that combines the HODBF and BLR formats: we apply HODBF compression to the large fronts of the multifrontal method and BLR compression to the medium-sized fronts. This leads to a further reduction of the overall number of operations and of the factorization time. We apply the resulting factorization as a preconditioner for GMRES, compare it with a number of other common preconditioners, and illustrate its robustness and performance.
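
The low-rank property the solver exploits can be illustrated with a truncated SVD of a single dense block (a toy kernel, not an actual frontal matrix from the solver):

```python
# Sketch of the low-rank idea behind BLR/HODBF compression: a dense block
# coupling two well-separated point clusters is numerically low-rank.
import numpy as np

n, tol = 256, 1e-8
x = np.linspace(0, 1, n)
B = 1.0 / (1.0 + np.abs(x[:, None] - x[None, :] - 2.0))   # smooth kernel

U, s, Vt = np.linalg.svd(B, full_matrices=False)
r = int(np.sum(s > tol * s[0]))            # numerical rank at tolerance
B_lr = (U[:, :r] * s[:r]) @ Vt[:r]         # truncated-SVD approximation

print("numerical rank:", r, "of", n)
print("relative error:", np.linalg.norm(B - B_lr) / np.linalg.norm(B))
print("storage fraction:", 2 * n * r / (n * n))
```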

“Light Simulations for Dark Matter”

NERSC

Abstract: Photon propagation is the most time- and resource-intensive element of the LZ dark matter detector simulations. Here we present a project to replace the existing CPU-based photon propagation model with a containerised, GPU-based implementation built upon Opticks. Opticks is an optical-photon simulation package built on the NVIDIA OptiX ray-tracing engine. In this talk we address the advantages of GPUs in this role and the benefits and challenges of using containerised solutions in an HPC environment. This implementation runs on CoriGPU at NERSC. Running software of this type on CoriGPU requires Shifter images, in which root access is suppressed and the image is maintained as read-only. Containerisation allows for multiple configurations of hardware and requires that data that must be write-enabled be mounted on a writable partition rather than included within the image. By addressing the requirements of portability for CoriGPU, we will also create a solution that is portable to the forthcoming Perlmutter supercomputer. Opticks is a relatively new project under active development, which means it depends on specific versions of external packages. We aim to create a truly portable, containerised solution that can support these requirements.

“An AMR Subglacial Hydrology Model – SUHMO”

Applied Numerical Algorithms Group

Abstract: Observations suggest that the water flow under ice sheets and glaciers can have a strong influence on ice dynamics, particularly through changes in basal water pressure. A comprehensive ice-sheet model should include the effect of basal hydrology, but very often its effects are neglected or very crudely approximated. Subglacial hydrology is challenging to model in part due to the wide separation of spatial and temporal scales between the glacier and the typical subglacial water channel, and also due to the widely varying spatial scales of the subglacial system itself. To tackle this, we have developed an Adaptive Mesh Refinement (AMR) model based on the Chombo library. Building on the model proposed by Sommers et al., the present model naturally transitions from distributed water drainage to channelized flow, adding fine mesh resolution where needed to accurately capture the local degree of channelization, a key component of the coupling between the ice sheet and the subglacial motion. We present examples demonstrating the effectiveness of our approach.
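
As a rough illustration of AMR tagging (the indicator and threshold below are assumptions for illustration, not SUHMO's actual refinement criterion), one can flag cells where the gradient of a channelization-related field is large:

```python
# Toy refinement-tagging sketch: refine where a channelization indicator
# (here the gradient of a water-sheet-like profile) is large.
import numpy as np

nx = 64
x = np.linspace(0, 1, nx)
b = 0.05 + 0.5 * np.exp(-((x - 0.5) / 0.05) ** 2)   # toy channel profile

grad = np.abs(np.gradient(b, x))
tags = grad > 0.5 * grad.max()      # cells to refine on the next AMR level
print("refine", int(tags.sum()), "of", nx, "cells")
```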

Scalable Solvers Group

Abstract: Combinatorial optimization problems on graphs are ubiquitous in scientific computing, and many of them are known to be NP-hard. Because finding exact solutions to these problems is generally intractable, many ad hoc heuristic methods have been proposed over the years. In this talk we focus on the graph partitioning problem, whose heuristic algorithms often show poor scalability, and we study a method based on deep reinforcement learning (DRL) to find approximate solutions. The DRL agent is built in a fashion that resembles the popular multigrid algorithms often used in existing heuristic techniques. This model is able to provide good-quality, balanced partitions on diverse types of graphs, such as those associated with meshes.
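
The two competing objectives such a partitioner must trade off can be made concrete with a short sketch (illustrative only; this is not the DRL agent itself, but edge cut and balance are natural ingredients of its reward):

```python
# Quality metrics for a k-way graph partition: edge cut (communication)
# and balance (load imbalance across parts).
import numpy as np

def edge_cut(edges, part):
    return sum(part[u] != part[v] for u, v in edges)

def balance(part, k=2):
    sizes = np.bincount(part, minlength=k)
    return sizes.max() * k / len(part)   # 1.0 means perfectly balanced

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
part = np.array([0, 0, 1, 1])            # candidate 2-way partition
print("cut:", edge_cut(edges, part), "balance:", balance(part))
```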

“Improving All-to-Many Personalized Communication in Two-Phase I/O”

Scientific Data Management Research

Abstract: As modern parallel computers enter the exascale era, the communication cost of redistributing requests becomes a significant bottleneck in MPI-IO routines. The communication kernel for request redistribution, which has an all-to-many personalized communication pattern for application programs with a large number of noncontiguous requests, plays an essential role in overall performance. This paper explores the available communication kernels for two-phase I/O communication. We generalize the spread-out algorithm to the all-to-many communication pattern of two-phase I/O, reducing the communication straggler effect. Communication throttling methods that reduce contention in asynchronous MPI implementations are adopted to further improve communication performance.
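
A minimal sketch of a spread-out exchange with mpi4py (illustrative assumptions, not the paper's kernel): at step k, rank r sends to rank (r + k) mod P and receives from rank (r - k) mod P, staggering the traffic so no destination is hit by all ranks at once.

```python
# Spread-out exchange sketch: stagger pairwise sends/receives across P
# steps instead of having every rank target the same destinations at once.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nprocs = comm.Get_rank(), comm.Get_size()
outgoing = {dst: f"req {rank}->{dst}" for dst in range(nprocs)}  # toy requests

received = {}
for k in range(nprocs):
    dst = (rank + k) % nprocs
    src = (rank - k) % nprocs
    # Matched pairwise exchange: dst is receiving from us at this step.
    received[src] = comm.sendrecv(outgoing[dst], dest=dst, source=src)
print(rank, received)
```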

“Determining the Best Molecular Dynamics Potential for the Job”

NERSC

Abstract: Advances in computational capability have made it possible to use molecular dynamics (MD) for a variety of applications. Whereas MD simulations were previously limited to a few thousand atoms and a couple of nanoseconds, the development of better atomic potentials and exascale computing facilities now makes it possible to simulate millions of atoms and to study entire proteins and material defects. Some of these simulations are designed to use a specific atomic interaction potential. However, not all simulations depend strongly on a particular potential, and users can choose from a wide variety of available ones. Accuracy is one metric for judging these potentials, but we must also consider how effectively they utilize computational resources. In my talk, I will demonstrate how roofline analysis can be used to select the best MD potential for a given simulation.
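
The roofline model itself reduces to a one-line formula: attainable performance is the minimum of peak compute and arithmetic intensity times memory bandwidth. A toy sketch (all numbers hypothetical, not measured values):

```python
# Roofline sketch: a kernel is capped by memory bandwidth or by peak
# compute, depending on its arithmetic intensity (FLOPs per byte moved).
peak_flops = 7.0e12      # assumed peak FLOP/s of the machine
peak_bw = 9.0e11         # assumed peak memory bandwidth, bytes/s

def attainable(ai):
    return min(peak_flops, ai * peak_bw)

# Hypothetical MD potentials with toy FLOP and data-movement counts:
for name, flops, bytes_moved in [("pair", 2e9, 4e9), ("many-body", 9e9, 3e9)]:
    ai = flops / bytes_moved
    print(f"{name}: AI={ai:.2f}, attainable={attainable(ai):.2e} FLOP/s")
```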

“Quantum Imaginary Time Evolution on the Advanced Quantum Testbed”

Advanced Quantum Testbed

Abstract: Recent progress in quantum hardware has enabled the successful computation of small systems with quantum algorithms. Even though these systems are still easy to simulate with classical computers, they constitute a proof of principle that such simulations are indeed possible. The biggest challenge in running larger systems is the noise present on these devices, which gives them the name Noisy Intermediate-Scale Quantum (NISQ) devices. In this work, we present an implementation of the Quantum Imaginary Time Evolution (QITE) algorithm on the Advanced Quantum Testbed. QITE is a recently proposed quantum algorithm that relies on precise measurements to converge to a given Hamiltonian’s ground state. Its convergence is particularly sensitive to noise and highlights the challenges of using NISQ devices. To run this algorithm properly, we have developed a new noise-mitigation technique. Taking advantage of the noise-tailoring property of randomized compiling, we can extrapolate to the correct ground state and simulate three-qubit systems with high precision. Our approach is simple to implement and can be combined with other noise-mitigation methods such as noise extrapolation and post-selection.
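
The principle behind imaginary time evolution can be illustrated classically (this is not the QITE quantum algorithm itself, just the mathematical idea it approximates on hardware): repeatedly applying exp(-H dt) and renormalizing drives any state with nonzero ground-state overlap toward the ground state of H.

```python
# Classical imaginary-time evolution on a toy two-level Hamiltonian.
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5], [0.5, -1.0]])      # toy Hamiltonian
psi = np.array([1.0, 1.0]) / np.sqrt(2)      # initial state
step = expm(-0.1 * H)                        # imaginary-time propagator

for _ in range(200):
    psi = step @ psi
    psi /= np.linalg.norm(psi)               # renormalize each step

print("converged energy:", psi @ H @ psi,
      "exact ground state:", np.linalg.eigvalsh(H)[0])
```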

“Metagenomic Reads Classification Using Graph Neural Networks”

Performance and Algorithms Research

Abstract: Metagenomics data contain sequencing reads from multiple species and provide critical insights into the anatomy of microbial communities found in different environments. Classifying sequencing reads into different species is a key first step in numerous metagenomic analysis pipelines. Existing tools perform classification based only on the contents of the reads and do not take into account the overlap relationships among them. As a result, these tools fail to classify a large portion of the input reads and thereby suffer from very low recall. In this paper, we present MetaGNN, which uses not just the contents of the reads but also the overlap relationships among them, in the form of an overlap graph. We use graph neural networks (GNNs), which exploit the connectivity information in the overlap graph, to classify reads. We apply the GNN in a semi-supervised manner: we first train it on a portion of the nodes in the overlap graph using ground-truth labels and then classify the remaining nodes. In our evaluation, MetaGNN achieves both precision and recall up to 98.9% across three metagenomic datasets with varying numbers of reads and species, whereas other tools achieve only 50–60% recall.
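
A minimal sketch of semi-supervised node classification on a graph (toy graph, features, and sizes; this is not MetaGNN itself): train a simple graph-convolution model on a few labeled nodes, then read off predictions for the rest.

```python
# Semi-supervised GCN-style sketch in plain PyTorch: aggregate neighbor
# features through a normalized adjacency, train only on labeled nodes.
import torch

N, F, C = 6, 8, 2                        # reads (nodes), features, species
A = torch.eye(N)                         # adjacency with self-loops
for u, v in [(0, 1), (1, 2), (3, 4), (4, 5)]:
    A[u, v] = A[v, u] = 1.0
d = A.sum(1)
A_hat = A / torch.sqrt(d[:, None] * d[None, :])   # symmetric normalization

X = torch.randn(N, F)                    # toy read-content features
y = torch.tensor([0, 0, 0, 1, 1, 1])
train = torch.tensor([True, False, False, True, False, False])

W1 = torch.nn.Parameter(0.1 * torch.randn(F, 16))
W2 = torch.nn.Parameter(0.1 * torch.randn(16, C))
opt = torch.optim.Adam([W1, W2], lr=0.05)

for _ in range(200):                     # loss uses the labeled subset only
    logits = A_hat @ torch.relu(A_hat @ X @ W1) @ W2
    loss = torch.nn.functional.cross_entropy(logits[train], y[train])
    opt.zero_grad(); loss.backward(); opt.step()

print(logits.argmax(1).tolist())         # predictions for all nodes
```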

“Techniques to Optimize Applications for Future Hardware”

CCMC

Abstract: The performance of any scientific application depends on efficient implementations of mathematical libraries on a given system. For example, scientific applications like the plane-wave-based density functional theory approach for electronic structure calculations use Fourier transforms, dense linear algebra (orthogonalization), and sparse linear algebra (non-local projectors in real space) library calls. While most libraries offer efficient implementations of each specific mathematical kernel, the division of the overall computation into separate domains prevents optimizing the performance of the task as a whole. Optimizations such as improving data locality across memory-bound operations, like the Fourier transforms and the sparse linear algebra, cannot be done easily. In this work we show that, by expressing the Fourier transform as an operation on high-dimensional tensors, optimization opportunities can be readily exposed and exploited. We outline a systematic way of merging the Fourier transforms with the linear algebra computations, reducing data movement to main memory and improving on-chip data locality. We show that, compared to conventional implementations, significant performance improvements can be obtained on both CPU and GPU systems. In addition, we argue that while this work uses density functional theory on CPUs and GPUs as an example, the approach of merging the Fourier computation with the linear algebra is generally applicable to other applications and can be important for code optimization on future hardware systems.
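
The merging idea can be illustrated in a few lines (a toy sketch, not the talk's implementation): because a 1D FFT stage is itself a linear map along one tensor axis, it can be folded into an adjacent linear-algebra step as a single contraction.

```python
# Fusing one FFT stage with a linear-algebra step along the same axis.
import numpy as np

nx, ny, nz = 8, 8, 8
psi = np.random.rand(nx, ny, nz)          # toy wavefunction on a grid
P = np.random.rand(nz, nz)                # toy linear-algebra step along z

# Staged: full 3D FFT, then apply P along the z axis.
staged = np.tensordot(np.fft.fftn(psi), P, axes=([2], [1]))

# Merged: fold the z-axis FFT stage into the linear algebra, so that axis
# is traversed once instead of twice.
Fz = np.fft.fft(np.eye(nz), axis=0)       # 1D DFT expressed as a matrix
merged = np.tensordot(np.fft.fftn(psi, axes=(0, 1)), P @ Fz, axes=([2], [1]))

print(np.allclose(staged, merged))        # same result, fused pass
```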

“A New Computational Approach for Modeling Nanoscale Electrokinetic Flows”

Center for Computational Sciences and Engineering

Abstract: Scientists and engineers are developing increasingly complex routes to control nanofluids by manipulating applied electric fields. Such transport is often characterized by intriguing electrokinetic phenomena such as electroconvection and electroosmosis. In this talk, I will describe a numerical methodology for predicting the nanoscale transport of electrolytes driven by electric fields within various metallic and dielectric geometries. In this approach, an immersed-boundary particle representation of ions is combined with a continuum description of the solvent. Furthermore, the effect of thermal fluctuations is modeled through a fluctuating hydrodynamics formulation, which is critical for correctly predicting electrolyte dynamics at such small scales. Additional care is also taken to properly capture the interaction of particles with walls through corrections to the ionic mobility. This talk will describe the mathematical and computational approaches required to couple these complex physical phenomena. I will describe simulations of a nanocapacitor modeled as an ionic fluid confined between oppositely charged plates, along with comparisons to theoretical predictions. Finally, I will also describe nonequilibrium simulations of transient and steady electroosmotic flow induced by applying an external electric field to an electrolyte confined between charged walls.
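
As a rough illustration of the fluctuating-hydrodynamics ingredient (a schematic 1D toy, not the talk's electrokinetic solver), one can add a stochastic flux to a conservative diffusion update so thermal fluctuations enter with a density-dependent amplitude:

```python
# Toy 1D fluctuating-diffusion sketch: deterministic flux plus a
# stochastic face flux, applied in a conservative (mass-preserving) update.
import numpy as np

n, dx, dt, D = 64, 1.0, 0.1, 1.0
rho = np.ones(n)                          # density on a periodic grid
rng = np.random.default_rng(0)

for _ in range(1000):
    flux = -D * (np.roll(rho, -1) - rho) / dx           # deterministic
    flux += np.sqrt(2 * D * np.clip(rho, 0, None) / (dx * dt)) \
            * rng.standard_normal(n)                    # stochastic flux
    rho -= dt * (flux - np.roll(flux, 1)) / dx          # conservative update

print("mass conserved:", np.isclose(rho.sum(), float(n)))
```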

“Mitigating Noise on Quantum Computers”

Computational Chemistry, Materials, & Climate Group

Abstract: Existing quantum computers are not fault tolerant. They are sensitive to noise and their output is affected by errors. Noise and errors represent a significant obstacle in obtaining accurate results from these devices. Several methods have been developed to address various error sources. In this talk I will review the dominant types of errors and present methods to mitigate them. The improvements demonstrate that error mitigation techniques are necessary for extracting reliable results from current and near-term quantum devices.
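
One widely used technique of this kind is zero-noise extrapolation; below is a minimal sketch with hypothetical measurement values (the talk reviews several mitigation methods, not necessarily this formulation):

```python
# Zero-noise extrapolation sketch: measure an observable at amplified
# noise levels, then extrapolate back to the zero-noise limit.
import numpy as np

scales = np.array([1.0, 2.0, 3.0])         # noise amplification factors
values = np.array([0.82, 0.67, 0.55])      # hypothetical noisy <O> values

coeffs = np.polyfit(scales, values, 1)     # linear (Richardson-style) fit
print("zero-noise estimate:", np.polyval(coeffs, 0.0))
```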

“Modeling Non-equilibrium Phase Transition in Complex Fluids”

Center for Computational Sciences and Engineering

Abstract: Colloidal suspensions are a type of complex fluid in which microscopically small particles are suspended in a Newtonian solvent. They have broad applications in biological soft matter, industrial coatings, food processing, and materials engineering. Understanding the phase transitions of colloidal suspensions is a key step toward understanding how they can be used in these applications. They undergo well-defined phase transitions, such as crystallization, when changing from liquid-like to dense concentrations. However, crystallization can fail upon a sufficiently fast quench, and the system will “vitrify”: it is forced out of equilibrium and solidifies while retaining the amorphous structure of a liquid. This failure to crystallize is called the glass transition, and it cannot be explained by equilibrium thermodynamic theory; much effort has therefore been devoted to understanding what type of transition it is and how to unambiguously diagnose and predict it. Despite decades of study, there is still no agreed-upon mechanistic description of the glass transition, and it remains a grand challenge for condensed-matter science. To shed light on these questions, we study the glass transition via large-scale dynamic simulation of a hard-sphere colloidal system, where the transition is triggered by a particle-size “quench” that rapidly increases the particle concentration. By varying the quench depth and post-quench wait time, we observe the physical aging of the particle dynamics, a signature of an out-of-equilibrium material. Moreover, for the first time we find that, structurally, colloidal glasses are more than a very dense and viscous liquid: they are characterized by a heterogeneous structure that is not observed in a regular liquid. Such heterogeneous structures serve as a driving force for the non-equilibrium aging process. This mechanistic view can help us better design non-equilibrium materials and deepen our understanding of some biological processes.

“Scalable Computational Modeling of Neutrino Quantum Kinetics in Astrophysics”

Center for Computational Sciences and Engineering

Abstract: Current computational astrophysics studies of core-collapse supernovae and neutron star mergers help us identify the origins of heavy elements in the universe. These are also important events for multi-messenger astronomy, generating electromagnetic radiation, copious neutrinos, and gravitational waves. However, connecting these observables to the nuclear physics at the heart of these events relies on extensive numerical simulations that must include processes spanning a wide range of spatial and temporal scales. Neutrino flavor transformations are particularly important to model accurately because neutrino interactions with matter can depend sensitively on their quantum flavor states. Furthermore, because of neutrino self-interactions, fast flavor instabilities can emerge under realistic conditions, making the nonlinear evolution of the neutrino quantum flavor state very computationally challenging to model. I will present a new, scalable, particle-in-cell simulation code for solving the mean-field neutrino quantum kinetic equations in three dimensions with arbitrary angular resolution. I will also discuss how this new code enables us to simulate the neutrino fast flavor instability on current pre-exascale supercomputers, and the wider implications for astrophysics.
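
The particle-in-cell pattern underlying such a code can be sketched in one dimension (a toy deposition step with linear weights; this is not the neutrino code itself):

```python
# Toy 1D particle-in-cell deposition: scatter particle weights onto a
# periodic grid with linear (cloud-in-cell) interpolation.
import numpy as np

nx, L = 16, 1.0
dx = L / nx
pos = np.random.default_rng(1).uniform(0, L, 1000)   # particle positions
weight = np.full(pos.size, 1.0 / pos.size)           # particle weights

u = pos / dx
i = u.astype(int) % nx                 # left cell index
f = u - u.astype(int)                  # fractional offset within the cell

grid = np.zeros(nx)
np.add.at(grid, i, weight * (1 - f))   # deposit onto the left cell
np.add.at(grid, (i + 1) % nx, weight * f)   # and the right cell
print("total deposited:", grid.sum())  # conserves the summed weights
```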

“Comparing the PFASST, MGRIT and Parareal Methods”

Scalable Solvers Group

Abstract: As supercomputers continue to increase in core count, certain classical algorithms must be replaced with algorithms that can scale. For solving time-dependent ordinary and partial differential equations, classical methods are highly sequential: the spatial portion of the equation is solved in parallel, whereas the temporal solve is sequential. If many time steps are required, these classical methods can perform poorly on massively parallel machines. Consequently, parallel-in-time methods, which have been studied since the 1960s, have gained renewed interest. Three of the most popular parallel-in-time methods today are the Parallel Full Approximation Scheme in Space and Time (PFASST), Multigrid Reduction in Time (MGRIT), and Parareal. In this talk, the first comparison of these three methods is presented, with all three implemented in the LibPFASST package. The comparison examines both the parallel performance and the mathematical convergence of the methods on a variety of test problems.
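
Of the three methods, Parareal is the simplest to sketch (a minimal illustrative version for y' = λy, not the LibPFASST implementation used in the talk): a cheap coarse propagator G iteratively corrects an accurate fine propagator F, and the fine solves across time slices can run in parallel.

```python
# Minimal Parareal iteration for y' = lam * y with backward-Euler
# propagators: a serial coarse sweep plus parallelizable fine solves.
import numpy as np

lam, T, N = -1.0, 2.0, 10
dT = T / N

def G(y, dt):                    # coarse: one backward-Euler step
    return y / (1 - lam * dt)

def F(y, dt, m=20):              # fine: m smaller backward-Euler steps
    for _ in range(m):
        y = y / (1 - lam * dt / m)
    return y

U = np.zeros(N + 1); U[0] = 1.0
for n in range(N):               # serial coarse pass to initialize
    U[n + 1] = G(U[n], dT)

for k in range(5):               # Parareal corrections
    Fvals = [F(U[n], dT) for n in range(N)]   # parallel across slices
    Unew = U.copy()
    for n in range(N):           # sequential correction sweep
        Unew[n + 1] = G(Unew[n], dT) + Fvals[n] - G(U[n], dT)
    U = Unew

print("parareal:", U[-1], "exact:", np.exp(lam * T))
```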

“Exascale-Enabled Physical Modeling for Next-Generation Microelectronics”

Center for Computational Sciences and Engineering

Abstract: As Moore’s law approaches its end, the microelectronics community faces increasing challenges in adapting new materials and technologies to develop next-generation devices. These include, but are not limited to, spintronic technologies using magnetic materials and quantum information processing technologies using superconducting materials, electron spins, and so on. We are developing a new multiscale software framework for the physical modeling of wave signals, with the flexibility for additional physics coupling, targeted at these new microelectronic devices. We also use cutting-edge, exascale-ready software frameworks to accelerate traditional electrodynamics codes on modern GPU/manycore HPC architectures. Our platform will offer an algorithmically agile, open-source software framework amenable to customization of existing algorithms and incorporation of new physics. I will demonstrate the model’s capabilities with realistic devices such as spintronic devices.
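
As a rough illustration of the kind of electrodynamics kernel such codes accelerate (a textbook 1D FDTD toy, an assumption rather than the project's framework):

```python
# Minimal 1D FDTD (Yee) sketch: staggered leapfrog updates of E and H.
import numpy as np

n, steps, c = 200, 300, 0.5          # grid size, time steps, Courant number
E = np.zeros(n)
H = np.zeros(n - 1)

for t in range(steps):
    H += c * np.diff(E)              # update H from the spatial change in E
    E[1:-1] += c * np.diff(H)        # update E from the spatial change in H
    E[n // 2] += np.exp(-((t - 30) ** 2) / 100)   # soft Gaussian source

print("field energy:", (E**2).sum() + (H**2).sum())
```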
