Two Lawrence Berkeley National Laboratory (Berkeley Lab) scientists, Gunther Weber and Oliver Rübel, along with their international collaborator, Hamish Carr from the University of Leeds, were recently honored with a best paper award at the IEEE Large Scale Data Analysis and Visualization (LDAV) symposium. Their paper, “Distributed Hierarchical Contour Trees,” describes the development of a powerful data analysis tool. Their distributed data structure lets scientists and researchers capture and describe complex changes in the topology of their data, and it removes a key scalability roadblock: computing the entire contour tree on a single machine whose memory is too small to store it.

“We implemented a distributed algorithm for computing a hierarchical contour tree with good scaling efficiency and significantly improved performance over the existing state of the art,” said Gunther Weber, staff scientist in the Scientific Data Division. “The work is not over, as effective use of the contour tree for analytic purposes requires further computations, such as geometric measures and branch decompositions. We expect to publish further results on these tasks in the future, together with application studies of contour tree analysis at scale.”

Topological analysis helps researchers make sense of data from numerical simulations. Some simulations produce so much data that a direct visual representation would be too busy and noisy to interpret. The team’s tool analyzes the data automatically and simplifies it so that both local and global variations stand out. Much like a topographic map, a simulated scalar field has high points and low points; a contour tree records how contours appear, merge, and split as the data value changes, summarizing the field as a set of arcs called superarcs. In the distributed setting, the data is divided into blocks, and only contours that cross boundaries between blocks require coordination, so the tool limits the communication cost of the contour tree computation to the complexity of the block boundaries rather than the size of the entire data set.
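For a flavor of the underlying computation, the short sketch below is a plain, serial Python illustration of one half of the idea (a merge tree), not the team’s distributed hierarchical algorithm or code: it sweeps a small one-dimensional scalar field from its lowest value to its highest and records where branches of the tree are born (local minima) and where they merge (saddles). All function and variable names are illustrative assumptions.

    # Minimal serial sketch of merge-tree construction on a 1D scalar field.
    # Illustrative only; not the authors' distributed hierarchical algorithm.
    def merge_tree_events(values):
        """Sweep values from low to high; report where components are born and merge."""
        n = len(values)
        order = sorted(range(n), key=lambda i: values[i])   # process lowest values first
        parent = {}                                         # union-find forest over swept indices

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]               # path compression
                i = parent[i]
            return i

        events = []
        for i in order:
            swept = [j for j in (i - 1, i + 1) if 0 <= j < n and j in parent]
            parent[i] = i
            roots = {find(j) for j in swept}
            if not roots:
                events.append((i, "birth"))                 # local minimum: a new branch starts
            elif len(roots) > 1:
                events.append((i, "merge"))                 # saddle: two branches join here
            for r in roots:
                parent[r] = i                               # attach existing components to i
        return events

    # Example: three valleys separated by two ridges.
    field = [1.0, 3.0, 0.5, 4.0, 2.0]
    for idx, kind in merge_tree_events(field):
        print(f"value {field[idx]:.1f} at position {idx}: {kind}")

A full contour tree combines a sweep like this with a matching sweep from the opposite direction; the contribution described above is to carry out such computations on data split into blocks across many compute nodes while exchanging only boundary information.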

“Designing a novel algorithm to efficiently use modern parallel compute architectures (e.g., multi-core CPUs and many-core GPUs) while minimizing communication cost across compute nodes is essential to enable users to effectively utilize modern supercomputer resources at the Exascale,” said Oliver Rübel, staff scientist in the Scientific Data Division.

This research was supported by the Exascale Computing Project, a collaborative effort of the U.S. Department of Energy (DOE) Office of Science and the National Nuclear Security Administration. This research also used computing resources at the National Energy Research Scientific Computing Center (NERSC).

About Computing Sciences at Berkeley Lab

High performance computing plays a critical role in scientific discovery. Researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab's Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.




