Face-to-Face Discussion Helps Fusion Scientists Solve Interface Problem
January 1, 2005
Sometimes, $14 can go a long way. For the price of a train ticket from Manhattan to Princeton, CRD’s Sherry Li met with scientists at the Princeton Plasma Physics Lab (PPPL), and together they solved the problems that were keeping a new fusion code from running fully in parallel.
Li, a member of the Scientific Computing Group and one of the key developers of the SuperLU library of solvers, had been consulting with Steve Jardin’s group at PPPL for several months as the fusion researchers worked to develop a newer, faster version of their legacy code known as M3D.
M3D used an explicit method to solve partial differential equations, an approach that required many small time steps and therefore took longer to run. The new version, called M3D-C1, uses an implicit scheme that allows much larger time steps, so it needs fewer steps to reach the solution. “However, the matrix is much more difficult to solve, and many solvers cannot solve it,” Li said.
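The trade-off between the two approaches can be illustrated with a toy problem. The sketch below is not M3D or M3D-C1 code; it assumes a one-dimensional heat equation with zero boundary values, and the function names and the parameter r (time step divided by grid spacing squared) are chosen only for illustration. The explicit update is cheap but becomes unstable once r exceeds 0.5, while the implicit update stays stable at large r at the cost of solving a linear system on every step.

```python
def explicit_step(u, r):
    """One forward-Euler step of u_t = u_xx; boundary values stay at zero."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def implicit_step(u, r):
    """One backward-Euler step: solve (I - r*L) u_new = u for the interior points.

    The system is tridiagonal, so the Thomas algorithm is enough here.
    Large multi-dimensional problems need a general sparse solver instead.
    """
    n = len(u) - 2            # number of interior unknowns
    off = -r                  # sub- and super-diagonal entries
    diag = 1.0 + 2.0 * r      # main diagonal entries
    rhs = u[1:-1]
    c = [0.0] * n             # modified super-diagonal
    d = [0.0] * n             # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):     # forward elimination
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):   # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return [0.0] + x + [0.0]


def run(step, r, steps, points=21):
    """Advance a single spike in the middle of the grid and return the result."""
    u = [0.0] * points
    u[points // 2] = 1.0
    for _ in range(steps):
        u = step(u, r)
    return u


def peak(u):
    """Largest absolute value on the grid."""
    return max(abs(v) for v in u)


# r = 5.0 is ten times the explicit stability limit of 0.5.
print("explicit, r=0.4:", peak(run(explicit_step, 0.4, 20)))
print("explicit, r=5.0:", peak(run(explicit_step, 5.0, 20)))
print("implicit, r=5.0:", peak(run(implicit_step, 5.0, 20)))
```

In this sketch the explicit scheme is only usable with the small step, whereas the implicit scheme takes the large step safely but must solve a matrix equation each time. In a full three-dimensional simulation that matrix is large and sparse, which is the kind of system a solver library such as SuperLU is designed to handle.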
Several months ago, the team began using SuperLU as their solver. Although the fusion group computes on Seaborg, the IBM supercomputer at NERSC, they had acquired a 16-processor SGI Altix system to do local development of their new code.
As problems arose, Li and the PPPL team tried to resolve them by email. The team even sent her their code, but she could not find the sticking points just by reviewing it.
“It was getting more complicated – they know the physics part and I know the solver part,” Li said. “We finally decided it might be better to sit down in person and look over the code.”
So, while attending the 16th International Conference on Domain Decomposition Methods at NYU’s Courant Institute in January, Li slipped away for a day and took the train to Princeton.
“They educated me more on how their code worked and we looked at the interface,” she said. “M3D is written in Fortran 90 and my code is in C, so we needed to build some new wrappers.”
In the process of debugging in real time, they were able to identify a word-type inconsistency in the interface that caused the SGI implementation to fail for the largest problem sizes.
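The article does not say which data types were mismatched, so the following is only a hypothetical illustration of why a word-size inconsistency across a language boundary can go unnoticed on small runs and surface on large ones. It assumes one side writes a 64-bit integer while the other reads the same memory as a 32-bit integer, on a little-endian machine; the function name is invented for the example.

```python
import struct


def read_as_int32(value):
    """Store value as a 64-bit integer, then read back only the first 4 bytes.

    This mimics a caller and callee that disagree on integer width.
    Little-endian byte order is assumed (explicit '<' in the format strings).
    """
    raw = struct.pack("<q", value)            # 8 bytes written by one side
    return struct.unpack("<i", raw[:4])[0]    # 4 bytes read by the other side


# A small count survives the mismatch because it fits in the low 32 bits.
print(read_as_int32(1000))

# A count above 2**31 - 1 does not: the reader sees a different number.
print(read_as_int32(3_000_000_000))
```

Under these assumptions, small values pass through unchanged, which is why such a mismatch can stay hidden until a quantity such as a matrix dimension or nonzero count grows past the 32-bit range.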
“Even though you have been very responsive via email during the last few months, there was really no substitute for your actually being here to witness and diagnose the problems we were having,” Jardin wrote to Li after their meeting. “Thank you so much for making a special trip from your conference to help us debug the implementation of your distributed SuperLU software on our local SGI Altix. This has really made a big impact. As a result of your visit, not only do we understand your SuperLU-dist much better, but we are now able to run our largest jobs in a fully parallel mode, with even better than ‘ideal’ scaling. This will really help us in our code-development activities for the new M3D-C1 code, and will also make our use of NERSC for this code much more productive.”
While the immediate results demonstrate the value of an interpersonal collaborative approach, the success is also an example of how DOE’s SciDAC program is meeting its goal of developing advanced tools through collaboration. The SuperLU development is partly funded by the TOPS SciDAC project led by David Keyes of Columbia University, while M3D-C1 is funded by the fusion CEMM SciDAC project, which is led by Jardin.