Berkeley Lab Team Achieves 10.6 Gigabits/second Data Throughput in 10-Gigabit Ethernet Test
July 3, 2002
Contact: Jon Bashor, 510-486-5849, firstname.lastname@example.org
BERKELEY, CA – Although there has been a lot of discussion recently about 10-Gigabit Ethernet capability, actually achieving that level of performance in the real world has been difficult. Until now.
Last week, a team from Lawrence Berkeley National Laboratory, which operates some of the world’s most powerful computing, data storage and networking resources for the U.S. Department of Energy, joined forces with Force10 Networks (switches), SysKonnect (network interfaces), FineTec Computers (clusters), Quartet Network Storage (on-line storage) and Ixia (line rate monitors) to assemble a demonstration system. The system runs a true scientific application to produce data on one 11-processor cluster, then sends the resulting data across a 10-Gigabit Ethernet connection to another cluster, where it is rendered for visualization.
Though held in the US, the demo included contributions from Europe – the network interface cards were made by SysKonnect, headquartered in Ettlingen, Germany, and the scientific application used for the demonstration was the Cactus simulation code developed by the Numerical Relativity group led by Ed Seidel at the Albert Einstein Institute/Max Planck Institute in Potsdam, Germany.
The result? The team was able to sustain 10.6 gigabits per second aggregated across two 10-gigabit interfaces, 9.8 gigabits per second on one interface and 960 megabits per second on the other. The measurements were taken from fiber optic taps using Ixia 400 performance analyzers with 10-Gigabit Ethernet interfaces. A total of 58 terabytes of data were transferred over 12 hours of pre-demonstration testing and the demo itself.
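As a rough consistency check on these figures, the short Python sketch below converts the reported transfer volume and duration into an average rate. It assumes decimal units (1 terabyte = 10^12 bytes) and that the 12 hours covers the whole transfer; neither is stated above, and the function name is purely illustrative.

```python
def average_gbps(terabytes: float, hours: float) -> float:
    """Average transfer rate in gigabits per second.

    Assumes decimal units: 1 terabyte = 1e12 bytes, 1 gigabit = 1e9 bits.
    """
    bits = terabytes * 1e12 * 8      # bytes -> bits
    seconds = hours * 3600           # hours -> seconds
    return bits / seconds / 1e9      # bits per second -> gigabits per second


# Figures quoted in the release: 58 terabytes over 12 hours.
rate = average_gbps(58, 12)
print(f"{rate:.2f} Gb/s")
```

Under these assumptions the average works out to roughly 10.7 gigabits per second, in the same range as the sustained rate reported above. The small difference is plausibly due to rounding in the 58-terabyte and 12-hour figures.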
With the IEEE’s adoption of Standard 802.3ae for 10-Gigabit Ethernet equipment in June, the speed of Ethernet operations has increased by an order of magnitude – at least on paper. But achieving that 10-fold increase in actual Ethernet performance remains a challenge that can be met only with leading-edge equipment and expertise.
The system was built as a prelude to Berkeley Lab’s entry into the High-Performance Bandwidth Challenge at the SC2002 conference on high-performance computing and networking, to be held in November in Baltimore, Maryland. Berkeley Lab teams have won the High-Performance Bandwidth Challenge for two consecutive years. At the SC2001 conference held last November, the LBNL team took top honors, moving data across the network at a sustained rate of 3.3 gigabits per second in a live computational steering/visualization demonstration involving the Albert Einstein Institute's "Cactus" simulation code (www.cactuscode.org) and Berkeley Lab’s Visapult parallel visualization system (vis.lbl.gov/RDProjects/visapult/index.html).
The system was originally assembled to show real-world applications of 10-Gig E capability at a conference scheduled for June. However, the conference was delayed, and the Berkeley Lab team decided to put on a public demonstration before taking the system apart and returning the loaned equipment to the vendors.
“The demo turned out to be really successful. Force10 loaned us the switches, FineTec donated enough computers to make it interesting and we worked with SysKonnect to get very high performance from their network interfaces,” said network engineer Mike Bennett. “Quartet provided the network storage for storing the data to be visualized and Ixia supplied the monitoring equipment. The result is we proved that 10-Gig E is a reality, not just a bunch of back-of-the-envelope calculations.”
According to Bennett, most demonstrations of 10-Gig E to date have been done to showcase interoperability of components made by different vendors, which is the aim of the IEEE standard. That standard doesn’t mean, however, that a system will achieve peak performance.
“What we are demonstrating is that it does work in the real world,” Bennett said.
John Shalf, a member of the Berkeley Lab Visualization Group, said that 10-Gig E capability is important for scientific applications.
“Codes like Cactus can easily consume an entire supercomputer, like the 3,328-processor IBM SP at our National Energy Research Scientific Computing Center, or NERSC. The Cactus team ran the code at NERSC for 1 million CPU-hours, or 114 CPU-years, performing the first-ever simulations of the inspiraling coalescence of two black holes,” Shalf said.
A high-bandwidth connection allows users to keep up with the huge data production rates of such simulations – about a terabyte per time step – and ensure that the code is running properly. Otherwise, mistakes may not be detected until the run has finished and has wasted many computer cycles generating bad data.
Remote monitoring and visualization require a system that can provide visualization capability over wide area network connections without compromising interactivity or simulation performance. The team used Visapult, developed several years ago by Wes Bethel of LBNL’s Visualization Group for DOE’s Next Generation Internet/Combustion Corridor project. Visapult lets users perform interactive volume visualization of remotely computed datasets from a desktop workstation, without downsampling the original data. It does so by employing the same massively parallel, distributed-memory computational model as the simulation code, which allows it to keep up with the simulation's data production rate. It also uses high-performance networking to distribute its computational pipeline across a WAN, providing a remote visualization capability that is decoupled from the cycle time of the simulation code itself.
To achieve the 10.6 gigabits per second performance, George “Chip” Smith of the team had to work with SysKonnect to overcome a problem resulting from running Linux on the clusters. “When you run Linux with the SysKonnect card, the libraries in the kernel for the SysKonnect cards have a default behavior and run with an average line rate of 600-700 megabits per second,” Smith said. “Working with SysKonnect, I was able to change one of the libraries in the kernel and using a recent virtual Ethernet interface module, I was able to get 950 to 1000 megabits per second off the single interfaces. This enabled us to run this demonstration with one-third fewer machines than it would have required without the work on the kernel.”
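The "one-third fewer machines" figure can be illustrated with a small calculation. The Python sketch below assumes each machine drives a single interface at a fixed rate and uses the midpoints of the quoted ranges (650 and 975 megabits per second) against a 10.6 gigabit per second target. The midpoints, the one-interface-per-machine model and the function name are illustrative assumptions, not figures supplied by the team.

```python
import math


def machines_needed(target_mbps: float, per_machine_mbps: float) -> int:
    """Smallest whole number of machines whose combined rate meets the target."""
    return math.ceil(target_mbps / per_machine_mbps)


TARGET_MBPS = 10_600      # aggregate rate reported for the demo
DEFAULT_RATE_MBPS = 650   # assumed midpoint of the quoted 600-700 Mb/s range
TUNED_RATE_MBPS = 975     # assumed midpoint of the quoted 950-1000 Mb/s range

before = machines_needed(TARGET_MBPS, DEFAULT_RATE_MBPS)
after = machines_needed(TARGET_MBPS, TUNED_RATE_MBPS)
reduction = 1 - after / before

print(f"default driver: {before} machines")
print(f"tuned driver:   {after} machines")
print(f"reduction:      {reduction:.0%}")
```

With these assumed rates, the count drops from 17 machines to 11, a reduction of about 35 percent, which is consistent with the one-third saving Smith describes.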
Bennett said the main obstacle to achieving even better performance wasn’t the lack of bandwidth, but rather the lack of resources, including the number of machines in each cluster.
"One of the most exciting things is that it scales. If we would have had 50 boxes in the cluster, we could have delivered 50 gigabits," Bennett said. "Now that we've done 10 Gig, it's time to start looking at 100."
About Computing Sciences at Berkeley Lab
The Computing Sciences Area at Lawrence Berkeley National Laboratory provides the computing and networking resources and expertise critical to advancing Department of Energy Office of Science research missions: developing new energy sources, improving energy efficiency, developing new materials, and increasing our understanding of ourselves, our world, and our universe.
Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 13 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.