CRD’s Brian Tierney Uses Expertise to Improve Performance of Networked Systems
January 1, 2004
A good mechanic doesn’t just keep a car running, but has a toolkit of expertise and expert tricks to get the most performance out of a vehicle. And so it is with networks and computers, especially when distributed computing resources are linked by networks to perform as an integrated system.
Brian Tierney of the Distributed Systems Department has built a reputation as an expert in finding new ways to improve the performance of distributed systems. He’s shared his talents with DARPA, Internet2, CERN and other national labs.
Next up on his agenda is PFLDnet, the Second International Workshop on Protocols for Fast Long-Distance Networks (http://www-didc.lbl.gov/PFLDnet2004/index.htm), which Brian is co-chairing with Les Cottrell of SLAC on February 16-17. At the heart of the workshop is a known problem and four potential solutions.
The problem is with TCP, the Transmission Control Protocol, which ensures that data packets are delivered as sent. While TCP was one of the key elements of the Internet’s initial success, it has now become a speed bump. TCP just doesn’t scale well to really fast, long-distance networks: high bandwidth and high latency overwhelm the protocol’s congestion control algorithm. A number of techniques have been developed to deal with the problem, and Tierney himself has created a Web page with 100 tips and tricks for tuning TCP to improve performance (go to http://www-didc.lbl.gov/TCP-tuning/TCP-tuning.html).
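A rough back-of-the-envelope calculation helps show why high bandwidth combined with high latency strains standard TCP. The Python sketch below computes the bandwidth-delay product (the amount of data that must be in flight to keep the path full) and estimates how long standard TCP congestion avoidance, which adds roughly one packet to the window per round trip, would take to regain full speed after a single loss halves the window. The 10 Gbps link, 100 ms round-trip time and 1,500-byte packet size are illustrative assumptions, not figures from the workshop, and the function names are hypothetical.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep the path full."""
    return bandwidth_bps * rtt_s / 8


def full_window_packets(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Congestion window, in packets, needed to fill the path."""
    return bandwidth_delay_product_bytes(bandwidth_bps, rtt_s) / packet_bytes


def recovery_time_s(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Approximate time to regain a full window after one loss.

    Assumes the textbook behavior: the window is halved on loss,
    then grows by one packet per round trip.
    """
    window = full_window_packets(bandwidth_bps, rtt_s, packet_bytes)
    round_trips_needed = window / 2
    return round_trips_needed * rtt_s


if __name__ == "__main__":
    # Assumed example path: 10 Gbps with a 100 ms round-trip time.
    bandwidth_bps = 10_000_000_000
    rtt_s = 0.1

    bdp_mb = bandwidth_delay_product_bytes(bandwidth_bps, rtt_s) / 1e6
    window = full_window_packets(bandwidth_bps, rtt_s)
    minutes = recovery_time_s(bandwidth_bps, rtt_s) / 60

    print(f"Data in flight needed: {bdp_mb:.0f} MB")
    print(f"Window needed: {window:.0f} packets")
    print(f"Recovery after one loss: about {minutes:.0f} minutes")
```

Under these assumptions a single lost packet costs more than an hour of reduced throughput, which gives a sense of why tuning alone runs out of headroom on such paths.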
“When trying to do Grid troubleshooting, one of the first things people do is blame the network – and that’s true less than half the time,” Tierney said. “You can tune the network in terms of TCP, but we’ve exhausted the limits of tuning. For very high-speed networks, a new protocol is really needed.”
The goal of the workshop is to try to come up with a consensus on how to proceed. On the agenda are four approaches, ranging from minor tweaks of the current TCP to a complete overhaul. The tactics range from enhancing the current TCP congestion response algorithms, through congestion avoidance-based techniques and building reliability on top of UDP (User Datagram Protocol, or the Unreliable Data Protocol, depending on your perspective), to radically new protocols such as XCP (eXplicit Control Protocol), which requires router support so that routers inform the sender of the level of congestion at the bottleneck, allowing the sender to back off before congestion (loss) occurs.
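The first of these tactics, adjusting how TCP responds to congestion, can be illustrated with a toy model. The Python sketch below counts how many round trips a sender needs to climb back to a target window after one loss, given two knobs: how many packets are added per round trip and what fraction of the window is kept after a loss. The function name, the parameter values and the 83,333-packet target (roughly a 10 Gbps path with a 100 ms round trip and 1,500-byte packets) are assumptions for illustration only; this is not a model of any specific proposal on the workshop agenda.

```python
def round_trips_to_recover(target_window, increase_per_rtt, keep_fraction):
    """Count round trips needed to regain target_window after one loss.

    On loss the window shrinks to target_window * keep_fraction, then
    grows by increase_per_rtt packets every round trip.
    """
    window = target_window * keep_fraction
    round_trips = 0
    while window < target_window:
        window += increase_per_rtt
        round_trips += 1
    return round_trips


if __name__ == "__main__":
    target = 83333  # packets; assumed example, see the note above

    # Textbook TCP behavior: halve on loss, add one packet per round trip.
    standard = round_trips_to_recover(target, 1, 0.5)

    # Hypothetical gentler back-off: keep 7/8 of the window on loss.
    gentler = round_trips_to_recover(target, 1, 0.875)

    # Hypothetical faster ramp: add 10 packets per round trip.
    faster = round_trips_to_recover(target, 10, 0.5)

    print("standard:", standard, "round trips")
    print("gentler back-off:", gentler, "round trips")
    print("faster ramp:", faster, "round trips")
```

The trade-off that such changes raise, and one reason the community needs to reach consensus, is that a sender that backs off less or ramps up faster may take bandwidth from flows that still follow the standard rules.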
While 18 papers will be presented over the two days of the workshop, there will also be plenty of time for discussion, which Tierney said is critical to the success of the workshop.
Tierney stepped forward to co-chair this workshop after attending the first one and deciding that it was important for the community. “It’s very appropriate that DOE take a leadership role in this work as we have the big applications, and the Science Grid, that TCP is now getting in the way of,” he said.
Contact Brian at BLTierney@lbl.gov.