Berkeley Lab Team Develops Flexible Reservation Algorithm for Advance Network Provisioning
December 13, 2010
Media Contact: Jon Bashor, email@example.com, 510-486-5849
Scientific research is becoming increasingly dependent on data-intensive analysis as larger teams of scientists are generating, sharing and analyzing very large datasets. Due to the geographical dispersion of the researchers, networking support is critical to advancing these efforts. Many applications need networking support that provides predictable performance, which in turn requires effective algorithms for bandwidth reservations.
To address this need, the U.S. Department of Energy's Energy Sciences Network (ESnet) developed and deployed a network reservation system called OSCARS (On-Demand Secure Circuits and Advance Reservation System). The system establishes secure virtual circuits with guaranteed bandwidth for a specified length of time. However, OSCARS currently gives only a yes or no response to reservation requests. Users cannot inquire about bandwidth availability or receive alternative suggestions when reservation requests fail.
To overcome this limitation, Mehmet Balman, Arie Shoshani and Alex Sim of Berkeley Lab's Scientific Data Management (SDM) Group and Evangelos Chaniotakis of ESnet developed a flexible reservation algorithm for advance network provisioning. A paper describing the work was one of 51 technical papers accepted by the SC10 conference (out of 253 submissions) and was presented by Balman at the conference in New Orleans in November.
The SDM Group develops new methods and tools for moving and analyzing massive scientific datasets, while ESnet is DOE's network connecting more than 40 major research sites and peers with more than 100 other research and commercial networks around the world.
The authors describe the algorithm as "a novel approach for pathfinding in time-dependent networks taking advantage of user-provided parameters of total volume of data to be transferred and time constraints for moving the data." The algorithm's flexibility can be likened to looking for airline flights. While travelers can usually specify their exact times and dates, many reservation systems can offer a number of options and lower fares if the traveler can be flexible regarding departures and arrivals.
Currently, when a scientist enlists OSCARS, the system checks network availability and capacity for the specified duration of time, and allocates it for the user if it is available. Otherwise, it reports that it is unable to provide the requested allocation. When this happens, it falls upon the user to use a trial-and-error approach to finding an available time for the required bandwidth. The new algorithm presents the user with a variety of possible reservation options and alternatives. For example, if a research team is up against a deadline for presenting findings, they may opt for the earliest completion time of the data transfer. On the other hand, they may be able to wait for a time when the highest bandwidth is available, allowing the data to be transferred in the shortest duration. Users can then choose the option that best fits their needs.
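The idea of offering alternatives rather than a yes or no answer can be illustrated with a minimal sketch. It is not the authors' algorithm or any part of OSCARS: it assumes the available bandwidth along one fixed path is already known as a list of contiguous time segments, and the names Option and candidate_options, along with the units (gigabits, gigabits per second, seconds), are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Option:
    start: float      # seconds
    end: float        # seconds
    bandwidth: float  # gigabits per second, held constant for the transfer


def candidate_options(
    profile: List[Tuple[float, float, float]],
    volume: float,
    earliest_start: float,
    deadline: float,
) -> List[Option]:
    """Enumerate feasible constant-bandwidth reservations.

    profile is an assumed input: sorted, contiguous (start, end, available_bw)
    segments for one fixed path. volume is in gigabits.
    """
    # Clip the profile to the user's time constraints.
    segs = [
        (max(s, earliest_start), min(e, deadline), bw)
        for s, e, bw in profile
        if min(e, deadline) > max(s, earliest_start)
    ]
    options = []
    for i in range(len(segs)):
        bottleneck = float("inf")
        for j in range(i, len(segs)):
            # A constant-rate circuit over segments i..j is limited by the
            # smallest available bandwidth among them.
            bottleneck = min(bottleneck, segs[j][2])
            if bottleneck <= 0:
                break
            start = segs[i][0]
            duration = volume / bottleneck
            if start + duration <= segs[j][1]:
                options.append(Option(start, start + duration, bottleneck))
                break  # wider windows can only lower the bottleneck
    return options


# Hypothetical availability: 4 Gb/s, then 2 Gb/s, then 10 Gb/s.
profile = [(0, 100, 4.0), (100, 200, 2.0), (200, 300, 10.0)]
options = candidate_options(profile, volume=400, earliest_start=0, deadline=300)

earliest = min(options, key=lambda o: o.end)           # earliest completion
shortest = min(options, key=lambda o: o.end - o.start)  # shortest duration
print(earliest)  # Option(start=0, end=100.0, bandwidth=4.0)
print(shortest)  # Option(start=200, end=240.0, bandwidth=10.0)
```

In this made-up scenario the two criteria pick different reservations: starting immediately at 4 Gb/s finishes first, while waiting for the 10 Gb/s period gives the shortest transfer. The sketch examines at most one window per pair of segments, so its running time grows quadratically with the number of segments.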
Balman began working on the project as a summer student at Berkeley Lab in the summer of 2009 while he was working on his doctorate at Louisiana State University. He has since graduated and is now a staff member of the Scientific Data Management Group at Berkeley Lab.
"It is an interesting problem that has come about with the development of bandwidth reservation systems such as OSCARS," Balman said. "Although we have implemented our algorithm for testing and incorporation into a future version of OSCARS, the algorithm is not specific to OSCARS, and can be used with any network reservation framework."
In the paper, Balman and his co-authors point to a couple of examples of next-generation research networks that will be increasingly important. In the area of high energy physics, the Large Hadron Collider (LHC) in Switzerland is expected to generate 100 gigabits of data per second in the near future. This data is then quickly distributed to tiers of data facilities around the world for analysis by thousands of scientists in dozens of countries. Similarly, in the field of global climate research, the Earth System Grid (ESG) currently contains 35 terabytes of data shared by more than 16,000 users worldwide, while the next-generation climate data archive is expected to hold more than 5 petabytes of climate data.
Supporting such data-intensive science requires a communication infrastructure which enables large-scale data replication, high performance remote data analysis and visualization, and provides access to computational resources, in addition to reliably transferring the data across networks.
When they first looked into the issue of flexible bandwidth provisioning, Balman and his co-authors thought the problem would be hard in terms of complexity. But on further examination, they realized that it can be solved in polynomial time, and it can be implemented and integrated into the current network reservation frameworks in a very effective manner. The algorithm produces results in less than a second for current network configurations, and it is quite practical even if applied to future very large networks with hundreds or even thousands of routers and links, a critical factor as nearly all end-to-end network connections traverse multiple links across different networks.
Now that the algorithm has been implemented as a new service extending the current underlying mechanisms, the team is integrating the algorithm into the next version of ESnet OSCARS, due out in early 2011. Additionally, they are working on the coordination of storage and network resource provisioning.
About Computing Sciences at Berkeley Lab
The Computing Sciences Area at Lawrence Berkeley National Laboratory (Berkeley Lab) provides the computing and networking resources and expertise critical to advancing Department of Energy Office of Science (DOE-SC) research missions: developing new energy sources, improving energy efficiency, developing new materials, and increasing our understanding of ourselves, our world, and our universe. ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities. NERSC and ESnet are both Department of Energy Office of Science National User Facilities. The Computational Research Division (CRD) conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation.
Berkeley Lab addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab's scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science. The DOE Office of Science is the United States' single largest supporter of basic research in the physical sciences and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.