Berkeley Lab Team Develops Flexible Reservation Algorithm for Advance Network Provisioning
December 13, 2010
by Jon Bashor
Scientific research is becoming increasingly dependent on data-intensive analysis as larger teams of scientists are generating, sharing and analyzing very large datasets. Due to the geographical dispersion of the researchers, networking support is critical to advancing these efforts. Many applications need networking support that provides predictable performance, which in turn requires effective algorithms for bandwidth reservations.
To address this need, the U.S. Department of Energy's Energy Sciences Network (ESnet) has developed and deployed a network reservation system called OSCARS (On-Demand Secure Circuits and Advance Reservation System). The system establishes secure virtual circuits with guaranteed bandwidth for a specified length of time. However, OSCARS currently gives only a yes or no response to reservation requests. Users cannot inquire about bandwidth availability or receive alternative suggestions when a reservation request fails.
To overcome this limitation, Mehmet Balman, Arie Shoshani and Alex Sim of Berkeley Lab's Scientific Data Management (SDM) Group and Evangelos Chaniotakis of ESnet developed a flexible reservation algorithm for advance network provisioning. A paper describing the work was one of 51 technical papers accepted by the SC10 conference (out of 253 submissions) and was presented by Balman at the conference in New Orleans in November.
The SDM Group develops new methods and tools for moving and analyzing massive scientific datasets. ESnet is DOE's network, which connects more than 40 major research sites and peers with more than 100 other research and commercial networks around the world.
The authors describe the algorithm as "a novel approach for pathfinding in time-dependent networks taking advantage of user-provided parameters of total volume of data to be transferred and time constraints for moving the data." The algorithm's flexibility can be likened to looking for airline flights. While travelers can usually specify their exact times and dates, many reservation systems can offer a number of options and lower fares if the traveler can be flexible regarding departures and arrivals.
Currently, when a scientist enlists OSCARS, the system checks network availability and capacity for the specified duration of time, and allocates it for the user if it is available. Otherwise, it reports that it is unable to provide the requested allocation. When this happens, it falls upon the user to use a trial-and-error approach to finding an available time for the required bandwidth. The new algorithm presents the user with a variety of possible reservation options and alternatives. For example, if a research team is up against a deadline for presenting findings, they may opt for the earliest completion time of the data transfer. On the other hand, they may be able to wait for a time when the highest bandwidth is available, allowing the data to be transferred in the shortest duration. Users can then choose the option that best fits their needs.
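The idea of offering alternatives can be illustrated with a small Python sketch. It is not the authors' implementation or any OSCARS API; it is a simplified model built on stated assumptions. The function names (`widest_path`, `find_options`), the three-node topology, the undirected links, and the units (gigabits, Gbps, seconds) are all hypothetical. The sketch splits the requested time range at the boundaries of existing reservations, finds the highest-bandwidth path for each run of consecutive windows, and keeps every run in which the requested data volume fits.

```python
import heapq


def widest_path(free_bw, src, dst):
    """Return (bottleneck_bandwidth, path) for the path with the largest
    bottleneck, using a Dijkstra-style search. Links are treated as undirected."""
    adj = {}
    for (u, v), bw in free_bw.items():
        adj.setdefault(u, []).append((v, bw))
        adj.setdefault(v, []).append((u, bw))
    best = {src: float("inf")}
    prev = {}
    heap = [(-float("inf"), src)]
    while heap:
        neg_bw, node = heapq.heappop(heap)
        bw = -neg_bw
        if node == dst:
            break
        if bw < best.get(node, 0):
            continue  # stale heap entry
        for nxt, link_bw in adj.get(node, []):
            cand = min(bw, link_bw)
            if cand > best.get(nxt, 0):  # links with no free bandwidth are skipped
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (-cand, nxt))
    if dst not in best:
        return 0.0, []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]


def find_options(capacity, reservations, src, dst, volume, earliest, deadline):
    """List feasible (start, end, bandwidth, path) options for moving `volume`
    gigabits between `earliest` and `deadline` (seconds)."""
    # Split the time range wherever an existing reservation starts or ends.
    points = {earliest, deadline}
    for _, start, end, _ in reservations:
        points.update(t for t in (start, end) if earliest < t < deadline)
    points = sorted(points)

    # Free bandwidth on every link within each elementary window.
    free = []
    for t0, t1 in zip(points, points[1:]):
        window = dict(capacity)
        for link, start, end, bw in reservations:
            if start < t1 and end > t0:
                window[link] -= bw
        free.append(window)

    options = []
    for i in range(len(free)):
        run = dict(free[i])
        for j in range(i, len(free)):
            # A circuit spanning windows i..j gets the minimum free bandwidth.
            run = {link: min(run[link], free[j][link]) for link in run}
            bw, path = widest_path(run, src, dst)
            if bw <= 0:
                break  # a longer run cannot have more bandwidth
            duration = volume / bw
            if duration <= points[j + 1] - points[i]:
                options.append({"start": points[i], "end": points[i] + duration,
                                "bandwidth": bw, "path": path})
    return options


# Hypothetical topology: link capacities in Gbps, one existing reservation.
capacity = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 4}
reservations = [(("A", "B"), 0, 100, 8)]  # (link, start, end, Gbps)
options = find_options(capacity, reservations, "A", "C",
                       volume=600, earliest=0, deadline=300)
earliest_completion = min(options, key=lambda o: o["end"])
shortest_duration = min(options, key=lambda o: o["end"] - o["start"])
```

In this toy setup, `earliest_completion` starts immediately on the slower direct link, while `shortest_duration` waits until the existing reservation ends and then uses the faster two-hop path. This mirrors the two kinds of choices described above.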
Balman began working on the project as a summer student at Berkeley Lab in 2009 while pursuing his doctorate at Louisiana State University. He has since graduated and is now a staff member of the Scientific Data Management Group at Berkeley Lab.
"It is an interesting problem that has come about with the development of bandwidth reservation systems such as OSCARS," Balman said. "Although we have implemented our algorithm for testing and incorporation into a future version of OSCARS, the algorithm is not specific to OSCARS, and can be used with any network reservation framework."
In the paper, Balman and his co-authors point to a couple of examples of next-generation research networks that will be increasingly important. In the area of high energy physics, the Large Hadron Collider (LHC) in Switzerland is expected to generate 100 gigabits of data per second in the near future. This data is then quickly distributed to tiers of data facilities around the world for analysis by thousands of scientists in dozens of countries. Similarly, in the field of global climate research, the Earth System Grid (ESG) currently contains 35 terabytes of data shared by more than 16,000 users worldwide, while the next-generation climate data archive is expected to hold more than 5 petabytes of climate data.
Supporting such data-intensive science requires a communication infrastructure which enables large-scale data replication, high performance remote data analysis and visualization, and provides access to computational resources, in addition to reliably transferring the data across networks.
When they first looked into the issue of flexible bandwidth provisioning, Balman and his co-authors thought the problem would be hard in terms of complexity. But on further examination, they realized that it can be solved in polynomial time, and it can be implemented and integrated into the current network reservation frameworks in a very effective manner. The algorithm produces results in less than a second for current network configurations, and it is quite practical even if applied to future very large networks with hundreds or even thousands of routers and links, a critical factor as nearly all end-to-end network connections traverse multiple links across different networks.
Now that the algorithm has been implemented as a new service extending the current underlying mechanisms, the team is integrating the algorithm into the next version of ESnet OSCARS, due out in early 2011. Additionally, they are working on the coordination of storage and network resource provisioning.