Berkeley Lab Scientists Build Software Framework for ATLAS Collaboration
May 17, 2010
Contact: Linda Vu, email@example.com, 510-495-2402
Three thousand researchers in 37 countries are searching for the origins of mass, new dimensions of space and undiscovered forces of physics in the head-on collisions of high-energy protons at the Large Hadron Collider's ATLAS experiment. When ATLAS is turned on, its detectors record about 400 collision events per second from a variety of perspectives, a rate equivalent to filling 27 compact disks per minute.
In order to sift out signs of new physics in this torrent of data, thousands of researchers must be able to process this information and collaborate on results in real time. To facilitate this distributed workflow, they are relying on a software framework called Athena, which was developed by an international team of scientists led by Paolo Calafiura of the Advanced Computing for Science Department in the Lawrence Berkeley National Laboratory's (Berkeley Lab) Computational Research Division.
"When developers plug their codes into the Athena framework, they get the most common functionality and communication among the different components of the experiment," says Calafiura, who is also the Chief Software Architect for ATLAS.
He notes that the Athena software is essentially the "plumbing" for the international ATLAS collaboration. It allows scientists to focus on developing tools for analysis and actually analyzing data, instead of worrying about infrastructure issues like the compatibility of files. Researchers simply plug their codes into the framework, and Athena takes care of basic functions like coordinating the execution of applications and applying a common application-programming interface (API) so that collaborators can retrieve the same files.
"If you are a researcher who would like to computationally reconstruct the track left by high-energy muon particles as they traverse six different ATLAS detectors, you can use the Athena StoreGate library to access the data coming from each detector and later to post the results of your reconstruction code for others to use," says Calafiura. "Once an object is posted to StoreGate, the library manages it according to preset policies and provides an API so that collaborators can easily share the data."
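The post-and-retrieve pattern Calafiura describes can be illustrated with a short sketch. The real StoreGate library is written in C++ and keys objects by type and name; the Python below is a simplified, hypothetical model of that idea, and all class and key names in it are invented for illustration.

```python
# Minimal sketch of a StoreGate-style transient event store: a producer
# records an object under a (type, key) pair, and any other component in
# the same event-processing chain can retrieve it through the shared API.

class EventStore:
    def __init__(self):
        self._store = {}

    def record(self, obj, key):
        """Post an object so downstream components can use it."""
        slot = (type(obj), key)
        if slot in self._store:
            raise KeyError(f"'{key}' already recorded for {type(obj).__name__}")
        self._store[slot] = obj

    def retrieve(self, obj_type, key):
        """Look up a previously recorded object by its type and key."""
        return self._store[(obj_type, key)]


class MuonTrack:
    """Toy stand-in for a reconstructed-track data object."""
    def __init__(self, hits):
        self.hits = hits


store = EventStore()

# A reconstruction algorithm posts its result to the store...
store.record(MuonTrack(hits=[(0.1, 2.3), (0.4, 2.9)]), key="ReconstructedMuons")

# ...and a later analysis stage retrieves it through the same interface,
# without either component knowing about the other.
track = store.retrieve(MuonTrack, "ReconstructedMuons")
print(len(track.hits))  # → 2
```

Keying objects by both type and name is what lets many independent algorithms share one store without naming collisions, which matches the "preset policies" role the quote attributes to the library.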
Athena is built on top of the Gaudi software framework that was originally developed for the LHCb experiment, which specifically looks at physics phenomena involving fundamental particles called b-mesons.
"When we began thinking about the software framework for ATLAS, we were very impressed by the Gaudi framework that was developed for LHCb. The architecture was very well designed, and it was simple to use. Rather than reinvent the wheel, we decided to work with what existed and plug in ATLAS-specific enhancements," says Calafiura. He notes that the Gaudi architecture is now a common kernel of software used by many experiments around the world and is co-developed by scientists mainly from ATLAS and LHCb.
Athena and Gaudi were developed in a close collaboration between computer scientists and physicists. Keith Jackson, Charles Leggett, and Wim Lavrijsen of the ACS department worked with Calafiura to develop the Athena framework. The team worked closely with David Quarrie, Mous Tatarkhanov and Yushu Yao of Berkeley Lab's Physics Division.
Berkeley Lab and LHC Computing
Beyond developing software for ATLAS, the Berkeley Lab will also contribute computing and network resources to CERN's LHC—the world's largest particle accelerator. On March 30, the collider achieved record-breaking seven trillion electron volt (7 TeV) proton collisions and opened a new realm of high-energy physics.
The LHC contains six major detector experiments, and experts predict that 15 petabytes of data will flow from the LHC per year for the next 10 to 15 years. That is enough information per year to create a tower of CDs more than twice as high as Mount Everest. Because CERN only has enough computational power to handle about 20 percent of this data, the workload is divided up and distributed to hundreds of universities and institutions around the globe via the LHC Computing Grid. The United States contributes 23 percent of the worldwide computing capacity for the ATLAS experiment and more than 30 percent of the computing power for another LHC experiment called CMS.
Raw data from all six LHC experiments are initially processed and archived at CERN's onsite data center, then divided for distribution to 12 Tier-1 sites around the globe. In the US, Fermi National Accelerator Laboratory and Brookhaven National Laboratory are Tier-1 facilities that receive large portions of CMS and ATLAS data from CERN for another round of processing and archiving. This data is eventually carried to Tier-2 facilities, which primarily consist of universities and research institutions across the nation, for analysis. The Berkeley Lab's National Energy Research Scientific Computing Center (NERSC) is one of eight Tier-2 centers in the western US that will provide computing and storage resources for data analysis from the LHC's ATLAS and ALICE experiments.
As each processing job completes, the sites will push data back up the tier system. In the United States, all data traveling between Tier-1 and Tier-2 DOE sites will be carried by the Energy Sciences Network (ESnet), which is managed by Berkeley Lab's Computational Research Division. ESnet comprises two networks—an IP network that carries daily traffic to support lab operations, general science communication and science with relatively small data requirements, and a circuit-oriented Science Data Network (SDN) to transfer massive data sets like those from the LHC. Using the On-Demand Secure Circuit and Advanced Reservation System (OSCARS), researchers can reserve bandwidth on the SDN to guarantee that data is delivered to their collaborators within a certain time frame.
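The core idea behind advance bandwidth reservation can be sketched in a few lines. This is not OSCARS itself (which manages real network circuits through its own interfaces); it is a toy admission-control model, with invented names, showing why a request is accepted only if the circuit's capacity holds for the entire requested window.

```python
# Toy model of advance bandwidth reservation on a single circuit:
# a request is admitted only if adding it to every reservation whose
# time window overlaps the request would not exceed link capacity.
# (Summing all overlapping reservations is deliberately conservative.)

class Circuit:
    def __init__(self, capacity_gbps):
        self.capacity = capacity_gbps
        self.reservations = []  # list of (start_hour, end_hour, gbps)

    def reserve(self, start, end, gbps):
        """Admit the request if capacity holds across the whole window."""
        overlapping = sum(
            g for s, e, g in self.reservations if s < end and start < e
        )
        if overlapping + gbps > self.capacity:
            return False  # would oversubscribe the circuit
        self.reservations.append((start, end, gbps))
        return True


link = Circuit(capacity_gbps=10)
print(link.reserve(0, 4, 6))  # → True: 6 of 10 Gb/s committed for hours 0-4
print(link.reserve(2, 6, 6))  # → False: would need 12 Gb/s during hours 2-4
print(link.reserve(4, 8, 6))  # → True: the windows do not overlap
```

The guarantee to the researcher comes from this admission check: once a window is granted, no later request can claim bandwidth that would crowd it out.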
For more information about Berkeley Lab contributions to LHC computing, please read:
Global Reach: NERSC Helps Manage and Analyze LHC Data
ESnet4 Helps Researchers Seeking the Origins of Matter
About Computing Sciences at Berkeley Lab
The Computing Sciences Area at Lawrence Berkeley National Laboratory (Berkeley Lab) provides the computing and networking resources and expertise critical to advancing Department of Energy Office of Science (DOE-SC) research missions: developing new energy sources, improving energy efficiency, developing new materials, and increasing our understanding of ourselves, our world, and our universe. ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities. NERSC and ESnet are both Department of Energy Office of Science National User Facilities. The Computational Research Division (CRD) conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation.
Berkeley Lab addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab's scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science. The DOE Office of Science is the United States' single largest supporter of basic research in the physical sciences and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.