DSD Tools Making Grids Easier to Use
January 1, 2004
At the SC2003 conference held last November in Phoenix, the Grid was omnipresent, with speakers and vendors emphasizing the development and deployment of Grid infrastructure and applications. For all its potential, though, actually using Grids can be difficult and time-consuming.
That’s where Berkeley Lab’s Distributed Systems Department (DSD) (http://dsd.lbl.gov) comes in. At the conference, DSD staff members demonstrated “wxPyGlobusJobGui”, a graphical user interface that integrates functionality from three DSD projects to provide a securely authenticated, grid-enabled and network-monitored prototype job submission and monitoring system. The wxPyGlobusJobGui demonstration showed how pyGlobus can be used to rapidly build a graphical user interface that automates the steps necessary to stage (copy) input and control files from the SC conference exhibition floor to a cluster located at Berkeley Lab, run the AMBER molecular dynamics application on the input files, and stage the result files back to SC03. The graphical user interface also showcased two other DSD-developed technologies: Akenti and NetLogger.
Akenti (http://dsd.lbl.gov/akenti/) is an access control policy library which uses digitally signed certificates to define access policies for shared resources. Integrating the Akenti access control policy library into the Globus gatekeeper and jobmanager provided an access control mechanism significantly more powerful than the local user/certificate mappings currently used.
NetLogger (http://dsd.lbl.gov/NetLogger) is a lightweight network logging toolkit which facilitates the analysis and debugging of distributed computing environments. NetLogger logging calls were added to every element of the demonstration: the file staging and job submission components, the Globus gatekeeper/jobmanager, the Akenti authorization code, and the scientific application. This streamlined the process of tracking down problems that occurred during the demonstration, such as invalid input file parameters, authorization failures, and transient problems. NetLogger was quite effective in helping trace concurrent local and remote operations.
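The idea of wrapping each operation in paired log events can be illustrated with a minimal Python sketch. This is not the NetLogger API: the functions `format_event` and `logged`, the `key=value` line layout, and the event names are all hypothetical, chosen only to show how start, end, and error events make a failing step easy to locate.

```python
import io
import time
from contextlib import contextmanager


def format_event(event, **fields):
    """Build one 'key=value' log line with a timestamp and an event name.

    Hypothetical layout for illustration; not the NetLogger wire format.
    """
    parts = ["ts=%.6f" % time.time(), "event=%s" % event]
    parts += ["%s=%s" % (key, fields[key]) for key in sorted(fields)]
    return " ".join(parts)


@contextmanager
def logged(stream, name, **fields):
    """Write <name>.start, then <name>.end or <name>.error, around a block."""
    stream.write(format_event(name + ".start", **fields) + "\n")
    try:
        yield
    except Exception as exc:
        stream.write(format_event(name + ".error", msg=repr(exc), **fields) + "\n")
        raise
    else:
        stream.write(format_event(name + ".end", **fields) + "\n")


# An in-memory stream stands in for a log file or network destination.
log = io.StringIO()

with logged(log, "stage_in", file="input.dat"):
    pass  # placeholder for a real file transfer

try:
    with logged(log, "authorize", user="alice"):
        raise PermissionError("policy denied")  # simulated authorization failure
except PermissionError:
    pass

# The second field of every line is the event name.
events = [line.split()[1] for line in log.getvalue().splitlines()]
print(events)
```

Because every component writes the same kind of line, events from the staging code, the authorization code, and the application could be merged and sorted by timestamp to reconstruct what happened across local and remote hosts.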
By using existing resources such as the Python scripting language, the pyGlobus toolkit (http://dsd.lbl.gov/gtg/projects/pyGlobus/), the NetLogger toolkit, and the wxWindows GUI library, DSD staff members were able to develop the user interface and functional components of the demonstration application extremely rapidly, in less than two weeks. Also, adding NetLogger messages to the pyGlobus file transfer and job submission modules was significantly easier than adding them to the source code of the underlying Globus toolkit.
While demonstrating this work at the conference, staff members learned that there is significant interest in having generic toolkits which can be specialized. Although this demo focused on AMBER, the demo GUI was designed to allow other domain-specific applications to re-use the file staging and job submission functionality.
Already, ideas for improving the application are being discussed. First, it should be possible to describe significantly more sophisticated (and dynamic) data and computational operations. Many computational procedures involve the same elements, such as file staging and remote job execution; we will call chains of such elements, constructed to perform common scientific jobs, “workflows”. Second, the application needs a mechanism by which these “workflows” can be visually described, launched, and monitored.
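One way to picture a “workflow” in this sense is as an ordered chain of reusable steps that pass a shared context along, with a callback that a GUI could use for monitoring. The Python sketch below is illustrative only: the step functions, the `run_workflow` helper, and the `remote:`/`local:` file naming are hypothetical stand-ins and do not use pyGlobus or represent the DSD design.

```python
def stage_in(ctx):
    """Pretend to copy input files to the remote cluster."""
    ctx["remote_inputs"] = ["remote:" + name for name in ctx["inputs"]]
    return ctx


def run_job(ctx):
    """Pretend to run the application on each staged input file."""
    ctx["remote_outputs"] = [name + ".out" for name in ctx["remote_inputs"]]
    return ctx


def stage_out(ctx):
    """Pretend to copy result files back to the submitting host."""
    ctx["outputs"] = [
        name.replace("remote:", "local:") for name in ctx["remote_outputs"]
    ]
    return ctx


def run_workflow(steps, ctx, on_event=None):
    """Run steps in order, reporting start/end of each to an optional monitor."""
    for step in steps:
        if on_event:
            on_event(step.__name__, "start")
        ctx = step(ctx)
        if on_event:
            on_event(step.__name__, "end")
    return ctx


# The monitor here just records events; a GUI could update a display instead.
history = []
result = run_workflow(
    [stage_in, run_job, stage_out],
    {"inputs": ["a.in"]},
    on_event=lambda name, state: history.append((name, state)),
)
print(result["outputs"])
print(history)
```

Since each step only reads and writes the shared context, a different domain application could swap in its own `run_job` while reusing the staging steps unchanged.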