New Microbial Metagenome Data Analysis System in Production
May 1, 2006
An experimental metagenomics data management and analysis system, co-developed by the Biological Data Management and Technology Center (BDMTC) at Berkeley Lab with the Genome Biology Program and Microbial Ecology Program at DOE’s Joint Genome Institute and released earlier this year, is already helping produce scientific discoveries.
Called IMG/M, the system extends the Integrated Microbial Genomes (IMG) system with the ability to integrate and analyze metagenome data, and has provided immediate support for metagenomics studies at JGI. “IMG/M is the first publicly available metagenome data management and analysis system—the first of its kind,” said Victor Markowitz, head of BDMTC.
“Most research in microbial genome analysis focuses on individual organisms,” said Nikos Kyrpides, head of JGI’s Genome Biology Program. “The application of high throughput sequencing to environmental samples has revealed a new universe of microbial community genomes including mostly microorganisms unknown to science. Microbial community genome analysis, also known as metagenomics, focuses on how the entire community functions together.”
Thus far, IMG/M has been used by Phil Hugenholtz, head of the Microbial Ecology Program, and his colleagues at JGI for completing the analysis of enhanced biological phosphorus removing sludge communities and for studying the metagenomes of several key microbial communities recently sequenced by DOE JGI, including the lignocellulose-hydrolyzing communities in termite hindguts.
“IMG/M has proven to be an extremely useful resource and tool for analyzing our metagenomic data,” said Jared R. Leadbetter, associate professor of environmental microbiology at the California Institute of Technology and collaborator on the termite hindgut microbial community for bioenergy project. “Such datasets are large, complex, and potentially unwieldy. Importantly, IMG/M is more than just an excellent tool to analyze data. The manner in which the results of that analysis are organized and made accessible through a user-friendly interface allows the researcher to rapidly move in a number of different intellectual directions. As a result, the user becomes better educated with and gets a real feel for the data in a manner that would not otherwise be possible on such short time scales.”
IMG/M was presented by Markowitz at the recent Keystone Symposium on Microbial Community Genomics in Animals and the Environment, organized by DOE JGI Director Edward M. Rubin and Edward F. DeLong, professor in the Division of Biological Engineering and Department of Civil and Environmental Engineering at the Massachusetts Institute of Technology.
“IMG/M provides an intuitive interface, and nice complement to IMG for comparing gene content and phylogenetic profiles of microbial genomes, and relating them to the large microbial community datasets now accumulating,” DeLong said. “These are great and sorely needed data-exploration tools.”
IMG/M was demonstrated at a workshop on April 1 as part of the DOE JGI First Annual User Meeting. BDMTC staff members will also present the project this summer at workshops and symposia in England, Brazil and Austria.
“We have an experimental system which has proved to be immediately useful in helping scientists to conduct their studies effectively,” Markowitz said.
IMG/M is accessible to the public at http://img.jgi.doe.gov/m.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are Department of Energy Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab's scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.