DOE JGI Releases a New Version of its Metagenome Data Management & Analysis System

February 7, 2008

For more information, contact
David Gilbert
DOE JGI Public Affairs Manager
(925) 296-5643

WALNUT CREEK, CA--Targeting its ever-expanding user community, the U.S. Department of Energy Joint Genome Institute (DOE JGI) has released an upgraded version of the IMG/M metagenome data management and analysis system, accessible to the public at http://img.jgi.doe.gov/m.

IMG/M provides tools for analyzing the functional capability of microbial communities based on their metagenome DNA sequence in the context of reference isolate genomes. The new version of IMG/M includes five additional metagenome datasets generated from microbial community samples that were the subject of recently published studies. These include the metagenomic and functional analysis of termite hindgut microbiota (Nature 450, 560-565, 22 November 2007) and the single-cell genetic analysis of TM7, a rare and uncultivated microbe from the human mouth (PNAS, July 17, 2007, vol. 104, no. 29, 11889-11894).

"IMG/M is a fantastic tool that is incredibly helpful in understanding our data," said Stephen Quake, Co-Chair, Department of Bioengineering at Stanford University, Investigator, Howard Hughes Medical Institute, and senior author on the PNAS study. "We used IMG/M in numerous ways, both to analyze our data and to understand general properties of other relevant bacterial genomes. I look forward to analyzing our new datasets with IMG/M."

IMG/M will be demonstrated at a workshop on March 26 as part of the DOE JGI Third Annual User Meeting. IMG/M contains all isolate genomes in version 2.4 of DOE JGI’s Integrated Microbial Genomes (IMG) system, an increase of 1,339 reference genomes over the previous version of IMG/M. IMG/M now contains 2,953 isolate genomes: 819 bacterial, 50 archaeal, and 40 eukaryotic genomes, plus 2,044 viruses.

IMG/M provides new tools for analyzing metagenome datasets in the context of reference isolate genomes. The Reference Genome Context Viewer and Protein Recruitment Plot allow the examination of metagenomes in the context of individual reference isolate genomes. The new Abundance Comparison and Functional Category Comparison tools compare a metagenome dataset with one or several reference metagenomes or genomes: the former enables pairwise analysis of individual functions (COG, Pfam, Enzyme, TIGRfam), and the latter compares the abundance of functional categories (e.g., COG category). Both tools test whether the differences in abundance are statistically significant.
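To give a sense of what testing an abundance difference for statistical significance can involve, the sketch below compares the share of genes assigned to one function in two datasets. This release does not state which statistical test IMG/M applies, so the two-proportion z-test used here is an assumption chosen for illustration. The function name and the gene counts are hypothetical as well.

```python
import math


def two_proportion_z_test(count_a, total_a, count_b, total_b):
    """Compare the abundance of one function in two datasets.

    count_*: genes assigned to the function (e.g., one COG) in each dataset
    total_*: all genes considered in each dataset
    Returns (z, p_value), where p_value is two-sided.

    Illustrative only: a stand-in test, not IMG/M's documented method.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("totals must be positive")

    p_a = count_a / total_a
    p_b = count_b / total_b

    # Pooled proportion under the null hypothesis of equal abundance.
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Function absent from both datasets or present in every gene.
        return 0.0, 1.0

    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: a metagenome versus a reference genome.
    z, p = two_proportion_z_test(120, 1000, 60, 1000)
    print(f"z = {z:.2f}, p = {p:.2g}")
```

A comparison across many functions at once would also need a correction for multiple testing, which this sketch leaves out.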

IMG/M has been developed jointly by the DOE JGI’s Genome Biology Program (GBP) and Lawrence Berkeley National Laboratory (LBNL) Biological Data Management and Technology Center (BDMTC). The large-scale pairwise gene similarity computations for all the genomes included in IMG/M have been carried out using ScalaBLAST by the Computational Biology and Bioinformatics Group of the Computational Sciences and Mathematics Division at Pacific Northwest National Laboratory, using the William R. Wiley Environmental Molecular Sciences Laboratory (EMSL) Molecular Sciences Computing Facility supercomputer.

The U.S. Department of Energy Joint Genome Institute, supported by the DOE Office of Science, unites the expertise of five national laboratories -- Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, and Pacific Northwest -- along with the Stanford Human Genome Center to advance genomics in support of the DOE missions related to clean energy generation and environmental characterization and cleanup. DOE JGI’s Walnut Creek, CA, Production Genomics Facility provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges.

About Computing Sciences at Berkeley Lab

The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.

ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 6,000 scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are DOE Office of Science User Facilities.

Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab's scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.

DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.