FastBit Bitmap Index Wins 2008 R&D 100 Award for Technology Advances
July 9, 2008
Four researchers in the Scientific Data Management (SDM) Group in Berkeley Lab's Computational Research Division (CRD) will be awarded one of the 2008 R&D 100 Awards for developing the FastBit indexing technology. The award, given by R&D Magazine to the 100 top new technologies of the year, will go to Kesheng “John” Wu, the key developer; Arie Shoshani, SDM Group Lead; Ekow Otoo; and former SDM member Kurt Stockinger, now working at Credit Suisse in Zürich, Switzerland.
Three other Berkeley Lab inventions are also included in this year's R&D 100 Awards: the Berkeley Lab PhyloChip, the Biomimetic Search Engine, and the Nanostructured Polymer Electrolyte for Rechargeable Lithium Batteries. All winners of the 2008 awards will receive a plaque at R&D Magazine's formal awards banquet in Chicago on October 16.
“To have technology from our organization recognized is a reflection both of the quality of the scientific work we do as well as the significance of its contribution to society,” said CRD Division Director Horst Simon in congratulating the winners.
FastBit is the fastest indexing technology for accelerating search operations on massive databases, capable of searching up to 100 times faster than other technologies. It significantly advances the state of the art in searching large datasets: it expands the types of data on which bitmap indexes can be used most efficiently, while also speeding up search operations on all types of data.
FastBit contains significant innovations that have a broad impact in science, technology, and education. For example, FastBit has improved the speed of drug-discovery software at the University of Hamburg, Germany, and improved the matching between web page content and advertisements at Yahoo! Research. A FastBit-enabled grid-based analysis of high-energy physics data received an award from the 2005 International Supercomputer Conference in Heidelberg, Germany, and the work on network traffic analysis received an honorable mention in the High Performance Analytics Challenge at the Supercomputing 2005 conference in Seattle.
FastBit is software that incorporates the Word-Aligned Hybrid (WAH) compression method developed and patented by Wu, Shoshani, and Otoo. Originally developed to search data from Department of Energy high-energy physics experiments, WAH compresses bitmap indexes. A bitmap index reduces the response time of queries involving common types of conditions on data objects, such as “state = CA” and “age >= 21.” It achieves this by storing certain precomputed answers as bitmaps. For example, a bitmap index for “state” might have one bitmap for each state in the U.S., with each bit marking whether a given record has that state.
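To make the idea of storing precomputed answers as bitmaps concrete, here is a minimal sketch in Python. It is not FastBit code or the FastBit API: the function names `build_bitmap_index` and `rows_of` and the sample data are hypothetical, and plain Python integers stand in for uncompressed bitmaps.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical sample table with two columns.
states = ["CA", "NY", "CA", "TX", "NY", "CA"]
ages = [34, 19, 17, 45, 21, 62]

state_index = build_bitmap_index(states)
age_index = build_bitmap_index(ages)

# "state = CA" is answered by one precomputed bitmap.
is_ca = state_index["CA"]

# "age >= 21" is answered by OR-ing the bitmaps of all qualifying values.
is_21_or_older = 0
for age, bitmap in age_index.items():
    if age >= 21:
        is_21_or_older |= bitmap

# Combining both conditions is a single bitwise AND.
matches = rows_of(is_ca & is_21_or_older)
print(matches)  # [0, 5]
```

Both conditions are answered without scanning the rows at query time. The query only combines bitmaps with bitwise operations, which is the kind of work computers do efficiently.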
Because computers can manipulate bitmaps efficiently, bitmap indexes are efficient at finding records of interest in large datasets. WAH compression makes the bitmap index optimal in terms of computational complexity. Only a small number of the most efficient indexing schemes have this optimality property. What makes the new technology unique is that WAH-compressed indexes significantly outperform other schemes in tests.
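The word-aligned idea behind WAH can be sketched as follows. This is a simplified illustration based on the published description of the method, not the FastBit implementation: it assumes 32-bit words that each carry either 31 literal bits or a count of consecutive identical 31-bit groups, and it pads the final partial group with zeros instead of keeping a separate partial word. All names are hypothetical.

```python
GROUP = 31                   # payload bits carried by one 32-bit word
ALL_ONES = (1 << GROUP) - 1  # a 31-bit group consisting only of 1s
FILL_FLAG = 1 << 31          # top bit set: the word is a fill word
FILL_VALUE = 1 << 30         # next bit: the bit value the fill repeats
MAX_RUN = (1 << 30) - 1      # low 30 bits: number of groups in the fill


def wah_encode(bits):
    """Encode a list of 0/1 values into 32-bit words; returns (words, nbits)."""
    words = []
    for start in range(0, len(bits), GROUP):
        chunk = bits[start:start + GROUP]
        chunk = chunk + [0] * (GROUP - len(chunk))  # zero-pad the last group
        group = 0
        for b in chunk:
            group = (group << 1) | b
        if group in (0, ALL_ONES):
            fill = FILL_FLAG | (FILL_VALUE if group else 0)
            same_fill = words and (words[-1] & ~MAX_RUN) == fill
            if same_fill and (words[-1] & MAX_RUN) < MAX_RUN:
                words[-1] += 1          # extend the current run by one group
            else:
                words.append(fill | 1)  # start a new run of length one
        else:
            words.append(group)         # mixed bits: store them literally
    return words, len(bits)


def wah_decode(words, nbits):
    """Expand the words produced by wah_encode back into a list of bits."""
    bits = []
    for w in words:
        if w & FILL_FLAG:
            value = 1 if w & FILL_VALUE else 0
            bits.extend([value] * (GROUP * (w & MAX_RUN)))
        else:
            bits.extend((w >> s) & 1 for s in range(GROUP - 1, -1, -1))
    return bits[:nbits]


# A sparse bitmap: 313 bits, of which only three are set.
bitmap = [1] + [0] * 310 + [1, 1]
words, nbits = wah_encode(bitmap)
print(len(words))  # 3
assert wah_decode(words, nbits) == bitmap
```

In this example, 313 bits shrink to three words: a literal word, one fill word covering nine all-zero groups, and a final literal word. Because every unit is a whole machine word, logical operations can work on the words directly, which is the intuition behind the speed of word-aligned schemes.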
The open-source FastBit software and more information are available from http://sdm.lbl.gov/fastbit.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 6,000 scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are DOE Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has had its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.