Kathy Yelick Testifies on 'Big Data Challenges and Advanced Computing Solutions'
July 12, 2018
Contact: John German, firstname.lastname@example.org, +1 510-486-6601
Kathy Yelick, Associate Laboratory Director for Computing Sciences at Berkeley Lab, was one of four witnesses testifying before the U.S. House of Representatives’ Committee on Science, Space and Technology at 7 a.m. PDT / 10 a.m. EDT on Thursday, July 12. The discussion focused on big-data challenges and advanced computing solutions.
Data-driven scientific discovery is poised to deliver breakthroughs across many disciplines, and the U.S. Department of Energy, through its national laboratories, is well positioned to play a leadership role in this revolution. Driven by DOE innovations in instrumentation and computing, however, the scientific data sets being created are becoming increasingly challenging to sift through and manage.
Big data challenges are often characterized by the 4 Vs: volume (the total size of the data), velocity (the speed at which data is produced), variability (the diversity of data types) and veracity (noise, errors and other quality issues). Scientific data has all of these, and DOE’s user facilities are a major source of both the challenges and the opportunities to use large data sets for new discoveries, owing to increasing data rates, growing total data volumes and the falling cost of collecting data.
Machine learning represents a promising approach for analytics in science, complementing but not replacing modeling and simulation. In her testimony, Yelick discussed the emerging role of machine-learning methods that have revolutionized the field of artificial intelligence and may similarly impact scientific discovery. She talked about how Berkeley Lab and other national laboratories are applying machine learning tools and techniques to better analyze these data sets and empower scientists to ask and answer increasingly complex questions.
“Machine learning has revolutionized the field of artificial intelligence and it requires three things: large amounts of data, fast computers and good algorithms,” Yelick stated. “DOE has all of these.”
Other key points in her testimony included:
- Examples of large-scale scientific data challenges in the DOE Office of Science, such as analyzing billions of microbes in complicated communities or millions of supernovae millions of light years away
- The unique opportunities for machine learning in science, leveraging DOE’s national role as a leader in high performance computing, applied mathematics, user facilities and interdisciplinary team science
- A vision for the national laboratories that includes foundational research in data science and an interconnected network of experimental and computational facilities to address some of the most challenging data analytics problems in science.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences Area provides the computing and networking resources and expertise critical to advancing Department of Energy Office of Science (DOE-SC) research missions: developing new energy sources, improving energy efficiency, developing new materials, and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). NERSC and ESnet are both Department of Energy Office of Science National User Facilities. CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation.
Berkeley Lab addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has had its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
The DOE Office of Science is the United States' single largest supporter of basic research in the physical sciences and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.