Project Jupyter gets $6M to expand collaborative data science software
July 7, 2015
PALO ALTO, Calif. July 7, 2015 — Three foundations pledged $6M over the next three years to Project Jupyter, an open-source software project that supports scientific computing and data science across a wide range of programming languages via a large, public, open and inclusive community.
Fernando Perez of the University of California, Berkeley and Lawrence Berkeley National Laboratory's (Berkeley Lab's) Computational Research Division and Brian Granger of California Polytechnic State University, San Luis Obispo will lead the project at their institutions. Perez and Granger’s efforts with Project Jupyter are the result of their work developing IPython, a popular user interface for interactive computing across multiple programming languages.
With this award from the Leona M. and Harry B. Helmsley Charitable Trust, Alfred P. Sloan Foundation, and Gordon and Betty Moore Foundation, these researchers will expand and improve the capabilities of the Jupyter Notebook, a web-based platform that allows scientists, researchers and educators to combine live code, equations, narrative text and rich media into a single, interactive document.
Granger and Perez estimate that well over one million people in fields from biostatistics to astronomy to finance use Jupyter. Currently, the platform is used to analyze massive gene sequencing datasets, process images from the Hubble Space Telescope and develop models of financial markets.
“Project Jupyter serves not only the academic and scientific communities, but also a much broader constituency of data scientists in research, education, industry and journalism,” said Perez. “Given the importance of computing across modern society, we see uses of our tools that range from high school education in programming to the nation’s supercomputing facilities and the leaders of the tech industry.”
Teachers, for example, can prepare a lecture using Jupyter Notebook and then turn it into a web-based slide show presentation in which they can write code and see the results of that code in real time. Students can then use it for homework and reports.
The capabilities of the Jupyter Notebook will expand to give users easier access to collaborative computing and to let them reuse their content in a wide range of settings, such as standalone web applications and dashboards.
“We are excited by the potential of Project Jupyter to reach even wider audiences and to contribute to increased cross-disciplinary collaboration in the sciences,” said Betsy Fader, director of the Helmsley Charitable Trust’s Biomedical Research Infrastructure Program.
“Jupyter Notebook is a tool that embodies the current shift in science towards more reproducible research, which in turn enables more effective science,” said Chris Mentzel, program director at the Gordon and Betty Moore Foundation. “It will enable data exploration, visualization, and analysis in a way that encourages sound science and speeds progress.”
For more information on Project Jupyter, see the project’s website: http://jupyter.org.
Project Jupyter has a long history as an interactive community of researchers who both contribute to and use the software. Jupyter evolved from a project called IPython, an interactive environment for the popular programming language Python. For its first decade, IPython was part of an ecosystem of open-source projects for scientific computing with the Python language, known as the “SciPy Stack”. Efforts were then expanded beyond Python, and the interactive shell evolved into a generic architecture for interactive computing and computational narratives. Project Jupyter has resulted in the development of a significant toolset in data science for both industrial and academic research, and is poised to become an integral part of reproducible data science.
The University of California was chartered in 1868, and what would become its flagship campus was soon established at Berkeley. It was born out of a vision in the State Constitution of a university that would "contribute even more than California's gold to the glory and happiness of advancing generations." UC Berkeley claims 22 Nobel Laureates, seven of whom are current faculty members. The campus enrolls more than 36,000 undergraduate and graduate students, and has more than 1,500 full-time and 500 part-time faculty members in more than 130 academic departments and more than 110 interdisciplinary research units and field stations. Today, UC Berkeley is considered one of the nation’s most prestigious universities – public or private – and is internationally recognized for its distinguished record of world-class scholarship, innovation and concern for the betterment of our world.
Cal Poly is a nationally ranked, four-year, comprehensive polytechnic public university located in San Luis Obispo, Calif. It was founded in 1901 and has been part of the renowned California State University system since 1960. Known for its Learn by Doing approach, small class sizes and open access to expert faculty, Cal Poly is a distinctive learning community whose 20,000 academically motivated students enjoy an unrivaled hands-on educational experience that prepares them to lead successful personal and professional lives.
The Leona M. and Harry B. Helmsley Charitable Trust aspires to improve lives by supporting effective nonprofits in health, place-based initiatives, and education and human services. Since 2008, when the Trust began its active grantmaking, it has committed more than $1 billion to a wide range of charitable organizations. The Trust’s Biomedical Research Infrastructure Program seeks to strengthen the research tools, training and collaborative platforms for the health sciences and enhance the quality and reproducibility of biomedical data and findings. For more information, visit www.helmsleytrust.org.
The Alfred P. Sloan Foundation is a philanthropic, not-for-profit grantmaking institution that supports original research and education in science, technology, engineering, mathematics, and economics. Funds for this project were provided through the Foundation's Digital Information Technology program, which leverages developments in information technology to increase the effectiveness of computational research and scholarly communication. For more information, please visit www.sloan.org.
The Gordon and Betty Moore Foundation believes in ideas that create enduring impact in the areas of science, environmental conservation and patient care. Intel co-founder Gordon Moore and his wife Betty established the foundation to create positive change around the world and at home in the San Francisco Bay Area. The Science Program accounts for approximately 40 percent of the foundation’s annual grantmaking, making the foundation one of the largest private funders of science nationwide. Visit moore.org or follow @MooreFound.
About Computing Sciences at Berkeley Lab
The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy's research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.
ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 7,000-plus scientists at national laboratories and universities, including those at Berkeley Lab's Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are Department of Energy Office of Science User Facilities.
Lawrence Berkeley National Laboratory addresses the world's most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab has had its scientific expertise recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.