
Project Jupyter gets $6M to expand collaborative data science software

July 7, 2015


Fernando Perez and Brian Granger discuss the architecture of Project Jupyter, as its scope expands to reach data science applications in over 40 programming languages. Photo credit: Adriana Restrepo

PALO ALTO, Calif. July 7, 2015 — Three foundations pledged $6M over the next three years to Project Jupyter, an open-source software project that supports scientific computing and data science across a wide range of programming languages via a large, public, open and inclusive community.

Fernando Perez of the University of California, Berkeley, and the Computational Research Division at Lawrence Berkeley National Laboratory (Berkeley Lab), and Brian Granger of California Polytechnic State University, San Luis Obispo, will lead the project at their institutions. Perez and Granger’s efforts with Project Jupyter grew out of their work developing IPython, a popular user interface for interactive computing across multiple programming languages.

With this award from the Leona M. and Harry B. Helmsley Charitable Trust, Alfred P. Sloan Foundation, and Gordon and Betty Moore Foundation, these researchers will expand and improve the capabilities of the Jupyter Notebook, a web-based platform that allows scientists, researchers and educators to combine live code, equations, narrative text and rich media into a single, interactive document.

Granger and Perez estimate that well over one million people in fields from biostatistics to astronomy to finance use Jupyter. Currently, the platform is used to analyze massive gene sequencing datasets, process images from the Hubble Space Telescope and develop models of financial markets.

“Project Jupyter serves not only the academic and scientific communities, but also a much broader constituency of data scientists in research, education, industry and journalism,” said Perez. “Given the importance of computing across modern society, we see uses of our tools that range from high school education in programming to the nation’s supercomputing facilities and the leaders of the tech industry.”

Teachers, for example, can prepare a lecture using Jupyter Notebook and then turn it into a web-based slide show presentation in which they can write code and see its results in real time. Students can then use it for homework and reports.

The capabilities of the Jupyter Notebook will expand to give users easier access to collaborative computing and to let them reuse their content in a wide range of settings, such as standalone web applications and dashboards.

“We are excited by the potential of Project Jupyter to reach even wider audiences and to contribute to increased cross-disciplinary collaboration in the sciences,” said Betsy Fader, director of the Helmsley Charitable Trust’s Biomedical Research Infrastructure Program.

“Jupyter Notebook is a tool that embodies the current shift in science towards more reproducible research, which in turn enables more effective science,” said Chris Mentzel, program director at the Gordon and Betty Moore Foundation. “It will enable data exploration, visualization, and analysis in a way that encourages sound science and speeds progress.”

For more information on Project Jupyter, see the project’s website: http://jupyter.org.

RELATED INFORMATION 

###

Project Jupyter has long been supported by an active community of researchers who both contribute to and use the software. Jupyter evolved from IPython, an interactive environment for the popular programming language Python. For its first decade, IPython was part of an ecosystem of open-source projects for scientific computing with Python, known as the “SciPy Stack.” The effort then expanded beyond Python, and the interactive shell evolved into a generic architecture for interactive computing and computational narratives. Project Jupyter has produced a significant data science toolset for both industrial and academic research, and is poised to become an integral part of reproducible data science.

The University of California was chartered in 1868, and what would become its flagship campus was soon established at Berkeley. It was born out of a vision in the State Constitution of a university that would "contribute even more than California's gold to the glory and happiness of advancing generations." UC Berkeley claims 22 Nobel Laureates, seven of whom are current faculty members. The campus enrolls more than 36,000 undergraduate and graduate students, and has more than 1,500 full-time and 500 part-time faculty members in more than 130 academic departments and more than 110 interdisciplinary research units and field stations. Today, UC Berkeley is considered one of the nation’s most prestigious universities – public or private – and is internationally recognized for its distinguished record of world-class scholarship, innovation and concern for the betterment of our world.

Cal Poly is a nationally ranked, four-year, comprehensive polytechnic public university located in San Luis Obispo, Calif. It was founded in 1901 and has been part of the renowned California State University system since 1960. Known for its Learn by Doing approach, small class sizes and open access to expert faculty, Cal Poly is a distinctive learning community whose 20,000 academically motivated students enjoy an unrivaled hands-on educational experience that prepares them to lead successful personal and professional lives. 

The Leona M. and Harry B. Helmsley Charitable Trust aspires to improve lives by supporting effective nonprofits in health, place-based initiatives, and education and human services. Since 2008, when the Trust began its active grantmaking, it has committed more than $1 billion to a wide range of charitable organizations. The Trust’s Biomedical Research Infrastructure Program seeks to strengthen the research tools, training and collaborative platforms for the health sciences and enhance the quality and reproducibility of biomedical data and findings. For more information, visit www.helmsleytrust.org.

The Alfred P. Sloan Foundation is a philanthropic, not-for-profit grantmaking institution that supports original research and education in science, technology, engineering, mathematics, and economics. Funds for this project were provided through the Foundation's Digital Information Technology program, which leverages developments in information technology to increase the effectiveness of computational research and scholarly communication. For more information, please visit www.sloan.org.

The Gordon and Betty Moore Foundation believes in ideas that create enduring impact in the areas of science, environmental conservation and patient care. Intel co-founder Gordon Moore and his wife Betty established the foundation to create positive change around the world and at home in the San Francisco Bay Area. The Science Program accounts for approximately 40 percent of the foundation’s annual grantmaking, making the foundation one of the largest private funders of science nationwide. Visit moore.org or follow @MooreFound.


About Computing Sciences at Berkeley Lab

High performance computing plays a critical role in scientific discovery. Researchers increasingly rely on advances in computer science, mathematics, computational science, data science, and large-scale computing and networking to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab’s Computing Sciences Area researches, develops, and deploys new foundations, tools, and technologies to meet these needs and to advance research across a broad range of scientific disciplines.