Cybersecurity, Differential Privacy

Data useful to science is not shared as much as it should or could be, particularly when that data contains sensitivities of some kind. There are many reasons why data may not be shared, including laws and regulations related to personal privacy or national security, or because data is considered a proprietary trade secret. Two drivers for this reluctance to share, which are duals of each other, are concerns of data owners about the risks of sharing sensitive data and concerns of providers of computing systems about the risks of hosting such data. As barriers to data sharing are imposed, data-driven results are hindered because data is not made available and used in ways that maximize its value.

And yet, as emphasized widely in scientific communities, finding ways to make sensitive data available is vital for advancing scientific discovery and public policy. When data is not shared, certain research may be prevented entirely, be significantly more costly, take much longer, or might simply not be as accurate because it is based on smaller, potentially more biased datasets.

Berkeley Lab uses cybersecurity and privacy-preserving methods to enable usable and useful data science by altering the trust relationships required for secure scientific data management.

Projects

Privacy-Preserving Data Analysis for Scientific Discovery

This project aims to develop a method of leveraging a variety of hardware and software approaches, in concert with privacy-preserving technologies such as differential privacy, for the scientific analysis of sensitive data. The goal is to provide significantly greater confidence to the owner of a set of sensitive data that the data will not be exposed, and also reduce the liability exposure of the data center to assertions of security negligence or insider attacks by providing an environment in which even they cannot access the raw data - all without significant negative impacts to usability or performance. Contact: Sean Peisert (Peisert on the Web)
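As a rough illustration of the kind of privacy-preserving technology mentioned above, the sketch below applies the standard Laplace mechanism to a counting query. It is not the project's method: the function names (`laplace_noise`, `private_count`), the sample data, and the choice of `epsilon` are all assumptions made for the example, and a production system would need a secure noise source and careful handling of floating-point issues.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential draws with rate
    1/scale follows a Laplace(0, scale) distribution.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon gives epsilon-differential privacy for this one query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages from a sensitive dataset.
    rng = random.Random(0)  # seeded only to make the demo repeatable
    ages = [34, 71, 52, 68, 45, 80]
    noisy = private_count(ages, lambda age: age >= 65, epsilon=0.5, rng=rng)
    print(f"noisy count of records with age >= 65: {noisy:.2f}")
```

Smaller values of `epsilon` add more noise and give stronger privacy; each additional query against the same data consumes more of the overall privacy budget.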

Provable Anonymization of Grid Data for Cyberattack Detection

Data is frequently not shared by organizations because that data is considered by the organization to be in some way sensitive. This project aims to develop techniques for enabling data analysis for the purposes of detecting and/or investigating cyberattacks against energy delivery systems while also preserving aspects of key confidentiality elements within the underlying raw data being analyzed. The result will be a solution for anonymization of data collected from OT and IT networks pertaining to energy grid cyberattack detection that has been tested for its ability to retain privacy properties and still enable attack detection. Contact: Sean Peisert (Peisert on the Web)

Securing Automated, Adaptive Learning-Driven Cyber-Physical System Processes

Numerous DOE-relevant processes are becoming automated and adaptive, using machine learning techniques. Such processes include vehicle and traffic navigation guidance, intelligent transportation systems, adaptive control of grid-attached equipment, and large scientific instruments. This creates a vulnerability for a cyber attacker to sabotage processes through tainted training data or specially crafted inputs. Consequences might be tainted manufactured output, traffic collisions, power outages, or damage to scientific instruments or experiments. This project is developing secure machine learning methods that will enable safer operation of automated, adaptive, learning-driven “cyber-physical system” processes. Contact: Sean Peisert (Peisert on the Web)

Trusted CI, the National Science Foundation Cybersecurity Center of Excellence

The mission of Trusted CI, the National Science Foundation (NSF) Cybersecurity Center of Excellence, is to improve the cybersecurity of NSF computational science and engineering projects, while allowing those projects to focus on their science endeavors. Trusted CI draws on expertise from multiple internationally recognized institutions, including Indiana University, the University of Illinois, the University of Wisconsin-Madison, the Pittsburgh Supercomputing Center, and Berkeley Lab. Drawing on this expertise, Trusted CI collaborates with NSF-funded research organizations to address the unique cybersecurity challenges they face. Contact: Sean Peisert (Peisert on the Web)

Supervisory Parameter Adjustment for Distribution Energy Storage (SPADES)

The Supervisory Parameter Adjustment for Distribution Energy Storage (SPADES) project will develop the methodology and tools allowing Energy Storage Systems (ESS) to automatically reconfigure themselves to counteract cyberattacks against both the ESS control system directly and indirectly through the electric distribution grid. The reinforcement learning defensive algorithms will be integrated into the National Rural Electric Cooperative Association Open Modeling Framework, thereby allowing defensive strategies to be tailored on a utility specific basis. The major outcomes of this project will be the tools to isolate the component of the ESS control system that has been compromised during a cyberattack as well as policies for changing the control parameters of ESS to mitigate a wide variety of cyberattacks on both the ESS device itself and the electric distribution grid. Contact: Daniel Arnold (Arnold on the Web)

Securing Solar for the Grid (S2G)

Berkeley Lab is leading two working groups relating to cybersecurity issues in inverter-based resources (IBR) and distributed energy resources (DER). The first working group is examining cybersecurity issues in AI-based automation for IBR/DER. Automation has brought significant advantages in the power grid for ensuring stability, increasing efficiency, and even providing cybersecurity benefits. At the same time, automation significantly increases cybersecurity risks because automated systems can be attacked remotely and have vulnerabilities similar to those of other types of computing systems. The second working group is examining data confidentiality issues for IBR/DER. Many data privacy and confidentiality issues arise when data is shared, but data sharing is essential to planning, research, and efficient operation. Understanding the intersection of confidentiality concerns and the role of privacy-preserving methods might make it possible to achieve both confidentiality and data sharing. Contact: Sean Peisert (Peisert on the Web)

Cybersecurity via Inverter-Grid Automatic Reconfiguration (CIGAR)

The Cybersecurity via Inverter-Grid Automatic Reconfiguration (CIGAR) project developed supervisory control algorithms to counteract cyber-physical attacks that have compromised multiple independent systems in the electric grid. The project utilized reinforcement learning techniques to simultaneously develop defense strategies in higher dimensions tailored to specific sections of the electric grid. Analysis of the derived attack and defensive strategies highlights specific system vulnerabilities and identifies recommended upgrades to enhance system cybersecurity. Contact: Sean Peisert (Peisert on the Web)

Medical Science DMZ

A Science DMZ is a portion of the network, built at or near the local network perimeter of an individual research institution, that is designed such that the equipment, configuration, and security policies are optimized for high-performance workflows and large datasets. The traditional Science DMZ model is not currently employed in environments subject to the HIPAA Security Rule and HITECH requirements, due to the presumed technical controls based on de facto use of stateful and deep packet–inspecting commercial firewalls. The Medical Science DMZ is reengineered for “restricted data” as an approach that allows data flows at scale while simultaneously addressing the HIPAA Security Rule and related regulations governing sensitive data and appropriately managing risk. Contact: Sean Peisert (Peisert on the Web)

Cyber Security of Power Distribution Systems by Detecting Differences Between Real-time Micro-Synchrophasor Measurements and Cyber-Reported SCADA

The power distribution grid, like many cyber-physical systems, was developed with careful consideration for safe operation, but a number of features of the power system make it particularly vulnerable to cyber attacks via IP networks. The goal of this project was to design and implement a measurement network that can detect and report the resultant impact of cyber security attacks on the distribution system network. The result is a system that provides an independent, integrated picture of the distribution grid’s physical state, which is difficult for a cyber-attacker to subvert using data-spoofing techniques. Contact: Sean Peisert (Peisert on the Web)
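The description above does not spell out how the measured and reported views are compared, so the following is only a toy sketch of the general idea: pair each SCADA-reported value with the nearest-in-time independent sensor measurement and flag pairs that disagree by more than a tolerance. The `Reading` type, the per-unit voltage field, and both thresholds are hypothetical choices for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    timestamp_s: float  # seconds since some shared reference time
    voltage_pu: float   # voltage magnitude in per-unit


def find_discrepancies(measured, reported, tolerance_pu=0.02, max_skew_s=1.0):
    """Return (measured, reported) pairs whose voltages disagree.

    Each reported reading is matched to the measured reading closest
    in time. Pairs further apart than `max_skew_s` are skipped, since
    they cannot be compared meaningfully. Both thresholds are
    illustrative assumptions, not values from the project.
    """
    flagged = []
    if not measured:
        return flagged
    for rep in reported:
        nearest = min(measured, key=lambda m: abs(m.timestamp_s - rep.timestamp_s))
        if abs(nearest.timestamp_s - rep.timestamp_s) > max_skew_s:
            continue
        if abs(nearest.voltage_pu - rep.voltage_pu) > tolerance_pu:
            flagged.append((nearest, rep))
    return flagged


if __name__ == "__main__":
    sensor = [Reading(0.0, 1.00), Reading(1.0, 0.90), Reading(2.0, 1.00)]
    scada = [Reading(0.1, 1.00), Reading(1.1, 1.00)]
    for physical, claimed in find_discrepancies(sensor, scada):
        print("mismatch:", physical, "vs", claimed)
```

A mismatch here only signals that the two data sources disagree; deciding whether that reflects spoofed data, a sensor fault, or ordinary noise would need further analysis.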

A Mathematical and Data-Driven Approach to Intrusion Detection for High-Performance Computing

The overall goals of this project were to develop mathematical and statistical methods to detect intrusions of high-performance computing systems. Our mathematical analysis was predicated on special characteristics of HPC systems that can be exploited to detect misuse or fraud. In this research work, we employed real system data, which we obtained in collaboration with staff in the NERSC Division of LBNL. Contact: Sean Peisert (Peisert on the Web)

Outbound Connection Detection Project

The Outbound Connection Detection (OCD) project focuses on anomaly detection for hosts that establish new connections that are not part of their historical fingerprint. This is a collaborative project with both High Touch (HT) and the ESnet security group. The framework will be designed to work with a variety of data streams that provide the necessary connection information from ESnet hosts. HT provides full-resolution data (unsampled and with nanosecond precision) for traffic flows. For the security team, we can use Zeek connection data to build a chronological fingerprint of each host ESnet is interested in protecting. Previously unseen events can be quickly filtered through the built models to determine how 'risky' each event is. The OCD fingerprints are an analytical manifestation of the underlying event streams produced by the host. These fingerprints can be used to cluster hosts and discover patterns of similar behavior as well as classify new hosts' traffic. If at some point the project has enough verified instances, we can start using supervised ML techniques to help create models and compare them to our domain-expert-driven rules. Contact: Mike Dopheide
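The fingerprint idea can be sketched in a few lines, with the caveat that this is a deliberately simplified stand-in and not the OCD models themselves. The class name `OutboundFingerprint`, the three-level risk score, and the example addresses (taken from documentation-only IP ranges) are all assumptions made for illustration.

```python
from collections import defaultdict


class OutboundFingerprint:
    """Per-host record of outbound (dest_ip, dest_port, proto) tuples."""

    def __init__(self):
        self._seen = defaultdict(set)

    def learn(self, host, dest_ip, dest_port, proto="tcp"):
        """Add one historical connection to the host's fingerprint."""
        self._seen[host].add((dest_ip, dest_port, proto))

    def risk(self, host, dest_ip, dest_port, proto="tcp"):
        """Score a new connection against the host's history.

        Hypothetical scoring: 0.0 if the exact connection was seen
        before, 0.5 if only the destination or only the service
        (port and protocol) is familiar, 1.0 if nothing matches.
        """
        history = self._seen.get(host, set())
        if (dest_ip, dest_port, proto) in history:
            return 0.0
        known_dest = any(ip == dest_ip for ip, _, _ in history)
        known_service = any(
            (port, p) == (dest_port, proto) for _, port, p in history
        )
        return 0.5 if (known_dest or known_service) else 1.0


if __name__ == "__main__":
    fp = OutboundFingerprint()
    fp.learn("host-a", "192.0.2.10", 443)
    print(fp.risk("host-a", "192.0.2.10", 443))   # seen before
    print(fp.risk("host-a", "198.51.100.7", 22))  # new destination and service
```

In practice the fields would come from connection records such as those Zeek produces, and the score would more plausibly come from a learned model than from fixed thresholds like these.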

Networking Operational Security

The ESnet security group has its operational heartbeat centered on data science. The group builds alerts, reports, and dashboards on multivariate data collected from a legion of sources. By focusing on observability, where you learn what's important rather than prescribe it, the security team can discover anomalous patterns as well as investigate specific issues. As services migrate to automation, containers, and the cloud, the ability to put data science in automation is becoming an equally important task for security. Contact: Adam Slagell

News

Sean Peisert Tapped to Take on Deputy Director Role at Trusted CI

June 28, 2022

Sean Peisert has been tapped to serve as deputy director on the leadership team of Trusted CI, the NSF Cybersecurity Center of Excellence.

CIGAR 'Smokes Out' Attacks on Solar Electrical Power Equipment

June 7, 2021

While the need for security in the power grid is clear, cybersecurity has typically been “bolted on” in a piecemeal fashion after the fact, rather than designed in from the outset. Enter the Cybersecurity via Inverter-Grid Automatic Reconfiguration (CIGAR) project, a recently completed Berkeley Lab effort aimed at providing security protections for emerging power systems.

Summer Students Tackle COVID-19

October 21, 2020

As a part of the Computational Research Division’s summer student program at Lawrence Berkeley National Laboratory, four graduate students from the University of California, Davis (UC Davis) researched a method to allow doctors and researchers to use valuable health information in the battle against COVID-19 while also preserving patient privacy in electronic records.

Berkeley Lab Cybersecurity Specialist Highlights Data Sharing Benefits, Challenges at NAS Meeting

December 4, 2018

Sean Peisert, chief cybersecurity researcher at Lawrence Berkeley National Lab, recently gave an invited talk on the challenges of data sharing in biomedical science at a meeting of the Committee on Science, Engineering, Medicine, and Public Policy, a joint unit of the National Academy of Sciences, National Academy of Engineering, and the National Academy of Medicine.

Berkeley Lab Researchers Contribute to Making Blockchains Even More Robust

January 30, 2018

In the last few years, researchers at Berkeley Lab, UC Davis, and the University of Stavanger in Norway have developed a new protocol, called BChain, which makes private blockchains even more robust. The researchers are also working with colleagues at Berkeley Lab and beyond to adapt this tool to support applications that are of strategic importance to the Department of Energy’s Office of Science.

Combination of Old and New Yields Novel Power Grid Cybersecurity Tool

March 7, 2018

An R&D project led by Berkeley Lab researchers that combines cybersecurity, machine learning algorithms and commercially available power system sensor technology to better protect the electric power grid has sparked interest from U.S. utilities, power companies and government officials.