A Holistic Approach to Distribution Grid Intrusion Detection Systems
- Posted on July 18, 2018
As modern power grids move towards greater levels of automation and communication, their vulnerability to cyberattack increasingly demands attention. Today’s power system has become the bedrock of modern society, and an attack on this infrastructure could prove disastrous. In this article we discuss an approach that marries the underlying physical properties of power systems with the network communications they use, in order to offer insights unattainable from either data stream in isolation. While the concept of intrusion detection systems (IDS) is well understood for monitoring network traffic and traditional computing systems, the approach discussed herein is motivated by the idea that although high-frequency physical grid measurements and supervisory control and data acquisition (SCADA) communication over Internet Protocol (IP) networks are fundamentally disparate sources of information, when examined through appropriate lenses they offer a much more nuanced depiction of the grid. This concept follows our earlier work on leveraging insights regarding the physical behavior and limitations of network-connected control systems to detect cyberattacks.
Lawrence Berkeley National Laboratory (LBNL), in collaboration with Arizona State University (ASU), the Electric Power Research Institute (EPRI), EnerNex, Power Standards Lab (PSL), and its utility partners, has been investigating such an approach, whereby physical grid measurements are analyzed in conjunction with captured SCADA traffic in order to determine whether an adversary may have infiltrated a utility’s SCADA system. Within the scope of this work, we have focused on using distribution phasor measurement unit (PMU) data with a high output sample rate alongside SCADA traffic to examine the state of the distribution grid. The PMUs and the computing infrastructure that analyzes this data would ideally be isolated from the potentially infiltrated SCADA network local to distribution systems, and would offer system operators and operations engineers a redundant and independent view of the network.
The proposed approach can be broadly classified into two parallel detection approaches:
- The first approach infers the behavior of discrete control devices on the network, e.g., tap-changing transformers and switched capacitor banks, whose control logic could be adjusted by an adversary without any anomalies appearing in grid measurements. We refer to these as ‘reconnaissance attacks’, whereby an adversary seeks to verify their controllability of the network, without being detected, prior to a full-scale attack. Development is underway to generalize these algorithms to arbitrary networks and control schemes and to alert an operator should these devices deviate from their inferred control logic.
- The second approach is focused on the rapid detection of anomalies and their corroboration with SCADA. A set of hierarchical anomaly detection rules and unsupervised methods is employed to rapidly detect anomalies and prioritize communication with a centralized control layer for further analysis. The rules are inspired by the physical laws and operating limits of the grid that define what “normal” behavior should look like.
The system architecture proposed would utilize a redundant, sparse, high-frequency sensor network, in this case using distribution PMUs, with analytics carried out at both local and centralized locations, as depicted in Figure 1. This hierarchical data analysis architecture offers numerous benefits:
- An inexpensive processing platform, in this case a BeagleBone Black (BBB), is co-located with each sensor and acts as middleware between the sensor and the network. This allows the layers of the scheme to be agnostic to vendor-specific data formats and allows for accelerated implementation of sophisticated data encryption protocols.
- Assigning the BBB as the point of contact with the network allows for a unified system whose firmware can be updated frequently to mitigate the latest security risks, as opposed to waiting for each sensor’s proprietary, vendor-specific firmware updates.
- Local processing power allows localized analytics to be performed and the associated packets to be transferred with varying priority status and compression quality, dependent on the results of the analysis. This increases the efficiency of dataflow while minimizing communication delays for critical information.
Figure 1: Hierarchical Data Analysis Architecture
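To make the idea of priority-dependent transfer concrete, the following is a minimal Python sketch of how a local node might order its outgoing packets so that data flagged by the local analytics is sent first. It is an illustration under stated assumptions, not the implementation used in this work: the `UplinkQueue` and `Packet` names and the two priority levels are hypothetical, and compression quality is left out for brevity.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical priority levels (assumption): a lower number is sent first.
PRIORITY_ANOMALY = 0
PRIORITY_ROUTINE = 1


@dataclass(order=True)
class Packet:
    priority: int
    seq: int                            # arrival order, used to break ties
    payload: dict = field(compare=False)


class UplinkQueue:
    """Orders outgoing packets so flagged data leaves the local node first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, payload, anomalous):
        # The local analytics result decides the priority of the packet.
        priority = PRIORITY_ANOMALY if anomalous else PRIORITY_ROUTINE
        heapq.heappush(self._heap, Packet(priority, next(self._counter), payload))

    def pop(self):
        # Returns the highest-priority payload, oldest first within a level.
        return heapq.heappop(self._heap).payload

    def __len__(self):
        return len(self._heap)
```

In this sketch, routine measurements wait behind any packet that the local rules have flagged, while packets of equal priority keep their arrival order.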
Within this distributed, hierarchical data analytics platform we integrate heterogeneous sensor types, such as SCADA traffic, which can be monitored using the Bro IDS, alongside distribution PMU sensors, which in our case provide three-phase phasor measurements at 120 Hz. At the localized layer, the analytics can perform grid-topology-agnostic analysis and conventional IDS algorithms on SCADA traffic, while the centralized analytics layer facilitates a hybrid analysis that leverages knowledge of the system operation and topology.
Due to economic constraints, the number of distribution PMUs deployed can be expected to be very small compared to the number of nodes within distribution grids. Therefore, the sensors must be placed in an optimal manner, with respect to predefined criteria, to achieve sufficient coverage of the events happening in the grid. We formulated an optimal PMU placement (Jamei et al., 2017) based on the ‘central rules’ used for anomaly detection, so as to maximize coverage for event detection. Figure 2 shows the resulting placement of 20 distribution PMUs on the IEEE 123 test feeder.
Figure 2: Sensor Deployment on IEEE 123 Node Feeder
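For readers unfamiliar with coverage-driven placement, the sketch below shows a simple greedy heuristic in Python: at each step it adds the candidate node that covers the most events not yet covered, until the sensor budget is spent. This is a stand-in for illustration only and is not the optimal placement formulation referred to above; the `coverage` input (which events each candidate node could detect) and the function name are hypothetical.

```python
def greedy_placement(coverage, budget):
    """Greedy maximum-coverage heuristic (illustrative stand-in).

    coverage: dict mapping a candidate node to the set of event ids that a
              sensor at that node could detect (hypothetical input).
    budget:   maximum number of sensors to place.
    Returns (chosen_nodes, covered_events).
    """
    chosen, covered = [], set()
    candidates = dict(coverage)
    while candidates and len(chosen) < budget:
        # Pick the node that adds the most not-yet-covered events.
        best = max(candidates, key=lambda node: len(candidates[node] - covered))
        gain = candidates.pop(best) - covered
        if not gain:
            break  # no remaining node improves coverage
        chosen.append(best)
        covered |= gain
    return chosen, covered


# Toy example with four candidate nodes and a budget of two sensors.
example = {"n1": {1, 2, 3}, "n2": {3, 4}, "n3": {4, 5, 6, 7}, "n4": {1}}
nodes, events = greedy_placement(example, budget=2)
```

A greedy pass of this kind is not guaranteed to find the best placement, which is one reason a formal optimization may be preferred when the budget is small.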
Detecting “Reconnaissance Attacks”
The idea behind a reconnaissance attack is an intuitive engineering concept, and could be seen as a component of the “reconnaissance” step in Lockheed Martin’s “Cyber Kill Chain.” Adversaries will infiltrate the network prior to an attack and may seek to confirm their controllability of the system and understand the response times of the system’s components before instigating their planned attack on the network. This is suspected to have been the case in the 2015 attack on the Ukrainian power grid, which left approximately 225,000 customers without power. A report into this attack indicated that the attackers had gained entry into the utility IP communications network up to 6 months prior to the attack itself, which allowed them to monitor the system and enabled a more sophisticated attack, one that was both more covert and potentially more damaging.
The approach we propose is to passively monitor the ambient behavior of the network and utilize measurements of both voltage and current, as well as the relative topological information of sensors and control devices, to identify specific control devices and infer their respective control logic. Once their control logic has been inferred, device behavior can be monitored over time for deviations from this logic, which could be due to either an intentional parameter adjustment or an adversary testing their ability to manipulate control devices prior to an attack. Figure 2 represents such a potential case, where a regulator equipped with line drop compensation has recorded an abnormal control action. In such a case, the SCADA packets associated with this control device would be examined to determine whether the change in operation was reflected in their contents. If so, an alert would be sent to the operator to confirm that the change was intentional. If, however, the change in behavior was not reflected in the SCADA packets, an alarm would be sent to the operator indicating that a control device is behaving in a manner inconsistent with its control logic as reflected in its SCADA data. In this case the operator would take additional action given the possibility that an adversary has breached their network.
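The alert-versus-alarm decision described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the inferred control logic is reduced to a simple voltage band (a real regulator with line drop compensation is more involved), and the function names, the `Verdict` labels, and the band values are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "normal"
    ALERT = "alert"  # change is visible in SCADA: ask the operator to confirm it
    ALARM = "alarm"  # change is absent from SCADA: possible breach


def is_abnormal_action(tap_changed, voltage_pu, band_low, band_high):
    """Simplified inferred logic (assumption): a regulator should only act
    when the measured voltage has left its band."""
    within_band = band_low <= voltage_pu <= band_high
    return tap_changed and within_band


def classify_control_action(deviates_from_inferred_logic, reflected_in_scada):
    """Corroborate a physically observed deviation against SCADA packets."""
    if not deviates_from_inferred_logic:
        return Verdict.NORMAL
    return Verdict.ALERT if reflected_in_scada else Verdict.ALARM


# Example: a tap change is observed while the voltage sits inside the band,
# and no matching change appears in the captured SCADA packets.
abnormal = is_abnormal_action(True, voltage_pu=1.00, band_low=0.98, band_high=1.02)
verdict = classify_control_action(abnormal, reflected_in_scada=False)
```

The key design point is that the physical measurements decide whether something changed, while the SCADA traffic only decides how the change is reported to the operator.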
The approach discussed above requires that the accuracy of the independent sensor network be, at a minimum, comparable with the accuracy of the SCADA-recorded measurements. This is necessary to confidently classify a control action as inconsistent, given that the adversary will seek to cause a change in the control logic large enough that they can validate it in the SCADA measurement data. In our case, µPMUs offer much higher accuracy and sampling frequency than SCADA, which in turn makes detection more accurate as well.
This type of analysis can be applied to any control device that exhibits a discrete switching action, and whose action is thus detectable in measurement data, given a sufficiently large sensor deployment. Here, a “sufficiently large sensor deployment” is the smallest deployment that provides enough topological relational information to attribute a measured switching action to an individual device. An interesting challenge in our work has been operating under the assumption of having insufficient sensors for observability, a paradigm we refer to as the “low measurement regime”. Even in this regime, observations from well-positioned sensors can reveal signatures of anomalous activity.
Rapid Online Anomaly Detection
In addition to detecting abnormalities in ambient behavior, our approach also includes embedded, localized anomaly detection. This anomaly detection employs both absolute detection thresholds and feeder-specific, data-driven thresholds:
- Changes in the voltage magnitude measured on a distribution feeder that would be considered anomalous are generally feeder agnostic, e.g., an instantaneous deviation of 0.1 p.u. would trigger a voltage sag or swell exception regardless of the specific feeder properties. Therefore these absolute rules, with values specified by the utility in question, are implemented locally with a simple threshold triggering algorithm.
- Other quantities of interest, however, do not lend themselves to such universal thresholds. What could be considered abnormal behavior in the active/reactive power or current on one feeder may be normal behavior on another, depending on the customer mix and specific end-use loads. Therefore, we employ semi-supervised behavior-based detection, which adaptively learns the normal behavior of the active/reactive power and current on a specific feeder and flags anomalies if it records behavior outside of this normal regime.
- We also propose a quantity to track at each local node that can indicate when the system is no longer in the steady-state regime. Tracking anomalous behavior in this quantity also falls under semi-supervised behavior-based detection.
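The absolute voltage rule in the first item above can be illustrated with a short Python sketch. It assumes that the deviation is measured against a nominal value of 1.0 p.u. and uses the 0.1 p.u. figure from the text as a default; both the reference point and the function name are assumptions made for illustration, and in practice the values would be specified by the utility.

```python
def voltage_exception(v_pu, nominal_pu=1.0, threshold_pu=0.1):
    """Classify a voltage magnitude sample (in per unit) against a fixed band.

    Assumption: deviation is taken relative to `nominal_pu`. Returns "sag",
    "swell", or None when the sample is within the band.
    """
    deviation = v_pu - nominal_pu
    if abs(deviation) < threshold_pu:
        return None
    return "sag" if deviation < 0 else "swell"


# Example stream of voltage magnitudes from one phase.
samples = [1.00, 0.99, 0.85, 1.01, 1.12]
flags = [voltage_exception(v) for v in samples]
```

Because the rule has no memory and no feeder-specific parameters, it is cheap enough to run on every sample at the local node.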
Figure 3 shows the result of running the localized rules on streaming data from one of our field installations following the tripping of a significant load. The behavior of the current at this location would be abnormal relative to other locations; however, the algorithm is able to learn the feeder-specific normal behavior and flag abnormalities.
Following the localized analytics, measured data is aggregated, and algorithms that incorporate network topological information and data from multiple sensors are executed. These algorithms are termed ‘central rules’. There may, however, exist one or more further aggregation points above a particular node in the architecture, where analysis may be carried out across multiple distribution networks connected through a common subtransmission network. The central rules incorporate topological information into the analysis, as well as any relevant SCADA data for corroboration.
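As an illustration of what learning feeder-specific normal behavior can look like, the Python sketch below keeps a running mean and variance of a measured quantity (using Welford's online update) and flags samples that fall far outside the learned range. This is a generic stand-in and not the detection algorithm used in this work; the class name, the `k` multiplier, and the warm-up length are hypothetical choices.

```python
import math


class AdaptiveBaseline:
    """Learns the typical level and spread of one quantity on one feeder."""

    def __init__(self, k=4.0, warmup=30):
        self.k = k            # how many standard deviations count as abnormal
        self.warmup = warmup  # samples to learn from before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0         # running sum of squared deviations (Welford)

    def _update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def observe(self, x):
        """Return True if x is anomalous relative to the learned baseline."""
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if abs(x - self.mean) > self.k * std:
                return True   # flagged samples are not learned from
        self._update(x)
        return False


# Two feeders with different normal behavior (synthetic current values, in A).
quiet, busy = AdaptiveBaseline(), AdaptiveBaseline()
for i in range(50):
    quiet.observe(100.0 + (1.0 if i % 2 else -1.0))
    busy.observe(300.0 + (20.0 if i % 2 else -20.0))
```

With these synthetic values, a 30 A excursion is flagged on the quiet feeder but treated as normal on the busy one, which is the behavior a single universal threshold cannot provide.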
With the expected increasing adoption of automation and DER in medium- and low-voltage distribution grids, the issue of cybersecurity at this layer of the grid will continue to demand attention. As discussed within this article, we believe that treating IT and power grid operations as independent, isolated concepts results in some key insights being overlooked. Power systems are becoming ever more complex, and the interdependency between their physical behavior and the overlaid communication continues to grow. Understanding and protecting these complex systems requires a holistic approach, marrying both IT and operations.
The work discussed here highlights the additional benefit gained from a small number of high-frequency sensors, ideally on an independent communication network, as well as some of the algorithms and applications both for detecting an adversary prior to an attack and for rapidly detecting an attack should SCADA data be masked or inaccessible. As the communication layer becomes further intertwined with the physical grid, so too must data regarding physical operating conditions become woven into cybersecurity intrusion detection systems.
Acknowledgement: This research was supported in part by the Director, Office of Electricity Delivery and Energy Reliability, Cybersecurity for Energy Delivery Systems program, of the U.S. Department of Energy, under contract DE-AC02-05CH11231. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors of this work.
C. McParland, S. Peisert and A. Scaglione, "Monitoring Security of Networked Control Systems: It's the Physics," IEEE Security & Privacy, vol. Nov/Dec, pp. 32-39, 2014.
G. Koutsandria, R. Gentz, M. Jamei, A. Scaglione, S. Peisert and C. McParland, "A real-time testbed environment for cyber-physical security on the power grid," in Proceedings of the First ACM Workshop on Cyber-Physical Systems-Security and/or Privacy, 2015.
M. Jamei, A. Scaglione, C. Roberts, E. Stewart, S. Peisert, C. McParland and A. McEachern, "Anomaly Detection Using Optimally-Placed μPMU Sensors in Distribution Grids," IEEE Transactions on Power Systems, 2017.
Lockheed Martin, "Cyber Kill Chain," [Online]. Available: https://www.lockheedmartin.com/en-us/capabilities/cyber/cyber-kill-chain.html. [Accessed 20 June 2018].
E-ISAC and SANS, "Analysis of the cyber attack on the Ukrainian power grid: Defence Use Cases," 2016.