What I am proposing for cloud “regulation”

Note from Tom

My most recent post discussed why I think the Risk Management for Third-Party Cloud Services Standards Drafting Team’s (SDT) effort to fix the “cloud problem” in CIP is on the wrong track; this is because the SDT is unlikely ever to finish the multiple tasks they’ve undertaken. In that post, I suggested several different courses of action for different parties – and described some courses of action that I think are dead ends – but I didn’t organize them in a coherent fashion.

A few days after I put up that post, a friend asked me what exactly I was proposing – and I realized I had never made that clear (probably because I was not clear about it myself). I started a new post to do that, which is this one. The previous post was my longest ever (both in number of pages and hours invested), but this post has far exceeded that one. I could break this post into multiple posts, but it is important to have all my ideas on how to address the cloud available in one place. As I discuss these ideas and amend them, I will try to include all those amendments in this post, even if I write about them in separate posts.

I want to point out that, while this post frequently refers to the previous post, all of the points made in the previous post are made in this one as well, although in some cases the reasoning behind those points is only found in the previous post. This means you don’t have to have read the previous post to understand this one. In fact, I recommend you first read this post and then, if you want to learn more about why I said particular things, you can read the previous one.

I am now planning two follow-up posts to this one in the next 2-3 weeks. One will be a “Cliff Notes™” version of this post, which will be confined – I swear it! – to two pages. The other post will lay out my ideas for the separate industry group I am now proposing. That group, which will not be part of NERC, will develop – with the participation of platform cloud service providers (CSPs), software as a service (SaaS) providers and managed security service providers (MSSPs) – a set of voluntary guidelines that address what I call “cloud native” cybersecurity risks. These are risks that are not addressed in standards like ISO 27001 and FedRAMP, against which the CSPs are regularly audited. This post describes one example of such a risk.

To begin, what is the basic problem we are trying to solve by amending or adding to the CIP standards? It is that, except for some use of BCSI in the cloud today, NERC entities with high and medium impact BES environments are hardly using the cloud at all for OT purposes, even if they want to use it. Specifically, these entities are not 1) deploying BES Cyber Systems (BCS), Electronic Access Control or Monitoring Systems (EACMS) or Physical Access Control Systems (PACS) in the cloud or 2) utilizing BCS, EACMS or PACS that are already deployed there, either as SaaS (software as a service) or as a managed security service provided by a managed security service provider (MSSP).

Two problems to solve

Two problems prevent NERC entities from using the cloud in these ways.

The first problem is that NERC entities are worried about falling out of CIP compliance if they deploy or utilize BCS, EACMS or PACS in the cloud. This is not because that is forbidden by the CIP standards; in fact, the current standards nowhere mention the cloud. Rather, it is because NERC entities know that, if they deploy or utilize medium or high impact BCS, EACMS or PACS in the cloud under the current CIP standards, the platform CSP, SaaS provider, or MSSP will never be able to provide them with the evidence they need to prove compliance with the applicable CIP requirements.

This applies especially to “device-based” requirements like CIP-007 Requirement R2 (patch management) and CIP-010 Requirement R1 (configuration management). These requirements are applicable on the level of individual devices and need to be audited at that level. Most importantly, a CIP auditor needs to examine evidence from the individual BES Cyber Assets that make up the BCS, EACMS or PACS. There is no way this can be done for cloud-based systems, since they are usually “housed” in multiple devices (physical and virtual) in multiple data centers; moreover, they constantly move from one device/data center to another.

Surprisingly (including to me), even though the first problem is a big one, fixing it just requires making about seven relatively minor changes and additions to the current CIP requirements and definitions. It does not require rewriting the CIP requirements from scratch or changes to the NERC Rules of Procedure. In the previous post, I described the seven changes that are needed to solve this problem. If the current “cloud” SDT (or a new SDT) starts working on it soon, it is likely that all the changes can be drafted, approved by NERC and FERC, and implemented by October 1, 2028 (see below for a discussion of why that date is important for the cloud project).

The second problem, which is much harder to address, is that NERC entities face cybersecurity risks when they use the cloud, and the current CIP standards, which were mostly written before the cloud was taken seriously by the NERC CIP community, do not address those risks for cloud-based systems. These risks fall into two categories:

The first category is what I call “hybrid risks,” because they apply both to on-premises and cloud-based systems.[i] They include risks like unauthorized changes to system configurations (addressed through configuration management) and unpatched software vulnerabilities (addressed through vulnerability management). Hybrid risks are addressed in standards like ISO 27001 and FedRAMP, as well as in SOC 2 Type 2 audits. They are also addressed in the NERC CIP standards, although those standards currently apply only to on-premises systems.

Most of the major CSPs have ISO 27001 certification and FedRAMP authorization[ii], and have passed SOC 2 Type 2 audits. If a CSP has one or more of these and is willing to provide customers with audit reports that do not contain any unremediated adverse findings, I consider that to be sufficient evidence that they have addressed the important hybrid risks. Therefore, I see no need to “extend” the current CIP requirements to apply to cloud-based systems, since the platform CSP or SaaS provider will never be willing to provide the evidence that the NERC entity will need to prove compliance with a CIP requirement, even if they are able to do so.

The second category of risks that the NERC entity will face in the cloud is what I call “cloud native” risks; I discussed them at length in my previous post (where I used the term “cloud-only”). Two examples are the risk posed by multi-tenancy and the risk of a widespread outage like the one experienced by a major CSP recently.

The critical point about cloud native risks is that, even though they are often due to the actions or inactions of the CSP or SaaS provider, the impact of the risks being realized falls primarily on their customers. Thus, the second problem arises because many NERC entities are concerned about the impact of cloud native risks on their BCS, EACMS and PACS, if they deploy or use them in the cloud. Those entities need somehow to be reassured that the CSPs are aware of these risks and have controls in place to mitigate them.

Since hybrid risks are already mitigated by any CSP or SaaS provider that can provide the appropriate audit reports (as described above), this means the second problem boils down only to cloud native risks. Are NERC CIP requirements the appropriate tool to mitigate those risks? While I had always answered this question in the affirmative until a couple of weeks ago, I have now changed my opinion. I now realize that the NERC standards development process – always long and cumbersome at best – is completely unsuited to the task of mitigating cloud native cybersecurity risks.

In fact, as I described in the previous post, just the process of a) developing NERC CIP requirements for each of the 17 cloud native risks identified by the SDT and me, b) going through the lengthy balloting process for NERC entities, c) waiting for FERC to study and approve the new or revised standard(s), and d) waiting out the multiyear implementation process will take…hold on to your seat…2-3 decades. And that assumes the team will ever finish their work, which they will not, since new cloud native risks are being identified all the time.

This demonstrates that the NERC standards development process is incapable of addressing either hybrid or cloud native risks. If NERC entities, as well as the public, are ever going to be assured that it’s safe to deploy systems that control or monitor the North American Bulk Electric System in the cloud, we need to find a Plan B for doing that.

What are the options?

At the bottom of this post is a table describing eight tasks to address cloud risks that have been proposed by NERC, the SDT or me. Immediately below is a discussion of each of the tasks.

1. Enable BCSI storage and use in the cloud.

In 2017 and 2018, the NERC community began to realize that the cloud was no longer an experimental technology with questionable security, but rather a tool that could yield great benefits to the Bulk Electric System if proper precautions were taken. However, the community also realized that the current CIP requirements were drafted before the community thought there was a chance that the cloud would ever be used for critical infrastructure systems.

Because of this, there are many CIP requirements today whose wording inadvertently makes it impossible to implement or utilize medium and high impact systems subject to CIP compliance in the cloud. This is even though nothing in the current CIP standards prohibits any OT system from being implemented or used in the cloud.

In 2018, NERC realized that there were two “CIP/cloud” compliance problems: 1) BCSI in the cloud and 2) BCS, EACMS and PACS in the cloud. NERC also realized that the first problem was much easier to address than the second one. Therefore, a Standards Authorization Request (SAR) was approved and a Standards Drafting Team (SDT) was constituted to focus on the BCSI question. The team got to work (during the pandemic) and solved the problem very elegantly, with a minimal set of changes made in CIP-004-7 and CIP-011-3. These changes came into effect on January 1, 2024; no further changes are needed. However, there has been only a modest amount of BCSI use in the cloud since that date, primarily due to the lack of compliance guidance. That will hopefully change soon.

2. Draft, approve and implement seven simple changes to existing CIP standards.

In my previous post, I recommended seven simple changes to existing CIP requirements or definitions. If these changes are implemented, I believe all the regulatory barriers to implementing or utilizing BCS in the cloud, EACMS in the cloud, and PACS in the cloud will be removed.

By contrast, the current SDT is focused on the goal of entirely rewriting the CIP standards to accommodate the cloud (this is Task 8 in the table). While I formerly thought that was the correct approach, I no longer think it is. I say this because (as I described above) I now realize that trying to do this will take at least 2-3 decades, which is another way of saying the SDT will never complete their work. Instead, it is much better to take a minimalist approach and only change what is required to enable full cloud use by all NERC entities. I believe that the seven simple changes I identified in the previous post are all that are required, but an SDT needs to make the final determination on that. After the changes are implemented, there will no longer be a reason for a NERC entity to believe they are prevented from implementing BES systems in the cloud.

These changes have been needed for a long time. However, as I pointed out in the previous post, there is an even more urgent need to accomplish this task as soon as possible. This is because CIP-015-1, the Internal Network Security Monitoring (INSM) standard recently approved by FERC, will become enforceable on October 1, 2028. NERC entities are already having discussions with vendors of services for INSM, most of which are based in the cloud. As they do this, they will usually run into the “EACMS problem.”

The NERC Glossary defines EACMS as “Cyber Assets that perform electronic access control or electronic access monitoring of the Electronic Security Perimeter(s) or BES Cyber Systems.” A substantial number (certainly over 50) of CIP Requirements and Requirement Parts apply to EACMS; any device or system that meets the EACMS definition must comply with every requirement or part that applies to EACMS.

Since it’s likely that any system or service that performs INSM in a high or medium impact BES environment also performs “electronic access monitoring” of an ESP or BCS in that environment, this means that cloud-based INSM services are likely to be judged by an auditor to be EACMS. This in turn means the auditor will ask the entity to provide compliance evidence for all the requirements and requirement parts that apply to EACMS – which will literally be impossible for a platform CSP, SaaS provider or MSSP. Task 2 will fix the EACMS problem, as well as the closely-related BCS and PACS problems (all three of these are discussed in the previous post).

As already discussed in this post and the previous one, it will be impossible for a cloud-based managed services provider to furnish evidence that they have complied with all the requirements and requirement parts that apply to EACMS. This means that, unless changes like the seven I’m proposing for Task 2 are implemented before 2028, many or most cloud-based INSM services may be off limits for NERC entities when CIP-015-1 becomes enforceable.

Even though I think Task 8 will require 2-3 decades to accomplish, I do not think it will be difficult for the SDT to accomplish Task 2 in three years. This is because the seven changes in Task 2 should be relatively easy to draft. They are also likely to be quickly approved by the NERC Ballot Body[iii], the NERC Board of Trustees, and FERC.

I think the current SDT should turn their attention from what they are now trying to do (which is Task 8) to addressing Task 2. However, if they don’t want to do that, another drafting team will need to be formed to address Task 2. Either way, the first step in Task 2 will be to draft a new Standards Authorization Request (SAR) that just requires the SDT to identify the “minimal set of changes required to allow BCS, EACMS and PACS to be deployed and used in the cloud.”

3. Mitigate current “cloud native” risks through voluntary guidelines.

As I discussed above, there are two types of risk that cloud users face: hybrid and cloud native risks. Hybrid risks, which apply to both on-premises and cloud-based systems, are almost all addressed by ISO 27001 and FedRAMP. Therefore, I think just reviewing the CSP’s most recent ISO 27001 audit report, and determining whether it contains any negative findings that have not been remediated, should be enough to give most cloud users confidence that the CSP has mitigated the important hybrid risks.

However, cloud native risks are quite a different matter. I do not believe there are any published standards that describe controls to mitigate these risks or even define the risks in the first place (although please email me if I’m wrong). No matter how the risks are mitigated – i.e., whether through CIP requirements or voluntary guidelines for the CSPs – the first steps will be to identify all significant cloud native risks, define them rigorously, and then describe practical mitigations for them. The only way to do this is to convene a group that includes end users (in this case, electric utilities and IPPs) and providers of cloud-based services (including platform CSPs, SaaS providers and MSSPs); the group will draft voluntary guidelines for the service providers.

There is also a much more practical reason to prefer voluntary guidelines over CIP requirements: In addition to defining cloud native risks and mitigations for those risks, a NERC standards drafting team will need to develop mandatory requirements that carry potentially huge fines for each violation. They will then need to shepherd these new requirements through the years-long NERC balloting and approval process and the FERC approval process. I described these processes in some detail in my previous post. In that post, I estimated that going through these processes for the 17 cloud native risks that have already been identified by the SDT and me (which are probably just a subset of the total cloud native risks identified today, with more to be identified every year) will take 2-3 decades.

This is why the identification of cloud native risks and their mitigations needs to be kept out of the NERC standards development process altogether. Instead, there needs to be a group outside of NERC (probably part of a nonprofit organization) that undertakes this process and develops a voluntary risk management framework that describes controls for cloud native risks to the Bulk Electric System[iv]. I will write a new post soon discussing my ideas about this framework and the body that will develop it.

4. Mitigate “hybrid” cloud risks.

To summarize what I stated earlier, I define hybrid cloud risks as risks that apply to an organization’s on-premises and cloud-based systems (one example of a hybrid risk is compromise or loss of information stored or used in the cloud). Standards like ISO 27001 and FedRAMP apply to both types of systems; moreover, both standards are very comprehensive. Therefore, I consider the fact that a cloud-based service provider is certified for ISO 27001 compliance, or that certain federal agencies are authorized by the Joint Authorization Board to utilize a cloud-based provider based on their FedRAMP compliance, to be evidence that the provider has implemented the controls listed in those standards.

However, the outside group described earlier will request the most recent audit reports from each cloud-based service provider that joins the group[v] and verify with the provider that any negative findings in the report have since been mitigated. If a negative finding has not yet been mitigated, the group will require the provider to describe their timetable for doing so.

5. Mitigate cloud native risks identified in the future.

When I pointed out above that the “Cloud” SDT will need to be in session for 2-3 decades just to develop CIP requirements that mitigate all of the cloud native risks that have been identified today, I didn’t mention that new cloud native risks are being identified all the time.

For example, in this post, I discussed a vulnerability, identified by a researcher in the past year, that could allow any tenant in Azure to access certain servers in other tenants through a normal API connection. These include servers for Google Mail (formerly Gmail), Salesforce, Azure Vault and many others. In fact, the researcher warned that the same vulnerability could be present in other cloud environments besides Azure.

Needless to say, this vulnerability alone strikes at the heart of a fundamental assumption about the cloud: that the millions of organizations that utilize a cloud infrastructure like Azure or AWS are perfectly “walled off” from each other. If that assumption is wrong and there is an easy way to exploit this vulnerability, that would have a huge impact on organizations of any size, anywhere.

This is just another reason why the NERC standards development process is unsuited to mitigating cloud native cybersecurity risks and why an outside group, including cloud-based service providers, should develop voluntary guidelines for the CSPs: The work will never end, although the team members will. Instead of spending 1½ to 2 years on every new risk that comes along, as the SDT would have to do (see my previous post for how I arrived at that time range), the outside group I’m proposing will be able to break their workload up among as many sub-groups as are needed to deal with every new risk simultaneously – perhaps within 3-4 months.

6. Revise NERC Rules of Procedure (RoP) to allow auditing of objective-based requirements.

Around half of the NERC CIP requirements are prescriptive ones, meaning they require the NERC entity to take specific actions. If the entity doesn’t take those actions within the required time, or if they don’t perform them correctly, they may be subject to hefty fines.

Examples of prescriptive requirements are CIP-007 R2 (patch management) and CIP-010 R1 (configuration management). These are often cited as the two most prescriptive CIP requirements and, not coincidentally, they are also cited as two of the most expensive NERC requirements (not just NERC CIP requirements) to comply with. This is because the only way to comply with prescriptive requirements is to produce and store copious amounts of documentation and put in place an elaborate program to ensure there will not be violations – or at least as few as possible.

Fortunately, around half of the CIP requirements (and all the new CIP requirements and standards drafted since CIP version 5 came into effect in 2016) are objective-based. For these requirements, the NERC entity is required to achieve an objective, not to take specific steps while achieving it. For example, CIP-007 R3 reads, “Deploy method(s) to deter, detect, or prevent malicious code.” In theory, a NERC entity will be compliant with this requirement if they can demonstrate that they have deployed one of these methods. This sounds simple. What could possibly go wrong?

In fact, a lot could go wrong. Suppose a NERC entity decides they can save time (and probably money) by just updating their software whenever a new update is available. Of course, that is a method to “prevent” malicious code; however, this method only prevents malicious code that was known to be “in the wild” before the last update. The software will still be vulnerable to any malware that was developed since the last update.

What if an auditor hands that entity a “potential non-compliance” (PNC) finding for CIP-007 R3 at their next audit? The entity may point out that their compliance methodology prevents malware infection, but not for the most current malware. The auditor might then point out that it’s a dangerous world out there; they should be especially concerned about the most recent malware. The entity comes back with something like, “Please show me where in ‘Deploy method(s) to deter, detect, or prevent malicious code’ it says we need to prevent any malware, no matter how recently it was released.”

At this point, the auditor might stammer and say something like, “You should at least have protection against malware that is more than one day old.” The entity will reply, “Again, please show me where in ‘Deploy method(s) to deter, detect, or prevent malicious code’ it says we need to prevent any malware that was released before yesterday. After all, you always say that we need to follow the strict language of the requirement.”

Of course, the auditor will not have a good response to that, since there is no good answer. The auditor will probably point to various forums, such as Regional Entity meetings and NERC webinars, at which examples of compliant practices were discussed, and suggest that the entity should have been attending them. However, both the auditor and the entity know that it is impossible to list all the methods to comply with an objective-based requirement. The fact that some practices were mentioned at a meeting does not mean other practices will not be compliant as well.

The problem here is that the NERC Rules of Procedure do not allow for any kind of compliance except following the strict language of the requirement. However, it is usually impossible to define exactly what it means to meet an objective, without also prescribing the means of attaining that objective. This makes it impossible to write an auditable objective-based requirement. How do we resolve this problem?

I can imagine a way to audit compliance with an objective-based requirement and still be able to make a clear determination of compliance or non-compliance: The auditing rules (in the RoP) could be changed so the auditor’s job is to make an informed judgment about whether the entity has made a good faith effort to achieve the objective. If the auditor determines that the entity has at least done that, the entity will not receive a PNC notification. However, if the auditor decides the entity has not made a good faith effort, they might issue a PNC.

I’m sure some people (including some auditors) will object that requiring an auditor to determine whether an entity has acted in good faith will introduce subjectivity (horrors!) into the auditing process. But guess what? Cybersecurity is a risk management exercise. There is no way to be certain whether a risk has been mitigated or not; in fact, it is impossible to eliminate a true risk. It is only possible to make it as small as possible.

Both the auditors and the NERC entities understand this. They have arrived at what amounts to an uneasy truce when it comes to audits of objective-based requirements. However, before all CIP requirements are made objective-based, it is essential that the Rules of Procedure provide a way for an auditor to use their judgment as to whether or not a NERC entity has achieved a particular objective, without feeling they’re doing something wrong.

Another change is needed as well. Today, auditors – and almost anybody employed by NERC or the NERC Regions - are not allowed to provide compliance guidance to NERC entities, for fear that it will compromise “auditor independence”. That term means the auditor has no personal stake in the outcome of the audit. Auditor independence is presumed to be compromised if the auditor, their Region, or NERC itself has provided guidance to the entity (and perhaps other entities as well) on the preferred method(s) of complying with a requirement.

Another way of saying this is that, if an auditor tells you how to comply with a requirement and then audits your compliance, all they are doing is auditing themselves. This is why to this day (e.g., in the recent BCSI webinar), NERC and Regional staff members never provide compliance guidance in a public forum, unless what they say will only be heard by the people in the room.

While this might be a good rule for audits of the NERC Operations and Planning standards, which are ultimately based on the laws of physics, it’s a terrible rule for the CIP standards. This is because cybersecurity is inherently a risk management exercise with a (usually) vaguely defined objective and multiple ways to achieve the objective.

Moreover, cybersecurity terms seldom have a completely agreed-upon meaning. For example, consider the word “programmable” in the definition of the term “Cyber Asset”, and the word “routable” in the term “External Routable Connectivity”. Both of these terms came into use with CIP version 5 in 2016. Despite huge arguments and multiple NERC attempts, always unsuccessful, to “define” them without going through the multiyear process required to draft a new definition and get it approved, these terms remain as ambiguous today as they ever were. However, since NERC entities need some sort of guidance on these and many other issues, the NERC Regions have done their best to fill the gaps with verbal guidance.

The fact that auditors cannot provide compliance guidance to NERC entities with CIP questions would not be terrible if the only adverse consequence were that a NERC entity might be cited for violating a requirement they didn’t understand. However, consider what NERC constantly says about the CIP standards: They are needed to protect the US and Canada from potentially damaging cyberattacks, and the fact that there has never been a power outage caused by a cyberattack in North America is evidence that they are doing so.

However, it is precisely because CIP compliance is so important for grid security that making sure NERC entities are always compliant is essential. What is the better way for NERC to ensure that entities are always compliant? By refusing to provide them with compliance guidance? Or by going out of their way to make sure each entity understands the CIP requirements and definitions and has implemented their understanding correctly? It seems to me that Door Number Two is the correct answer.

Thus, if the NERC CIP requirements are to be made completely objective-based (which is a new objective – pun intended – of the “cloud” SDT), at least the two new rules discussed above need to be added to the Rules of Procedure. However, the two rules should only apply to the CIP standards, not the Operations and Planning standards.

Nobody I have talked to has been able to tell me how the Rules of Procedure can be changed, except that NERC’s legal department will probably need to lead the effort. I am sure the changes will need to be drafted by some team and submitted for approval by the NERC balloting process, by the Board of Trustees, and by FERC. Not knowing anything about how the process will work, I’m estimating it will take 2-3 years by itself. It would be best if the RoP changes were made before work on Task 7 begins. However, if that is not possible, the two could probably be carried on at the same time, as long as there is coordination between the two drafting teams.

7. Revise current CIP requirements to make them objective-based.

Since at least 2018, I have been advocating that the CIP standards should all be made objective-based, although I used the almost equivalent term “risk based”. I did this because I could see by then that the CIP version 5 standards, which came into effect on July 1, 2016, had not achieved their objective of being clearly understood, as well as transparently auditable. (In fact, I wrote nine chapters of a book that made that point. I put the book aside when I got too busy in the fall of 2018, and I haven’t returned to it since, although I would love to do that if I could.)

Because I was sure that almost nobody in the NERC community would be up to conducting another fundamental rewrite of CIP - when the first rewrite (CIP v5) had taken six years from the beginning of drafting in 2010[vi] to the 2016 compliance date – I didn’t mention this very much. Instead, I started hearing from more people about the need to make it possible for NERC entities to make full use of the cloud for their OT systems. I began wondering how the CIP standards might be revised to allow full cloud use, for NERC entities that want to do that.

I was not alone. By 2023, Lew Folkerth of the RF Regional Entity was becoming increasingly alarmed that more software products and security services were moving exclusively to the cloud all the time. Early in that year, he formed the informal Cloud Technical Advisory Group (CTAG) to start discussing this issue (the group continues to meet biweekly). Others started discussing it as well. The current Risk Management for Third-Party Cloud Services Standards Drafting Team started meeting in the summer of 2024 to draft new and/or revised CIP standards to accomplish that purpose.

The SDT spent its first 5-6 months discussing and drafting a revised Standards Authorization Request (SAR); the SAR serves as the “roadmap” for an SDT. That SAR was approved in December 2024. The primary element of the project scope is “Create a new CIP standard(s) or revise the existing CIP standards, as appropriate, to allow for adoption of cloud services for CIP-regulated systems while maintaining appropriate levels of reliability, resiliency and security.” The SAR also includes a list of eleven “risks related to cloud services for CIP applicable systems” (which I now call cloud native risks), that need to be addressed in new CIP requirements.

I interpret this language in the SAR to mean that:

A. The primary goal of the SDT is to draft new or revised CIP standards that allow BES Cyber Systems, PACS and EACMS to be implemented in the cloud, and to allow NERC entities to utilize existing SaaS implementations that meet the BCS, EACMS or PACS definition. None of these actions are currently “permitted” for medium or high impact BES environments. This is not because any CIP requirement prohibits cloud use, since no requirement even mentions the cloud. Rather, it is because use of the cloud was not seriously contemplated when most of the current CIP requirements were drafted. As a result, many requirements are written in a way that precludes a cloud service provider (CSP) from providing the compliance evidence that a NERC entity would need to present during a CIP audit.

B. Another goal is to maintain “reliability, resiliency and security” of CIP-regulated systems deployed in the cloud. The best way to achieve this goal is to mitigate the cloud native risks.

C. A third goal is not to require NERC entities that do not wish to utilize the cloud for CIP-regulated systems to make any changes to their existing CIP compliance programs. While this is not mentioned in the revised SAR, it was mentioned in the original SAR approved by the Standards Committee in December 2023.

The SDT started meeting again in January. They have continued to meet throughout the year, although on a more reduced schedule than originally planned. As often happens, their goals have changed during the year. They are now working on a white paper that they hope to have available for comment by the end of the year, in which they will describe what they are trying to achieve.

However, having attended a healthy number of their meetings this year, I believe the SDT’s current goals are to:

I. Let the existing CIP standards remain in effect (probably with a few necessary minor changes that will not affect what the Responsible Entity is required to do), for NERC entities that do not want to change anything they are doing now.

II. Develop a second set of CIP standards (called the “100 series”: CIP-102, CIP-103, etc.), whose requirements will be based on the existing CIP requirements. However, those requirements will be adapted to address cloud-based assets.

III. Develop a new standard, CIP-016, that includes requirements that address at least the eleven cloud native risks included in the revised SAR approved in December 2024. Since these risks do not apply to on-premises systems, the requirements in CIP-016 will only apply to systems deployed in the cloud (Note: While the SDT started to discuss CIP-016 early this year, it had a different purpose. I believe it would have encompassed all the requirements in I and II, since the SDT had not yet thought of creating the 100-series standards).

IV. Rewrite the current CIP requirements so they address on-premises systems (BCS, EACMS and PACS) as well as cloud-based systems. The SDT added this objective, which significantly expanded their scope, in September (during the SDT’s only onsite meeting of the year) or even earlier. To make all this work, the existing CIP requirements also need to be made objective based, if they are not that already. The SDT intends this to be the next version of the CIP standards; however, NERC entities that do not need to use the cloud will not need to move to it for at least a few years after the new standards take effect.

I think that, if the SDT tries to move forward with the above plan (although they will not move anywhere until they have been able to get comments from the NERC community on their white paper), they will be making a huge mistake. Here is what I think about each of the above four items:

i. I have no objection to leaving the existing CIP standards in place, for entities that do not want to change what they are doing now. In fact, the SDT cannot be successful unless they do this.

ii. I see two major problems with the idea of adapting existing CIP requirements to cloud based systems:

a. Even though the existing requirements all nominally apply to BES Cyber Systems, which in theory could be deployed in the cloud or on-premises, in fact they really apply to devices. This is because a BCS is defined simply as a grouping of one or more BES Cyber Assets – and a BCA is a physical device. The BCAs that comprise a BCS, whether physical or virtual, cannot be tracked in the cloud, since parts of systems move from device to device and data center to data center all the time. Presumably, the CSPs could track this, but doing so would involve a huge amount of work. The NERC entity would very likely have to pay for it.

Thus, no platform CSP, SaaS provider or MSSP will ever be able to provide compliance evidence based on physical or virtual devices. This is why I do not believe there is any way to address both on-premises and cloud-based systems in one requirement (unless that requirement is broken into two parts, one for cloud-based systems and one for on-premises systems; however, these would effectively be separate requirements).

b. Since the existing CIP requirements and requirement parts all address hybrid risks, and since hybrid risks are already addressed in ISO 27001 and FedRAMP, it will be a huge waste of time for NERC entities to go through all the steps required to prove that their CSP is compliant with, for example, CIP-007 R2 (patch management) when patch management controls are already addressed in both frameworks. It is almost certain that the CSPs, SaaS providers and MSSPs will not provide any compliance evidence for hybrid risks, other than to point to recent audit reports for ISO 27001 or FedRAMP (or possibly a SOC 2 Type 2 audit report).

iii. I have just one objection to this item; it is one I arrived at only recently. First, I want to note that, even though the eleven cloud native risks were featured prominently in the SAR, I do not believe the SDT has even begun to discuss them, almost one year later. When the SDT does that, they will realize that just describing cloud native risks in a format that can serve as the basis for mandatory requirements will be very difficult, absent participation by cloud technical experts (preferably, current or former staff members of the platform CSPs, SaaS providers, or MSSPs).

Similarly, absent such help, it will be difficult for the SDT to a) draft workable requirements based on cloud native risks, b) respond to the huge number of objections that are likely to come in during the balloting period (which will run for at least one year), c) make changes to the requirements between ballots in response to those objections (every SDT does this, in order to receive more positive votes in the next ballot), and finally d) develop the filing to FERC that will explain how and why they developed each new requirement.

In my previous post, I estimated that, when you add up all the work the SDT will need to perform for each of the cloud native risks (although I called them “cloud-only” risks in that post), it will likely amount to 250 hours per risk over the SDT’s lifetime (in the previous post, I used the example of the multi-tenancy risk to illustrate how I arrived at this number). I also estimated that the SDT meets for 150 hours a year (that may be a high estimate, at least for this year). Thus, it will take the SDT 250/150 years, which equals about 1.67 years or about 20 months, to address each risk.

How many cloud native risks are there? In the previous post, I added the eleven risks identified by the SDT to five risks that I identified in previous posts, to get sixteen risks. To determine the total amount of time that it will take the SDT to address these sixteen cloud native risks, I multiplied sixteen by 20 months, which equals 320 months. This is…envelope, please…about 27 years.

Of course, this is a huge number. Am I sure about this? No, I’m not; I think it may be too low. I listed three reasons why it is almost certainly too low in the previous post. However, let’s assume that my 250-hour estimate is twice what it should be. That changes the estimate from 27 to 13 ½ years. Does that really change anything?

Even if we use 13 ½ years as our estimate of the time to address the cloud native risks (i.e., item III in the SDT’s current goals listed above), the SDT has a lot more than that on their plate. In the previous post, I estimated that addressing items I and II in the SDT’s goals will require 7 years. Thus, we can safely say that the SDT will require 20 years to complete items I to III; that does not even include item IV, which the SDT clearly wants to add to the mix.
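For readers who want to check my arithmetic or plug in their own assumptions, here is a minimal Python sketch of the back-of-envelope calculation above. The inputs are the estimates from this post; the function and variable names are just illustrative. Note that I rounded along the way in the text (27 years, then half of that is 13 ½), so the unrounded results differ slightly.

```python
# Back-of-envelope check of the estimates in this post.
# All inputs are the post's own rough estimates; names are illustrative only.

HOURS_PER_RISK = 250          # estimated SDT hours needed per cloud native risk
MEETING_HOURS_PER_YEAR = 150  # estimated hours the SDT meets per year
NUM_RISKS = 16                # eleven from the SAR plus five from earlier posts
YEARS_ITEMS_I_AND_II = 7      # estimate from the previous post


def months_per_risk(hours_per_risk, meeting_hours_per_year):
    """Calendar months the SDT needs to address a single risk."""
    return hours_per_risk / meeting_hours_per_year * 12


def years_for_all_risks(num_risks, hours_per_risk, meeting_hours_per_year):
    """Calendar years to address every risk, one after another."""
    return num_risks * months_per_risk(hours_per_risk, meeting_hours_per_year) / 12


base = years_for_all_risks(NUM_RISKS, HOURS_PER_RISK, MEETING_HOURS_PER_YEAR)
# Optimistic case: assume the 250-hour estimate is twice what it should be.
halved = years_for_all_risks(NUM_RISKS, HOURS_PER_RISK / 2, MEETING_HOURS_PER_YEAR)

print(f"Months per risk: {months_per_risk(HOURS_PER_RISK, MEETING_HOURS_PER_YEAR):.0f}")
print(f"Years for all risks: {base:.1f}")
print(f"Years if hours are halved: {halved:.1f}")
print(f"Years for items I to III (halved case): {halved + YEARS_ITEMS_I_AND_II:.1f}")
```

Changing any of the four constants at the top shows how sensitive the total is to each estimate; even fairly generous assumptions leave the total well over a decade.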

Of course, it would be ridiculous to continue these calculations. It is clear that, if the SDT is going to develop mandatory NERC CIP requirements to address cloud native risks, they need to be prepared to spend at least a couple of decades on this task. Since that is impossible, we need to move away from regulations and go with Plan B.

8. Revise current CIP standards to apply to both on-premises and cloud-based assets.

I have Task 8 in the list because I believe that is what the SDT currently wants to do, not because I’m recommending that they continue to do it; in fact, I’m recommending they immediately change course, although that’s unlikely to happen. While Task 7 consisted of items I-III above, Task 8 adds item IV. However, since I think achieving item IV is impossible, this means that Task 8 is also impossible.

So, what do I recommend?

The table below compares the eight tasks, which I have discussed in the previous eight sections of this post. Note that these are all tasks that have been proposed, although I don’t support some of them. Given that it is imperative to fix the wording problems that currently prevent NERC entities from deploying or using medium or high impact systems in the cloud, I see only one path forward:

A Standards Drafting Team performs Task 2, while an outside group performs Tasks 3-5. I believe this path will a) fix the CIP wording that prevents full cloud use by NERC entities and b) mitigate cloud native cybersecurity risks by creating a group that includes power industry representatives and platform CSPs, SaaS providers and MSSPs (managed security service providers). The outside group will draft voluntary guidelines for cloud providers that address currently identified cloud native risks; in addition, the group will meet regularly in the future to identify new cloud native risks and develop guidelines for the cloud providers to mitigate those risks.

However, the current SDT has introduced a new goal. This goal has nothing to do with the cloud, but I have believed for years that it is important to achieve. That goal is to rewrite all the current CIP requirements (for on-premises systems only) as objective based. In my opinion, pursuing Tasks 6 and 7 (but not any of the other tasks) will achieve that goal. This is a separate path that can be pursued independently of the first path:

An SDT (probably not the same one that performs Task 2) will lead Tasks 6 and 7. However, before the rewritten requirements become effective, the NERC Rules of Procedure need to be revised, as I described earlier. The NERC legal department will need to be heavily involved in this effort, although the SDT can help them understand the changes that are needed.

First, the SDT needs to spend up to a year discussing ideas for rewriting the current CIP standards with NERC entities, then drafting a new SAR. It would be a big mistake to rewrite the current CIP standards without doing this, since many NERC entities have strong ideas about changes they would like to see in the CIP standards. They will want to have their ideas heard at the beginning of the project, before drafting begins; if that doesn’t happen, they may vote down the revised standards after they’ve been developed.

If the Rules of Procedure are changed first, I estimate this entire project will take 9-10 years. If the RoP changes are made simultaneously with the SDT work, the project could take seven years. As I pointed out in the previous post, the last (and only) time the CIP standards were completely rewritten was the development of CIP version 5. Six years passed between the beginning of discussions on v5 in 2010 and the implementation of v5 (as well as CIP v6. That’s a long story) on July 1, 2016; I doubt the time required for Task 7 will be any less than that. Because I think the SDT will need to hold at least six months of discussions with NERC entities and then spend six months revising their SAR before starting the drafting process, that adds another year to my estimate.

Thus, my overall recommendation is that both of the above paths be pursued using separate drafting teams (the current SDT could choose which path they want to pursue. My guess is they will want to pursue the second path).

Substack is now my primary blog platform. If you want free access to new posts for up to 30 days, you can become a free subscriber. If you also want full access to the more than 1,200 posts I have written since 2013, please become a paid subscriber for $30 a year. Thanks!

If you would like to comment on what you have read here, I would love to hear from you. Please email me at [email protected] or comment on this blog’s Substack community chat.

[i] Even though they can be applied in non-CSP environments, both ISO 27001 and FedRAMP are used for CSP certification (ISO 27001) and authorization (FedRAMP) because they address risks to devices that are installed in CSP data centers. However, they don’t address risks that apply to the customers of cloud services, including risks like multi-tenancy and the risk of widespread outages.

[ii] FedRAMP authorization only applies to use of a CSP by certain federal agencies. However, the fact that a CSP has FedRAMP authorization at a High, Medium or Low level can be justifiably taken as an indication of the security controls they have in place.

[iii] “Ballot Body” refers to the fact that, although all NERC entities are eligible to vote on every proposed or revised standard, they are not obligated to do so, especially if the standard(s) in question do not apply to them. When an SDT has finished drafting a new or revised standard(s) and NERC is preparing to conduct a vote to approve it, they first ask each entity whether they wish to participate in the balloting. Since NERC’s balloting rules are very complicated – there are about 12 different classes of ballots, and I believe that approval requires a supermajority in favor in each class – almost all of the new or changed CIP standards have required four ballots to pass. Each ballot, and the accompanying comment period, usually requires about three months, except for very non-controversial ballots.

[iv] Since cloud native risks apply to other critical energy infrastructure industries besides electric power – for example, natural gas pipelines, natural gas distribution, petroleum refining and petroleum products distribution – I could see this group being expanded to include those industries as well.

[v] The group will have no authority to compel a service provider to let them see an audit report, but the provider will not be allowed to join the group if they refuse to do so.

[vi] In early 2010, the CSO706 SDT, after having drafted, gotten approval for and implemented CIP versions 2 and 3, introduced a draft CIP version 4. This was a complete rewrite of the eight standards in v3, numbered CIP-002 through CIP-009, into just two standards, numbered CIP-010 and CIP-011. These drafts met with overwhelming disapproval, and they were never even submitted to balloting.

At that point, some members of the NERC community became concerned that FERC – who had ordered a complete rewrite of CIP when they approved v1 in January 2008 in Order 706 – might lose patience with NERC and write a harsh standard of their own. Since one of the mandates in Order 706 was that NERC develop “bright line criteria” for identifying the assets (mainly Control Centers, generating stations and transmission substations) that were in scope for CIP compliance, the SDT decided to keep the current CIP v3 standards in place, but replace the admittedly overly lax rule for identifying Critical Assets with the bright line criteria, which they had already begun to draft. This became CIP version 4, which was approved by NERC at the end of 2010 and by FERC in April 2012 (with an April 1, 2014 enforcement date).

In early 2011, the SDT started work on CIP version 5. This version included the bright line criteria as Attachment 1 of CIP-002. However, it replaced everything else in CIP v4 with a completely new compliance structure based on the concept of BES Cyber System, which the SDT had developed in a “concept paper” in 2009. The SDT finished drafting v5 in late 2011 and submitted it to the NERC balloting process, which lasted through 2012 and ended with NERC Board approval (and submission to FERC) at the end of 2012.

I believe the fact that v5 was approved so quickly by NERC (after “only” one year of balloting!) convinced FERC that there was no reason to allow v4 to come into effect in 2014, and then v5 in 2015 or 2016. In April 2013, FERC issued a NOPR stating their intention to approve CIP v5 and “disapprove” v4. As I said earlier, v5 came into effect on July 1, 2016, while v4 never came into effect.