It's easy to contribute articles, article proposals, commentary and analysis and be published online through Energy Central!
Sound interesting? Contact the editor for more information.
Utilities in compliance with today's guidelines for vulnerability and risk assessment and emergency response planning will have adopted viable plans to cover vital areas of technology, utility systems and business continuity. However, as utilities add new critical infrastructure and people to manage and maintain it, the viability of existing plans can come into question when utilities are exposed to damaging events. It follows that plans should exist to help ensure that any new and significant additions to infrastructure are recovered in a timely and safe manner under the NIPP, Critical Infrastructure/Key Resources (CI/KR) guidelines.1
This article will review some basic steps to help ensure that emergency response and recovery planning will be an integrated part of new project planning and help ensure that appropriate plans are installed. The risks associated with any shortfalls in planning could substantially add to recovery costs and undermine safety and security of utility personnel and the public.
Maintaining Compliance and Preparedness
First and foremost, utilities should have reviewed requirements and implemented the following, depending on the size, type and mission of the utility:
At key stages in the project, a discovery might be made that could force a utility to substitute a key component that may affect other components or their function when the entire system goes into operation. Contingency plans can help the utility deal rationally with design and implementation problems, but may not otherwise address problems caused by natural disasters after a project goes into commercial use. This leads us to the theme of this article: Ensure that critical systems supporting lifeline services are covered by emergency or disaster recovery plans.
When should disaster recovery planning begin? To be practical, it should be concurrently developed under the timeline used for new project planning rather than near the end of the project, or as an afterthought. However, if a new critical system is in operation and has not been included in emergency or disaster plans, any shortcoming should be addressed as quickly as possible in the appropriate plans.
Evaluating Systems for Planning Requirements
In the book Normal Accidents, author Charles Perrow distinguishes systems as being loosely or tightly coupled. Perrow explains: "In tightly coupled systems the buffers and redundancies and substitutions must be designed in; they must be thought of in advance. In loosely coupled systems there is a better chance that expedient, spur-of-the-moment buffers and redundancies and substitutions can be found, even though they were not planned ahead of time."2
Utilities employ systems that fall into both of the categories that Perrow mentions in his book. It is apparent that tightly coupled systems are looked at carefully during design and fabrication and are closely controlled when they go online. Power plants, for example, fall under this category. Supervisory control systems, or SCADA systems, also fall into this category. If SCADA telemetry to several substations is lost, it is critical that personnel be dispatched to monitor substation status. Natural disasters, cyber terrorism, and other events can bring about problems that affect complex systems. Emergency procedures are therefore a necessary part of the recovery of these systems.
Procedures are written and people are trained to respond to specific scenarios that affect system security and customer service. We write procedures to help ensure that people understand how these tasks must be completed (see my other articles posted on Energy Pulse). If training has been performed well, trainees will also understand why they need to perform tasks in a specific way. Regular training is currently a part of compliance guidelines for utilities. Public power utilities, for example, are aware of these guidelines in USDA Rural Utilities Service Bulletin 1730B-2, Exhibit C: Guide for Electric System Emergency Restoration Plan, which refers to the essential nature of employee training.3 NERC also addresses the requirements for CIP training in Standard CIP-004-1 -- Cyber Security -- Personnel and Training.4
Some written procedures are valid during both routine and emergency conditions, and some are written specifically for emergencies or disasters. Emergency or disaster plans may contain strategies and task descriptions as well as specific procedures for responders. It's important to determine how any new system might perform under specific types of emergencies and whether it would be wise to extend existing plans to include recovery of those systems. If the utility employs smart-grid projects that channel critical data and device status control to and from consumers, for example, it would be wise to explore scenarios where these channels could be interrupted for long periods and develop plans accordingly. Also ask: what would be the status of these systems when re-energized and re-activated? Due to the high reliability and self-healing capability of fiber-based AMI systems, most storms may not affect their operation appreciably; however, some devices in the field are still subject to physical damage under extreme storm scenarios.
It's important to think globally when the planning is underway. How will business continuity plans (continuity of business processes) and IT disaster recovery plans (recovery of systems and data) address full recovery, and who will be required to respond under each plan? Electronic mail and the communications tools used by emergency notification systems are vital, and records, along with customer data, must be recoverable. These concerns and potential gaps would almost certainly be exposed by thoroughly updating the vulnerability and risk assessment (VRA) and using the risk assessment information to help revise the utility's recovery and restoration plans. VRA data can offer the utility a sound baseline for determining priorities and how plans need to address them.
Summary of Steps to Consider
Responsible Planning. As the utility moves into more elaborate systems for monitoring and control, it follows that recovery plans should be expanded to include the recovery of new systems. Closely coupled with a new system improvement project is the responsibility to help ensure reliable and safe operation of the new system, subsystem, or component.
Project Contingency Plans. Review the risks associated with bringing a new system online. Evaluate the project risks and develop the necessary contingency plans to cover these risks before the new project goes through the approval process. As the project evolves, ensure that contingency plans are still valid and people know how to use them. Make sure that project meetings and reports include updates on possible contingencies and how the contingency plan addresses them.
Expanded Coverage in Emergency Plans. In parallel with the above contingency planning process, study how the new system could fail under emergency and disaster conditions or as the result of cyber terrorism and breaches in physical security. Study the links and dependencies that the new system will have with existing systems and procedures. How might the new system affect emergency operations and recovery? How will the new system affect customer service, internal and external communications, critical data flows, and the overall safety and security of employees and consumers?
Outline the changes and additions to existing emergency and disaster recovery plans that will be required to recover the system following damaging events. Include appropriate triggers that activate the plan when specified events occur. Make sure people are assigned to maintain the plans as systems evolve and people change jobs or leave.
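The study of links and dependencies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system names, the dependency map and the set of systems already covered by a recovery plan are all hypothetical, and a real inventory would come from the utility's own asset and plan records.

```python
from collections import deque

# Hypothetical inventory: each system maps to the systems it depends on.
DEPENDENCIES = {
    "ami_head_end": ["fiber_backhaul", "customer_database"],
    "scada_master": ["substation_telemetry"],
    "fiber_backhaul": [],
    "customer_database": [],
    "substation_telemetry": ["fiber_backhaul"],
}

# Hypothetical set of systems already named in a recovery plan.
COVERED_BY_PLAN = {"scada_master", "substation_telemetry", "customer_database"}


def uncovered_dependencies(new_system, dependencies, covered):
    """Return the new system and any direct or indirect dependencies
    that no recovery plan covers, sorted by name."""
    seen = set()
    queue = deque([new_system])
    while queue:
        system = queue.popleft()
        if system in seen:
            continue
        seen.add(system)
        # Follow links to whatever this system relies on.
        queue.extend(dependencies.get(system, []))
    return sorted(seen - covered)


gaps = uncovered_dependencies("ami_head_end", DEPENDENCIES, COVERED_BY_PLAN)
print(gaps)
```

In this made-up inventory, the new AMI head end and the fiber backhaul it relies on would both be flagged as needing plan coverage. The same check also shows that an existing, covered system can rest on an uncovered one.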
Recovery Expertise. Training personnel to operate the new system(s) during routine and emergency conditions is also essential to project success. By the time the new system comes online, people should be aware of their roles and recovery procedures to ensure that losses are not compounded during the recovery process. Set up a program that includes regular refresher training and incorporates lessons learned from past emergencies. Schedule exercises that include walk-throughs, with observers assigned to verify recovery knowledge.
Compliance and Recovery. Maintain the appropriate records and reporting schedules for all activities associated with critical infrastructure projects. Consult the record keeping requirements under each agency or organization's guidelines. Ensure that vital data is regularly backed up and stored off-site, and review required recovery metrics -- recovery time objective (RTO) and recovery point objective (RPO) -- for systems and data at least annually.
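A periodic review of RTO and RPO could be supported by a simple check such as the sketch below. It rests on assumptions that are not part of any guideline cited here: the record fields, system names, dates and objectives are hypothetical, the age of the last off-site backup stands in for potential data loss, and the recovery time is taken from the most recent exercise.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryRecord:
    system: str
    rto: timedelta                 # recovery time objective
    rpo: timedelta                 # recovery point objective
    last_backup: datetime          # most recent off-site backup
    exercised_recovery: timedelta  # recovery time measured in the last exercise


def review(records, now):
    """Return (system, issue) pairs for records that miss an objective."""
    findings = []
    for r in records:
        # Data written since the last backup is at risk of being lost.
        if now - r.last_backup > r.rpo:
            findings.append((r.system, "backup older than RPO"))
        # The last exercise shows how long recovery actually took.
        if r.exercised_recovery > r.rto:
            findings.append((r.system, "exercise exceeded RTO"))
    return findings


# Illustrative values only.
now = datetime(2024, 1, 10, 12, 0)
records = [
    RecoveryRecord("customer_database", timedelta(hours=8), timedelta(hours=24),
                   datetime(2024, 1, 8, 12, 0), timedelta(hours=6)),
    RecoveryRecord("scada_historian", timedelta(hours=4), timedelta(hours=1),
                   datetime(2024, 1, 10, 11, 30), timedelta(hours=5)),
]
for system, issue in review(records, now):
    print(f"{system}: {issue}")
```

A check like this does not replace the review itself. It only points to the systems where the stated objectives and the observed backup or exercise results have drifted apart.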
This article has briefly introduced the subject of maintaining viable contingency and emergency plans for utility systems and new projects. Revising and upgrading plans is certainly something that should be considered with each new and significant addition to utility infrastructure. Readers are urged to continue to investigate what is required to help minimize risk and maintain compliance.
References
1. Energy Critical Infrastructure and Key Resources Sector-Specific Plan as input to the National Infrastructure Protection Plan (Redacted), May 2007.
2. Charles Perrow, Normal Accidents: Living with High-Risk Technologies (New York: Basic Books, 1984), pp. 94-95.
3. United States Department of Agriculture, Rural Utilities Service, Bulletin 1730B-2, Guide for Electric System Emergency Restoration Plan, January 7, 2005, Exhibit C, p. 12, and throughout.
4. NERC Standard CIP-004-1, Cyber Security, Personnel and Training, adopted by Board of Trustees May 2, 2006, effective June 1, 2006, p. 2 of 4.