By now, I think we all know that the buildout of so many huge data centers is causing electricity bills to rise and power supplies to be constrained. However, there’s another risk that is in some ways a mirror image of this one. Moreover, this other risk is even more serious, because it could lead to a sudden outage of a regional power grid or even an entire Interconnect.
Note: There are four Interconnects in North America: the Eastern Interconnect, the Western Interconnect, ERCOT (which covers most of Texas) and Quebec. Within each Interconnect, AC power can flow freely, which means a disturbance in one area will usually be quickly compensated by flows from other areas of the same Interconnect. However, if the disturbance is severe enough, it won’t be compensated and can bring down the entire Interconnect. That almost happened in ERCOT in the early hours of February 15, 2021, as described in this post and this one.
Interconnects are sometimes connected to each other. When this happens, the connection is DC, not AC. If one Interconnect (or even a large part of one) experiences problems, power can be dispatched from another Interconnect over a DC line. However, unlike AC flows, which don’t have to be “dispatched” by anybody, dispatching power over a DC tie requires human intervention. This means there will be a lag of at least a few seconds before the Interconnect with the disturbance receives relief.
If the disturbance is severe enough, this delay will not prevent that Interconnect from going totally black. For about five minutes on February 15, ERCOT was in danger of exactly that, until the control center staff triggered a massive 6.5 gigawatt load shed. That intentionally blacked out a large portion of the ERCOT grid, but it avoided an uncontrolled shutdown of the entire grid; full recovery from such a shutdown might have required months and billions of dollars.
Today, the Wall Street Journal carried a well-written article[i] that started this way:
“Early last year, a cluster of data centers in Virginia suddenly dropped off the power grid, threatening the stability of the already vulnerable system.
The roughly 40 data centers, which had been using enough electricity to supply more than one million homes, simultaneously switched to backup power sources in February 2025, when a high-voltage power line malfunctioned. The sudden plunge in electricity demand forced the grid operator to take quick action to avoid potentially serious damage.
The incident, details of which haven’t been reported, was the second such problem in Virginia within a span of months. In July 2024, about 70 data centers withdrew from the grid when another high-voltage line failed, requiring a similar scramble to keep power supply and demand in line.”
Of course, it stands to reason that too little electricity supply will cause problems, such as rolling outages and even large-scale blackouts. But why would too much supply (which is the same as too little demand) require quick action by the grid operator (in this case PJM, a Regional Transmission Organization or RTO)?
This is because AC (alternating current) alternates between positive and negative voltage and current many times per second. In the US and Canada, all of the Interconnects operate at 60 Hz (cycles per second); in Europe the frequency is 50 Hz. Motors and generators that are connected to the grid all need to operate at the grid frequency. This means that gas and steam turbines connected synchronously to the North American grid rotate in lockstep with the grid frequency. For a two-pole generator, each electrical cycle corresponds to one revolution, so 60 cycles per second works out to 60 × 60 = 3,600 revolutions per minute.
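To make that arithmetic concrete, here is a minimal sketch in Python. The 120 × f ÷ poles relationship is the standard formula for synchronous machine speed; the function name is just mine:

```python
def synchronous_rpm(grid_hz: float, poles: int = 2) -> float:
    """Synchronous speed of a generator locked to the grid.

    Standard formula: RPM = 120 * f / poles. A two-pole machine
    completes one electrical cycle per revolution, so at 60 Hz it
    spins at 60 cycles/sec * 60 sec/min = 3,600 RPM.
    """
    return 120.0 * grid_hz / poles

print(synchronous_rpm(60))  # 3600.0 RPM in North America
print(synchronous_rpm(50))  # 3000.0 RPM in Europe
```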
If the grid experiences deficient supply (the more common event), grid frequency decreases and turbines slow down. The article says, “Concerns about the rapid build-out of data centers often center on the risk that adding too many in a given region could strain electricity supplies, particularly on hot or cold days when demand is high. If demand threatens to exceed supply, grid operators call on power plants to ramp up production and, as a last resort, order utilities to cut power to customers to maintain balance.” As power demand (load) falls due to intentional or unintentional blackouts, frequency rises back up until balance is restored at around 60 Hz.
However, if there’s deficient demand, as described in the article, frequency increases and turbines speed up. To keep the turbines from destroying themselves, various mechanisms (like automatic generation control, or AGC) reduce generation output, slowing the turbines so that frequency decreases. Again, balance is restored around 60 Hz. Thus, neither deficient supply nor deficient demand should cause a serious problem for the grid, as long as the grid operator takes the proper steps to deal with the situation.
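To see this balancing act in both directions, here is a deliberately crude toy model in Python. The inertia, damping and correction constants are made-up illustrative numbers, not real grid parameters, and the proportional correction is only a stand-in for how governors and AGC actually behave:

```python
F_NOMINAL = 60.0  # Hz
INERTIA = 4.0     # stand-in for the grid's aggregate rotating inertia
DAMPING = 2.0     # loads naturally draw a bit more power as frequency rises
AGC_GAIN = 0.5    # how hard generation is trimmed per Hz of deviation

def restore_balance(imbalance: float, steps: int = 12, dt: float = 1.0) -> None:
    """Imbalance > 0 means excess supply (deficient demand), which
    pushes frequency above 60 Hz; the AGC-like correction then trims
    generation until both the imbalance and the frequency deviation
    decay back toward zero."""
    deviation = 0.0  # Hz away from nominal
    for _ in range(steps):
        deviation += dt * (imbalance - DAMPING * deviation) / INERTIA
        imbalance -= dt * AGC_GAIN * deviation
        print(f"f = {F_NOMINAL + deviation:6.3f} Hz   imbalance = {imbalance:+.3f}")

restore_balance(+0.5)  # excess supply: frequency rises, then settles near 60
restore_balance(-0.5)  # deficient supply: frequency sags, then recovers
```

Either way, the correction drives the system back toward 60 Hz; the point is that someone (or some automation) has to act, and has to act in the right direction.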
However, large clusters of data centers, such as the one in Virginia, introduce a new source of risk. The article continues, “Now, the opposite risk is emerging. Data centers are equipped with technologies that monitor for disturbances on the grid that could cause a power outage and affect operations. When disturbances occur, many data centers automatically shift to backup supplies, severing their grid connections until power quality stabilizes.”
Note that this effect can occur whenever there is a disturbance on the grid, whether frequency is too high or too low. In other words, a data center’s monitoring system doesn’t care what caused the disturbance; it switches the data center to backup power in either case. However, the effect of this action, when many data centers take it at the same time, isn’t symmetrical.
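Before getting to that asymmetry, here is a minimal sketch of the direction-agnostic trip logic itself. The 0.5 Hz trip band is a hypothetical number of my own, and real facilities watch voltage disturbances as well as frequency:

```python
NOMINAL_HZ = 60.0
TRIP_BAND_HZ = 0.5  # hypothetical threshold; actual settings vary by facility

def switches_to_backup(measured_hz: float) -> bool:
    """True when the monitoring system sees a disturbance large enough
    to cut over to backup power. The absolute value is the point: the
    trip fires on a deviation in either direction."""
    return abs(measured_hz - NOMINAL_HZ) > TRIP_BAND_HZ

print(switches_to_backup(59.4))  # True: low frequency (deficient supply)
print(switches_to_backup(60.6))  # True: high frequency (excess supply)
print(switches_to_backup(60.1))  # False: within the ride-through band
```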
If the disturbance is due to deficient supply, this means that demand is too high and frequency is too low. Thus, if a lot of data centers suddenly drop off the grid, that action will push the grid in the direction it needs to go anyway: it will decrease demand and increase frequency.
However, if the disturbance is due to deficient demand/excess supply, this means demand is too low and frequency is too high. Instead of correcting the grid imbalance, disconnecting a lot of data centers from the grid at the same time will reinforce both problems: deficient demand/excess supply and too high frequency. This won’t necessarily lead to a catastrophe, since the grid operator will implement the mechanisms described earlier, which decrease supply and thus lower frequency.
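A couple of lines of arithmetic show why the same trip helps in one case and hurts in the other (the sign convention and the numbers here are mine, purely for illustration):

```python
def imbalance_after_trip(imbalance_gw: float, dc_load_gw: float) -> float:
    """Supply/demand imbalance (positive = excess supply) after data
    centers totaling dc_load_gw switch to backup power. Dropping that
    much load adds to excess supply, so the trip only helps when the
    grid was short of supply (negative imbalance) to begin with."""
    return imbalance_gw + dc_load_gw

print(imbalance_after_trip(-2.0, 1.5))  # -0.5: deficient supply; trip helps
print(imbalance_after_trip(+2.0, 1.5))  # +3.5: excess supply; trip makes it worse
```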
However, the worry is that ever more, and ever larger, data centers are appearing all the time. I was told more than two years ago that Microsoft was opening a new data center somewhere in the world every three days. Since data center construction has only accelerated since then, it’s almost certain this pace has increased, not only for Microsoft but also for Meta, Google and Amazon.
PJM was able to bring frequency back to normal when 40 data centers simultaneously dropped off its grid (and in an earlier incident when 70 data centers dropped off). But will they be able to do that when both the number and size of data centers have doubled or tripled? The article continues:
“In both instances, the loss in data-center demand totaled less than 2,000 megawatts—a substantial amount of power, but not enough to create a crisis for the grid operator, known as PJM Interconnection. PJM had operations in place to quickly reduce the amount of supply on the grid in response to the demand loss.
“It didn’t cause an emergency, but I would say it caused concern,” said Mike Bryson, PJM’s senior vice president of operations. “What we’re worried about is, what if that happens for 3,000 megawatts or 5,000 megawatts?”
The good news is that the authorities, especially FERC (the Federal Energy Regulatory Commission) and NERC (the North American Electric Reliability Corporation), seem to be concerned about this issue and are looking into it. But what could be done about it? The only suggestion in the article comes from a statement that Dominion Energy “…has been working with tech companies to determine how such facilities could stay online during brief faults on the system instead of switching to backup.”
This sounds simple, right? All we have to do is figure out how to make sure data centers stay online during “brief faults”. However, that’s like saying, “All we have to do to fix the problem of fire alarms going off too often is to raise the temperature threshold at which the alarm goes off.” It amounts to telling a homeowner, “We’re going to make your alarm less sensitive, so it will tolerate small fires that would otherwise have set it off.”
Good luck with that.
Tom Alrich’s Blog, too, is a reader-supported publication. You can view new posts for two months after they come out by becoming a free subscriber. You can also access all of my 1,300 existing posts dating back to 2013, as well as support my work, by becoming a paid subscriber for $30 a year (and if you feel so inclined, you can donate more than that and/or become a founding subscriber for $100). Whatever you do, please subscribe.
If you would like to comment on what you have read here, I would love to hear from you. Please comment below or email me at [email protected].
[i] This article is probably behind the WSJ’s paywall. If you would like me to send you a PDF of the article, drop me an email.