Chapter 4.8
On-Site Generation: Electrical Integration
Once the strategy decision to self-generate is made, the electrical-integration problem is no longer 'can we make megawatts' but 'can a low-inertia, behind-the-meter plant accept a phase-coherent gigawatt load that steps from idle to peak in milliseconds without tripping' — and the answer is decided by how you parallel the prime movers, where you put the storage, and whether the inverters form the grid or merely follow it.
What you'll decide here
- Whether the plant runs islanded, grid-parallel, or in a transition-capable hybrid — because that single choice sets your inertia budget, your fault-current source, your protection philosophy, and your black-start requirement.
- How you absorb the load transient: by oversizing prime movers for block/step loading, by sizing a BESS as the millisecond-to-minute bridge, or (almost always) by doing both — and where the storage sits in the conversion chain.
- Grid-forming vs grid-following inverters on the BESS, and how much synthetic-or-real inertia (grid-forming BESS, synchronous condensers, flywheels) the islanded bus needs to ride through a generator trip without cascading.
- The paralleling and bus-tie architecture — single bus, double-ended, or ring — and the synchronizing/load-sharing controls that let dozens-to-hundreds of engines or turbines share a phase-coherent AI load.
- The islanding and grid-interconnect transition scheme: open-transition, closed-transition, or seamless grid-forming, and the relaying/commissioning that proves it before a live cluster depends on it (see Chapter 13.4).
Chapter 3.5 made the strategy decision — that for a gigawatt-class campus in 2026, behind-the-meter generation is no longer the diesel-in-the-yard afterthought but the primary power strategy, because the interconnection queue cannot deliver firm megawatts on an AI revenue clock. Chapter 4.9 handles the molecules: getting fuel to the prime movers. This chapter is the electrons in between — the engineering of wiring a privately-owned power plant to a privately-owned supercomputer, where there is no utility on the far side of the bus to hide your mistakes. The grid is an infinite stiff source with seconds of spinning inertia and a fault-current capability measured in tens of kiloamps. An islanded gas plant is none of those things, and the load it must serve is the most electrically hostile in the industrial world.
Three forks dominate everything downstream: island vs grid-parallel, where the transient absorber lives, and grid-forming vs grid-following. Each traces into its own protection, stability, and capital consequences. The recurring theme: an AI cluster is a 100% non-linear, phase-coherent, millisecond-stepping load, and a self-built plant has to be engineered around that fact from the synchronizing relay outward, not retrofitted to it after the first synchronized GPU pulse trips a generator.
Why the AI load breaks the textbook
Classical generator-and-bus design assumes a load whose largest step is small relative to any single source, that is statistically smooth, and that tolerates a few hundred milliseconds of frequency excursion. A training cluster violates all three. Tens of thousands of GPUs execute the same synchronous collective on the same clock, so they ramp from a near-idle floor to well over 150 kW/rack and back in milliseconds, in unison, across the entire hall. To the plant this looks like a load that can swing hundreds of megawatts in a single step, repeatedly, with a sharp di/dt and a power-factor and harmonic signature set by thousands of switch-mode front ends. SemiAnalysis modeled exactly this, gigawatt-scale training load fluctuation as a genuine grid-blackout risk, and it is the reason xAI's Colossus first burned tens of millions of dollars a year running dummy workloads to flatten the profile before deploying over $375M of Tesla Megapacks to do the job properly (DCD / SemiAnalysis, 2025–26).
The consequence for the plant designer is that the prime movers cannot see this load raw. A reciprocating engine can ramp over 100%/minute and reach full load in roughly two minutes; an aeroderivative turbine ramps near 50%/minute with a five-to-ten-minute start (Grid Capacity Intelligence; Data Center Frontier, 2025). Both are geologically slow next to a millisecond GPU step. Something faster than any rotating machine has to stand between the silicon and the engines — which is the whole reason BESS migrated, as Chapter 4.5 details, from 'ride-through for an outage' to 'transient absorber for normal operation.' The plant's job is to present the engines a load they can actually follow, and to present the GPUs a bus stiff enough that they never know how soft the source behind it really is.
Generator sizing, block loading, and the BESS bridge
Sizing an islanded plant is not 'IT load plus a margin.' It is a stack of multipliers that compound: roughly 1.4–1.5x for PUE (the non-IT mechanical and electrical draw), then a redundancy overbuild (N+1 or N+2 at the generation tier), then a hot-climate derate on any turbine, then headroom for the block- and step-loading transient the machines can actually accept. Stack them and the real-world number is steep: Vantage's Shackelford campus provisions ~2.3 GW of generation for a ~1.4 GW IT load — about 64% overbuild — and a hot-climate N+1+1 aeroderivative plant needs on the order of 10–11 units to firm 200 MW (domain research; Power Engineering, 2025). Every multiplier is a fuel-supply, emissions-permit, and capital consequence, which is why the engine-vs-turbine choice in Chapter 4.9 cannot be made independently of this electrical sizing.
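The compounding arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not a sizing tool: the function names are hypothetical, and the 30 MW unit nameplate and 0.80 hot-day derate are illustrative assumptions that do not come from the text. Only the 2.3 GW / 1.4 GW and 200 MW figures are quoted from the paragraph.

```python
import math


def overbuild_fraction(generation_gw: float, it_load_gw: float) -> float:
    """Installed generation above IT load, as a fraction of IT load."""
    return generation_gw / it_load_gw - 1.0


def units_required(firm_mw: float, unit_nameplate_mw: float,
                   site_derate: float, spare_units: int) -> int:
    """Units needed to firm `firm_mw` after derating, plus redundancy spares."""
    derated_mw = unit_nameplate_mw * site_derate
    return math.ceil(firm_mw / derated_mw) + spare_units


# Figures quoted in the text: ~2.3 GW of generation for ~1.4 GW of IT load.
print(f"overbuild: {overbuild_fraction(2.3, 1.4):.0%}")

# ASSUMPTION: 30 MW nameplate and a 0.80 hot-day derate are illustrative only;
# spare_units=2 models the N+1+1 redundancy mentioned in the text.
print("units to firm 200 MW:", units_required(200.0, 30.0, 0.80, 2))
```

Under those assumed unit values the count comes out at 11, inside the 10–11 range the text cites. A different nameplate or derate moves it, which is the point: each multiplier changes the unit count, not just the megawatt total.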
Block loading — the largest single load step a generator can accept without unacceptable frequency dip — is the governing transient spec, and it is where the prime-mover technology choice bites. A reciprocating engine accepts a far larger instantaneous block load (as a fraction of rating) than a turbine, which is one reason the RICE fleet dominates the speed-to-power buildout despite needing 200+ units per 500 MW. But no combustion machine accepts a millisecond GPU step. So the canonical 2026 architecture is engines for the bulk firm power, BESS for the transient: the battery sources or sinks the fast di/dt in the first milliseconds-to-seconds, holding bus frequency while the engine governors catch up over the following seconds-to-minutes. NVIDIA's own production-BESS guidance frames the facility battery exactly this way — transient absorption, ride-through, and demand response as three distinct roles on one asset, with closed-loop state-of-charge control so the bridge is always charged enough to take the next pulse (NVIDIA, 2025).
| Placement | Timescale it covers | What it buys | What it costs / fails to do |
|---|---|---|---|
| GPU/rack capacitance (on-board, e.g. ~65 J/GPU GB300, ~400 J/GPU Vera Rubin) | Sub-millisecond to ms | Smooths the sharpest di/dt at the source; cuts the pulse the rest of the chain ever sees | Tiny energy; cannot ride an outage or hold frequency — only shapes the edge |
| Rack BBU / in-rack energy | Milliseconds to seconds | Local ride-through; isolates the rack from short bus sags | Distributed cost and management; not a plant-level frequency tool |
| Facility BESS (grid-forming) | Milliseconds to minutes | The real transient absorber: holds frequency, bridges to genset ramp, black-start, DR | Capex + footprint; SoC must be actively managed or the bridge is empty when the pulse hits |
| Oversized prime movers (block-load headroom) | Seconds to minutes | Governor response and spinning reserve; the only true energy source long-term | Cannot follow a ms step at all; oversizing for transient burns capex and runs engines inefficiently |
The four placements are a layered defense, not alternatives: a serious gigawatt design uses all four, each covering the timescale the layer above cannot. The expensive mistake is assuming one layer covers another's job: sizing the BESS for the genset's seconds-to-minutes energy while forgetting it must also hold frequency in the first cycle; or oversizing engines to chase a transient they physically cannot follow, stranding capex in machines that then idle inefficiently below their minimum-load sweet spot. The correct decomposition is by time constant: on-board capacitance shapes the sub-millisecond edge, the BESS owns milliseconds-to-minutes and the frequency, and the prime movers own the steady-state energy and the spinning reserve.
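To make the time-constant decomposition concrete, the sketch below estimates what the BESS must deliver while the engines catch up to a load step. It is a rough illustration under stated assumptions: an instantaneous step, a constant fleet ramp rate, engines with enough headroom to pick up the whole step, and no losses. The function name and the 100 MW / 500 MW inputs are hypothetical; the 100%/minute and 50%/minute ramp rates are the ones quoted earlier in the chapter.

```python
def bridge_requirement(step_mw: float, fleet_rated_mw: float,
                       ramp_pct_per_min: float) -> dict:
    """BESS power and energy to cover a load step while engines ramp linearly.

    Assumes an instantaneous step, a constant fleet ramp rate, and no losses.
    """
    ramp_mw_per_s = fleet_rated_mw * ramp_pct_per_min / 100.0 / 60.0
    catch_up_s = step_mw / ramp_mw_per_s
    # BESS power falls linearly from step_mw to zero, so the energy is a triangle.
    energy_mwh = 0.5 * step_mw * catch_up_s / 3600.0
    return {"peak_mw": step_mw, "catch_up_s": catch_up_s, "energy_mwh": energy_mwh}


# ASSUMPTION: a 100 MW step on 500 MW of online capacity is illustrative only.
for label, ramp in (("reciprocating, 100%/min", 100.0), ("aeroderivative, 50%/min", 50.0)):
    r = bridge_requirement(step_mw=100.0, fleet_rated_mw=500.0, ramp_pct_per_min=ramp)
    print(f"{label}: {r['peak_mw']:.0f} MW peak, "
          f"{r['catch_up_s']:.0f} s, {r['energy_mwh']:.2f} MWh")
```

In this toy case the bridge needs the full step in megawatts but only a fraction of a megawatt-hour. That is why the BESS is sized by power and time constant first, with energy as a secondary check.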
Paralleling, synchronization, and bus-tie schemes
Firming 500 MW–2 GW from reciprocating engines means paralleling dozens to hundreds of 2–25 MW units onto a common bus — INNIO's 2.3 GW VoltaGrid order is 92 power packs at 25 MW each (INNIO, 2026). Each machine must be synchronized to the live bus before its breaker closes (matching voltage, frequency, phase angle, and phase sequence within the sync-check relay's window) and must then share the load proportionally through droop or isochronous load-sharing controls, so no single engine hunts or reverse-powers. At this unit count the synchronizing and load-sharing controller is the plant's nervous system, not a commodity panel. Its failure modes are catastrophic in a way they never are behind a stiff utility: a mis-synchronized close-in that slams a machine onto the bus out of phase, or a load-sharing loop that oscillates under the GPU pulse.
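The two control jobs described above, the pre-close synchronism check and proportional load sharing, can be sketched as follows. This is an illustration only: the `Phasor` type, the function names, and the window values (5% voltage, 0.1 Hz, 10 degrees) are assumptions chosen for the example, not relay settings from the text or from any standard. The droop function models steady-state sharing with equal per-unit droop on every machine and omits the secondary or isochronous loop that a real plant would use to restore frequency.

```python
from dataclasses import dataclass


@dataclass
class Phasor:
    volts_pu: float    # voltage magnitude, per unit
    freq_hz: float
    angle_deg: float
    sequence: str      # phase rotation, "ABC" or "ACB"


def sync_check_ok(gen: Phasor, bus: Phasor,
                  dv_max: float = 0.05, df_max: float = 0.1,
                  dang_max: float = 10.0) -> bool:
    """True only if all four quantities fall inside the (assumed) window."""
    # Wrap the angle difference into [-180, 180) before comparing.
    dang = (gen.angle_deg - bus.angle_deg + 180.0) % 360.0 - 180.0
    return (gen.sequence == bus.sequence
            and abs(gen.volts_pu - bus.volts_pu) <= dv_max
            and abs(gen.freq_hz - bus.freq_hz) <= df_max
            and abs(dang) <= dang_max)


def droop_share(load_mw: float, ratings_mw: list,
                droop: float = 0.04, f_noload_hz: float = 60.0,
                f_nom_hz: float = 60.0):
    """Steady-state bus frequency and per-unit MW under equal per-unit droop."""
    loading_pu = load_mw / sum(ratings_mw)   # same fraction of rating on every unit
    freq_hz = f_noload_hz - droop * f_nom_hz * loading_pu
    return freq_hz, [loading_pu * r for r in ratings_mw]


bus = Phasor(1.00, 60.00, 2.0, "ABC")
print(sync_check_ok(Phasor(1.01, 60.05, 355.0, "ABC"), bus))   # 7 degrees apart across the wrap
print(sync_check_ok(Phasor(1.01, 60.05, 182.0, "ABC"), bus))   # out of phase
print(droop_share(15.0, [2.5, 25.0, 2.5]))
```

The angle wrap matters: 355 degrees and 2 degrees are 7 degrees apart, not 353. With equal droop, the 25 MW machine carries ten times the load of each 2.5 MW machine, which is the proportional sharing the text describes.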
The bus-tie architecture is the reliability and maintainability fork. A single bus is cheapest and simplest but is a single point of failure and forces a full shutdown to service the switchgear. A double-ended (main-tie-main) bus splits generation across two sections with a normally-open or normally-closed tie, so a fault or maintenance event takes out half, not all — the workhorse for block-redundant AI campuses, and the natural electrical companion to the distributed/block-redundant UPS topology Chapter 4.5 favors. A ring bus gives every source two paths to the load and survives any single bus-section fault without dropping generation, at the price of more breakers and more complex protection coordination. The choice cascades directly into the relaying scheme: more bus sections and ties means more zones, more directional and differential relays, and a harder selectivity study — which is why the protection design (Chapter 4.2) cannot be finalized until the bus-tie topology is frozen.
Microgrid architecture and protection: the low-inertia, low-fault-current problem
An islanded inverter-and-engine microgrid breaks two assumptions that decades of protection practice rest on: abundant inertia and abundant fault current. A utility-fed bus has both — seconds of synchronous spinning mass damping frequency, and tens of kiloamps of fault current that make overcurrent relays trip crisply and selectively. Strip the utility away and replace half the source with inverters, and both collapse. Inverter sources are nearly inertia-less, so frequency can move fast under a step; and they are current-limited, typically to ~1.1–1.5x rated, so a fault on an inverter-dominated bus may not produce enough current to operate a conventional overcurrent relay at all. The protection philosophy must change: away from pure overcurrent grading and toward differential, directional, and fast communication-assisted schemes (e.g. fast bus-differential, line-current-differential over fiber, and source-transfer logic), because you can no longer assume a fault announces itself with a 20x current surge.
This is the deepest electrical reason islanded AI plants now ship with synchronous condensers and flywheels as standard kit rather than optional extras. They restore the two missing ingredients: a spinning synchronous machine contributes real rotational inertia (slowing df/dt so the BESS controls have time to act) and a stiff fault-current source (tens of kiloamps for the cycles needed to make legacy protection operate selectively). ABB's orders for VoltaGrid total 27 synchronous condensers with flywheel booked through 2025 plus 35 more in a March 2026 extension — 62 machines whose entire purpose is to make a low-inertia, inverter-heavy island behave, electrically, more like the grid it replaced (ABB / DCD, 2025–26). The decision fork is real-vs-synthetic inertia: a grid-forming BESS can emulate inertia in firmware and is faster, but a synchronous condenser provides physical inertia and fault current that no inverter current limit can fake — and most serious islanded designs buy both.
| Provider | Inertia | Fault current | Response speed | Trade-off |
|---|---|---|---|---|
| Synchronous condenser (+ flywheel) | Real, physical rotational | High (tens of kA, cycles) | Inherent (no controls) | Rotating asset to maintain; spinning losses; capital + footprint |
| Grid-forming BESS inverter | Synthetic (firmware-emulated) | Limited (~1.1–1.5x rated) | Fastest (sub-cycle) | Inertia is a model, not mass; current-limited under fault; SoC-dependent |
| Flywheel (standalone) | Real, short-duration | Moderate, very brief | Fast, but delivers real power for only ~5–30 s | Energy is seconds-only; bridges, does not sustain |
| Spinning prime movers (engines/turbines online) | Real, but machine-dependent | Generator-class | Governor-speed (seconds) | Must run (and burn fuel) to provide it; can't follow ms steps |
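The fault-current column in the comparison above can be turned into a quick screening check: does the bus produce enough current for a conventional overcurrent element to pick up? The sketch below is deliberately crude and rests on labeled assumptions. It adds current magnitudes arithmetically (ignoring phase angles and network impedance), uses a hypothetical pickup of 2x total rated current, and assumes a 6x fault multiple for a synchronous machine. Only the ~1.1–1.5x inverter limit comes from the text.

```python
def fault_current_multiple(sources) -> float:
    """Available fault current as a multiple of total rated current.

    `sources` is a list of (rated_current_a, fault_multiple) pairs.
    Magnitudes are summed arithmetically, which is a simplification.
    """
    rated = sum(i for i, _ in sources)
    fault = sum(i * k for i, k in sources)
    return fault / rated


def overcurrent_sees_fault(sources, pickup_multiple: float) -> bool:
    """True if the summed fault current exceeds the relay pickup level."""
    return fault_current_multiple(sources) > pickup_multiple


# ASSUMPTION: four 1000 A inverter blocks limited to 1.2x rated (within the
# ~1.1-1.5x range quoted in the text); a pickup of 2.0x rated is hypothetical.
inverters = [(1000.0, 1.2)] * 4
print(overcurrent_sees_fault(inverters, pickup_multiple=2.0))

# ASSUMPTION: one 1000 A synchronous condenser at 6x rated, illustrative only.
with_condenser = inverters + [(1000.0, 6.0)]
print(overcurrent_sees_fault(with_condenser, pickup_multiple=2.0))
```

In this example the inverter-only bus stays below the pickup and the fault would go unseen by overcurrent grading, while a single synchronous machine lifts it above. A real study would use sequence networks and machine reactances; this only shows the direction of the effect.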
Grid-forming vs grid-following: the inverter that decides whether the island stands up
This is the fork that quietly determines whether a behind-the-meter plant can island at all. A grid-following (grid-feeding) inverter is a current source that needs an existing voltage and frequency reference to synchronize to — it follows the grid. Strip the grid away and a fleet of grid-following inverters has nothing to lock onto; they cannot establish a bus by themselves and cannot black-start. A grid-forming inverter is a voltage source: it imposes voltage and frequency on the bus, acts as the reference the other sources follow, can black-start from dead, and can emulate inertia and damping in firmware. The consequence is binary — an islanded or transition-capable plant must have grid-forming capability somewhere (typically on the BESS), or it physically cannot form the island. A grid-parallel-only plant can get away with grid-following because the utility is the reference, which is exactly why the cheaper grid-following hardware proliferated first and why the migration to grid-forming is the defining 2026 inverter-control story.
The capability is now demonstrated at the scale that matters: grid-forming BESS has black-started 200 MW at 275 kV (domain research). But grid-forming at gigawatt scale across multiple inverters and multiple sites introduces a failure mode the grid-following era never had — wide-area control interactions and oscillations between many fast voltage-source controllers, flagged in peer-reviewed 2025 work (arXiv 2508.14318/2508.16457) as a genuine multi-site stability risk. Grid-forming control is a tuned, coordinated control problem where the inverters, the synchronous condensers, and the engine governors must share the frequency-and-voltage job without fighting each other. Standards are converging on this — IEEE 2800 for interconnection of inverter-based resources and UL 1741/IEEE 1547 for the inverter functions — but the islanded-microgrid grid-forming case is still where vendor control firmware, not a published standard, carries the load.
Islanding and grid-interconnect electrical transitions
Whether or not a campus runs permanently islanded, almost all of them want the option to interconnect — to import during planned maintenance, to export or provide grid services for revenue (Chapter 15.8), or simply to keep the utility as a backstop. That makes the point of interconnection (POI) a two-way electrical and regulatory boundary, and the transition across it the riskiest routine event in the plant's life. The electrical machinery is the synchronizing breaker and its check-sync relay at the POI, the anti-islanding / intertie protection the utility mandates (so the plant never energizes a de-energized utility line and endangers line workers), and the transfer scheme that decides open- vs closed- vs seamless transition. Each is a coordination problem with the utility's own relays and its dynamic model of the load — the same model submittals Chapter 4.3 governs — because to the utility, a self-generating gigawatt campus is simultaneously a large load, a potential generator, and a reliability variable it must study.
The regulatory ground under this moved decisively in late 2025. FERC's December 18, 2025 order directs PJM to create three co-location transmission services — Firm, Non-Firm, and Interim Non-Firm Contract Demand — with a transition running to December 2028, the first national template for how a behind-the-meter plant legally relates to the grid it sits beside (domain research). The electrical-integration consequence is that the transition scheme is no longer purely an engineering choice; the tariff class you take (firm vs non-firm grid backup) prescribes how seamlessly and how often you may lean on the utility, which feeds straight back into how much islanded inertia, storage, and black-start you must own outright. A plant that contracts only non-firm grid backup is, electrically, an island most of the time and must be engineered as one.
Deep dive: the synchronized GPU pulse and why dummy workloads were a real (bad) answer
The cleanest illustration of how alien the AI load is comes from xAI's Colossus. A synchronous training run makes every GPU in the cluster execute the same collective at the same moment, so power draw is not a smooth aggregate of many independent loads averaging out — it is one coherent waveform stepping the entire facility's draw up and down together. To the upstream plant this is indistinguishable from someone switching hundreds of megawatts on and off in unison, repeatedly, with a sharp edge. The first mitigation engineers reached for was to run dummy GPU workloads in the troughs to keep total draw flat — and at gigawatt scale the wasted energy summed to tens of millions of dollars a year (SemiAnalysis, 2025). It worked, and it was a terrible answer: you are burning real fuel and real money to manufacture a fake-smooth load profile.
The correct answer is to make the storage do the smoothing, which is why Colossus moved to over $375M of Megapacks: the BESS discharges into the pulse peaks and recharges in the troughs, presenting the plant a flat profile while presenting the GPUs a stiff bus, and burning nothing to do it. Two further levers compound: on-board capacitance (the ~65–400 J/GPU that shapes the sub-millisecond edge before it propagates) and software power-capping / workload-aware scheduling that deliberately desynchronizes or rate-limits the collective's power edge. The design principle that falls out of this (size the absorber by time constant, not by energy alone, and prefer storage over fuel-burning load-flattening) is the single most important electrical-integration lesson the 2024–26 gigawatt buildout produced. The transient physics behind it is engineered in Chapter 4.5; this chapter's job is to place that absorber correctly in an islanded source chain.
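A toy model shows what storage-based smoothing asks of the battery. The sketch below splits a pulsing load trace into a flat plant output (the trace mean) and a BESS trace that carries the difference, then reports the energy window the BESS must swing through. The function name, the square-wave trace, and the one-second sample interval are all hypothetical, and losses and state-of-charge limits are ignored.

```python
def smooth_with_bess(load_mw, dt_s: float):
    """Split a load trace into a flat plant output and a BESS power trace.

    Plant output is the trace mean; BESS power is positive when discharging.
    Returns (plant_mw, bess_mw, energy_swing_mwh).
    """
    plant_mw = sum(load_mw) / len(load_mw)
    bess_mw = [p - plant_mw for p in load_mw]
    # Track cumulative discharged energy to find the window the BESS needs.
    cum = lo = hi = 0.0
    for p in bess_mw:
        cum += p * dt_s / 3600.0
        lo, hi = min(lo, cum), max(hi, cum)
    return plant_mw, bess_mw, hi - lo


# ASSUMPTION: a 1000 MW / 400 MW square wave sampled once per second is
# illustrative only, not a measured training profile.
load = [1000.0] * 5 + [400.0] * 5
plant, bess, swing = smooth_with_bess(load, dt_s=1.0)
print(f"plant holds {plant:.0f} MW; BESS swings {min(bess):.0f} to {max(bess):.0f} MW; "
      f"energy window {swing:.2f} MWh")
```

Here the plant sees a constant 700 MW while the BESS moves ±300 MW over an energy window of well under 1 MWh. The battery's power rating, not its energy capacity, is the binding number in this case.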
Deep dive: a low-inertia island's frequency-stability budget, step by step
Walk the timescales of a single disturbance on an islanded bus (say, the largest online generator trips while a training collective is at peak draw), and the architecture justifies itself.
- Sub-cycle to first cycle: the rate of change of frequency (df/dt) is set by the bus inertia; with inverters contributing none, df/dt would be violently steep, which is precisely what the synchronous condensers' physical rotational mass and the flywheels exist to damp, buying the controls time.
- First few cycles: fault current (if the disturbance is a fault, not a trip) must be high enough for differential and directional protection to discriminate and clear selectively; the sync condensers supply that current, and an inverter-only island cannot.
- Milliseconds to seconds: the grid-forming BESS, as the voltage-source reference, arrests the frequency excursion by injecting or absorbing real power instantly, holding the bus while slower sources respond.
- Seconds to minutes: the remaining engine governors ramp to pick up the lost generation's share, and the BESS state-of-charge recovers toward the setpoint that keeps it ready for the next pulse.
Every layer covers a window the next cannot. Remove the synchronous condensers and df/dt outruns the BESS controls and protection mis-operates; remove the grid-forming capability and there is no reference to hold the bus; remove the engine reserve and the BESS eventually empties. This is why the islanded GW-scale plant is a stack of inertia, fault-current, fast-power, and energy providers rather than any single hero technology — and why commissioning it (Chapter 13.4) means proving the whole stack against induced disturbances before a live cluster's goodput depends on it.
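The first-cycle part of that budget follows from the swing equation: the initial df/dt after losing ΔP of generation is ΔP · f0 / (2 · E), where E is the kinetic energy stored in the machines still connected (the sum of rating times inertia constant H). The sketch below applies that relation. The function names and all machine ratings and H values are hypothetical illustration values, not data from the text, and inverters are modeled as contributing no inertia.

```python
def stored_kinetic_energy_mws(machines) -> float:
    """Sum of rating_mva * H over connected machines, in MW-seconds.

    `machines` is a list of (rating_mva, h_seconds); inverters use h = 0.
    """
    return sum(rating_mva * h_s for rating_mva, h_s in machines)


def initial_rocof_hz_per_s(lost_mw: float, machines, f_nom_hz: float = 60.0) -> float:
    """Initial df/dt (Hz/s) after losing `lost_mw`, from the swing equation."""
    energy = stored_kinetic_energy_mws(machines)
    if energy == 0.0:
        return float("-inf")   # no inertia at all: frequency is not held
    return -lost_mw * f_nom_hz / (2.0 * energy)


# ASSUMPTION: all ratings and inertia constants below are illustrative only.
island = [(400.0, 1.5),    # remaining engines after the trip
          (200.0, 0.0)]    # BESS inverters, no physical inertia
print(initial_rocof_hz_per_s(25.0, island))

with_syncon = island + [(100.0, 6.0)]   # synchronous condensers with flywheels
print(initial_rocof_hz_per_s(25.0, with_syncon))
```

With these assumed values, adding the condensers halves the initial df/dt (from 1.25 Hz/s to 0.625 Hz/s), which is the extra time the text says the BESS controls and protection need.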
The density-ramp consequence
The electrical-integration design cannot be frozen against today's rack power, because the load it must serve is ramping 15–25x in four years — H100 ~40 kW, GB200 ~132 kW, GB300 ~142 kW, Vera Rubin ~190–230 kW, and a signposted ~600 kW–1 MW Kyber-class rack by 2027 (SemiAnalysis / NVIDIA). A denser rack does not merely draw more average power; it sharpens the transient — bigger steps, steeper di/dt, more energy per pulse — which pushes the absorber sizing and the grid-forming control bandwidth, not just the conductor ampacity. A plant whose BESS, synchronous-condenser, and paralleling controls were tuned for a 132 kW-rack pulse spectrum may be marginal against a 600 kW-rack one. The irreversible substrate to reserve here is the same logic as the slab and the water loop: BESS pad space and DC-bus headroom, synchronous-condenser foundations, switchgear lineup spares, and control-system bandwidth — the things you cannot retrofit into a live islanded plant without taking the cluster down. Size the firm-power tier to the contracted IT ramp, but size the stability and transient tier to where the pulse spectrum is going, because that is the part that breaks first and is hardest to add later.