Chapter 5.1
Thermal Fundamentals & the Density Wall
Heat is the binding physical constraint of an AI data center: every watt you put into a chip must leave it through a stack of thermal resistances that air can no longer clear, so the density of the machine you intend to run silently dictates which cooling regime you are forced into — and the boundary between regimes is a cliff, not a slope.
What you'll decide here
- The peak rack density your facility must clear — not today's accelerator but the one two generations out — because density sets the cooling regime, and the regime sets the slab, the plenum, the water, and the heat-rejection plant before any of them can be changed.
- Which side of the air-cooling cliff (~30–50 kW/rack) your design basis sits on, and therefore whether you are committing to air, a rear-door bridge, or direct-to-chip liquid before steel is cut.
- The junction-to-coolant thermal-resistance budget you are designing against — the chip vendor fixes Tjunction and TDP, leaving you only the coolant temperature and the resistance stack to spend, and that budget decides whether a cold plate can keep the silicon legal.
- The approach temperature and effectiveness you target at every heat exchanger in the chain, because each delta-T you spend narrows the free-cooling window and pushes the facility toward mechanical chilling.
- Whether the irreversible substrate (floor loading, facility water, pipe-rack and CDU space, electrical headroom) is sized for the density ramp, even where the reversible IT fit-out is matched to the current generation.
Every accelerator is, thermodynamically, a space heater that happens to do arithmetic. A 1,200 W GPU converts essentially all of its electrical input into heat, and that heat must be removed continuously and within a few degrees of a fixed temperature limit or the silicon throttles, ages, or fails. This is the one constraint in the building that does not negotiate. You can oversubscribe a fabric, defer a redundancy tier, re-price a power contract — but you cannot argue with the second law. The heat leaves through the path you built for it, at the rate physics allows, or the machine slows down to match the path you actually have.
This chapter lays the thermal foundation for all of Part 5. We start from heat-flux first principles and the thermal-resistance stack-up that connects a transistor junction to the coolant, then show why air hit a wall somewhere around 30–50 kW per rack and why that wall is a discontinuity rather than a gentle ceiling. We trace the density curve from 2020 to 2027 (40 kW H100 racks to ~132 kW GB200 NVL72 to ~600 kW Kyber-class) and map the cooling hierarchy onto it, naming where each technology saturates. We close on the thermal metrics (approach temperature, NTU/effectiveness, the chain of delta-Ts) that the rest of this part uses as working vocabulary. Density is the decision; the cooling regime is its consequence; the cliff between regimes is the most expensive boundary to cross in the wrong direction.
Heat flux and the resistance stack-up
The governing quantity is not power but heat flux — power per unit area, W/cm². A 700 W H100 die spread over roughly 8 cm² runs near 85–90 W/cm²; a Blackwell-class package past 1 kW pushes toward and beyond 100 W/cm² at the hotspot. For scale, that flux rivals a nuclear-reactor fuel rod and exceeds a domestic cooktop element by an order of magnitude. Flux is what the cooling solution actually fights, because heat removal is fundamentally limited by how much surface area you can couple to a coolant and how steep a temperature gradient you can sustain across it.
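The arithmetic behind the flux figure can be sketched in a few lines. This is a minimal illustration using the 700 W and roughly 8 cm² figures quoted above; the function name is hypothetical, and the result is a die-average, so hotspot flux will be higher.

```python
def heat_flux_w_per_cm2(power_w: float, area_cm2: float) -> float:
    """Average heat flux over a die: power divided by area."""
    if area_cm2 <= 0:
        raise ValueError("area_cm2 must be positive")
    return power_w / area_cm2


# Figures from the text: a 700 W H100 die over roughly 8 cm^2.
h100_flux = heat_flux_w_per_cm2(700.0, 8.0)
print(f"H100 average flux: {h100_flux:.1f} W/cm^2")  # prints 87.5
```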
The chip vendor hands you two fixed numbers and no others. Tjunction-max — the maximum allowable on-die temperature, typically ~90–105 °C for datacenter accelerators — is a hard reliability and functional limit; cross it and the part throttles, then degrades, then fails. TDP — the thermal design power you must remove — is set by the silicon and the workload. Everything between the junction and the coolant is your design space, and it is governed by a simple, unforgiving relation: the temperature rise from coolant to junction equals the heat removed times the thermal resistance of the path, ΔT = Q × Rθ. Fix Q (the TDP) and Tjunction-max, and the resistance you can afford collapses to a fixed budget. Spend it badly and the chip is illegal at any coolant temperature you can practically supply.
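A short sketch shows how the budget collapses once Q and Tjunction-max are fixed. It rearranges ΔT = Q × Rθ for the allowable resistance; the function names and the coolant temperatures are illustrative assumptions, not vendor figures.

```python
def resistance_budget_k_per_w(t_junction_max_c: float,
                              t_coolant_c: float,
                              tdp_w: float) -> float:
    """Largest junction-to-coolant resistance that keeps the die legal.

    Rearranged from delta_T = Q * R_theta.
    """
    if tdp_w <= 0:
        raise ValueError("tdp_w must be positive")
    return (t_junction_max_c - t_coolant_c) / tdp_w


def max_coolant_temp_c(t_junction_max_c: float,
                       tdp_w: float,
                       r_theta_k_per_w: float) -> float:
    """Warmest coolant a given resistance stack can tolerate."""
    return t_junction_max_c - tdp_w * r_theta_k_per_w


# Assumed example: 1,200 W part, 90 C junction limit, two coolant temperatures.
for coolant_c in (30.0, 40.0):
    budget = resistance_budget_k_per_w(90.0, coolant_c, 1200.0)
    print(f"coolant {coolant_c:.0f} C -> budget {budget:.4f} K/W")
```

Under these assumptions, raising the coolant from 30 °C to 40 °C shrinks the budget from 0.050 K/W to about 0.042 K/W, which is the trade the rest of the chapter keeps returning to.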
The resistance stack is a series chain — Tjunction to Tcase to the cooling medium — and like any series circuit, the largest resistor dominates. Walk it from the silicon outward:
| Stage | Interface | What it is | Why it dominates or doesn't |
|---|---|---|---|
| Junction → case | Rθ-JC (in-package) | Silicon → TIM1 → integrated heat spreader / lid | Largely fixed by the vendor's package; you cannot improve it from outside |
| Case → cold plate | TIM2 / thermal interface | Lid → second thermal interface → cold-plate base | The most abused link; a poor or pumped-out TIM2 silently adds 5–15 °C |
| Cold plate → coolant | Convective resistance | Microchannel / skived-fin base → flowing coolant film | Set by flow rate, channel geometry, and coolant; where DLC wins over air |
| Coolant → facility | Loop ΔT + CDU approach | Technology-cooling loop carries heat to the CDU heat exchanger | Cumulative; every approach temperature here narrows free-cooling headroom |
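The first three rows of the table can be walked as a series circuit. The sketch below sums per-stage resistances to get junction temperature and reports which stage dominates; every resistance value is a hypothetical placeholder chosen for illustration, not measured or vendor data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    r_theta_k_per_w: float  # hypothetical value, for illustration only


def junction_temperature_c(coolant_c: float, power_w: float, stages) -> float:
    """Series stack: resistances add, so the temperature rises add."""
    return coolant_c + power_w * sum(s.r_theta_k_per_w for s in stages)


def dominant_stage(stages) -> Stage:
    """The largest resistor in a series chain sets most of the rise."""
    return max(stages, key=lambda s: s.r_theta_k_per_w)


STACK = [
    Stage("junction_to_case", 0.020),
    Stage("case_to_cold_plate_tim2", 0.005),
    Stage("cold_plate_to_coolant", 0.015),
]

POWER_W = 1200.0    # assumed TDP
COOLANT_C = 35.0    # assumed coolant temperature at the cold plate

for stage in STACK:
    print(f"{stage.name}: +{POWER_W * stage.r_theta_k_per_w:.1f} C")
print(f"junction: {junction_temperature_c(COOLANT_C, POWER_W, STACK):.1f} C")
print(f"dominant: {dominant_stage(STACK).name}")
```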
The reason air lost is visible in the third row. The convective resistance from a surface to a fluid is inversely proportional to the product of the heat-transfer coefficient and the wetted area (Rθ = 1/(h·A)), so a higher coefficient or more area lowers it. Water's volumetric heat capacity is roughly 3,500× that of air, and its convective coefficient at a cold-plate surface is one to two orders of magnitude higher than forced air over a finned heatsink. Air can be pushed harder (more CFM, taller fins, colder supply), but each lever has sharply diminishing returns and a parasitic fan-power penalty that eventually outgrows the extra heat it removes. Liquid does not so much beat air as operate in a different regime entirely: it shrinks the case-to-coolant resistor by enough that the same TDP fits inside the same junction budget at a far more relaxed coolant temperature. → the cold-plate engineering is in Chapter 5.4; in-chip microchannels that attack Rθ-JC itself are in Chapter 16.2.
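Both comparisons can be checked numerically. The sketch below computes the volumetric heat-capacity ratio from rounded textbook properties near room temperature, then compares convective resistance, taken as 1/(h·A), for one air case and one water case. The heat-transfer coefficients and wetted areas are hypothetical round numbers chosen only to show the scaling.

```python
# Rounded textbook properties near room temperature.
WATER_RHO_KG_M3 = 998.0
WATER_CP_J_KGK = 4182.0
AIR_RHO_KG_M3 = 1.2
AIR_CP_J_KGK = 1005.0


def volumetric_heat_capacity(rho_kg_m3: float, cp_j_kgk: float) -> float:
    """Heat stored per cubic metre per kelvin, J/(m^3*K)."""
    return rho_kg_m3 * cp_j_kgk


def convective_resistance_k_per_w(h_w_m2k: float, area_m2: float) -> float:
    """Surface-to-fluid resistance: R = 1 / (h * A)."""
    if h_w_m2k <= 0 or area_m2 <= 0:
        raise ValueError("h and area must be positive")
    return 1.0 / (h_w_m2k * area_m2)


ratio = (volumetric_heat_capacity(WATER_RHO_KG_M3, WATER_CP_J_KGK)
         / volumetric_heat_capacity(AIR_RHO_KG_M3, AIR_CP_J_KGK))
print(f"water/air volumetric heat capacity: {ratio:.0f}x")

# Hypothetical operating points: a large finned air heatsink versus a
# compact cold plate with a coefficient two orders of magnitude higher.
r_air = convective_resistance_k_per_w(h_w_m2k=100.0, area_m2=0.50)
r_water = convective_resistance_k_per_w(h_w_m2k=10_000.0, area_m2=0.02)
print(f"air: {r_air:.4f} K/W, water: {r_water:.4f} K/W")
```

Even with 25 times less wetted area in this assumed case, the water-side resistance comes out four times lower.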
Why air hit a wall
Air cooling did not gradually run out of headroom; it hit a wall, and the location of the wall is one of the most consequential numbers in this guide. With aggressive hot/cold-aisle containment, tuned airflow, and cool supply air, a well-engineered hall tops out around 40–50 kW per rack, with ~41 kW a common practitioner reference for the point past which air becomes uneconomic and unreliable (ASHRAE TC 9.9; SemiAnalysis). Some operators push to 50 kW with specialized architecture; many never clear 20 kW in legacy halls. The exact figure is site-specific, but the existence of the ceiling is not.
Three physical limits converge to build the wall. First, fan power scales with the cube of airflow: doubling CFM to chase more heat roughly octuples fan power, so beyond a point each added increment of airflow costs more fan energy than the extra heat it carries away, a negative-return regime. Second, air's low heat capacity means each unit volume carries little heat; the supply-to-return delta-T is bounded by inlet and exhaust temperature limits, so more heat demands ever larger volumes, and you run out of mass flow before you run out of fans. Third, acoustic and velocity limits cap how hard you can blow before noise, vibration, and bypass airflow make the hall unworkable and the cooling ineffective at the chip. Past the wall, the curve does not bend; it breaks. There is no airflow scheme, no containment trick, no warmer ASHRAE class that closes a 90 kW gap between a 41 kW air ceiling and a 132 kW rack.
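The first two limits reduce to two short formulas: the fan affinity law (power scales with the cube of flow) and the sensible-heat balance Q = ρ · cp · V̇ · ΔT. The sketch below applies both to the 41 kW and 132 kW racks mentioned above; the 15 K supply-to-return rise and the air properties are assumptions for illustration.

```python
AIR_RHO_KG_M3 = 1.2       # assumed air density
AIR_CP_J_KGK = 1005.0     # assumed air specific heat
M3S_TO_CFM = 2118.88      # 1 m^3/s expressed in cubic feet per minute


def fan_power_ratio(flow_ratio: float) -> float:
    """Fan affinity law: power scales with the cube of airflow."""
    return flow_ratio ** 3


def airflow_m3_per_s(heat_w: float, delta_t_k: float) -> float:
    """Volumetric airflow needed to carry heat_w at a given air delta-T."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return heat_w / (AIR_RHO_KG_M3 * AIR_CP_J_KGK * delta_t_k)


print(f"doubling airflow multiplies fan power by {fan_power_ratio(2.0):.0f}")

for rack_kw in (41.0, 132.0):
    flow = airflow_m3_per_s(rack_kw * 1000.0, delta_t_k=15.0)
    print(f"{rack_kw:.0f} kW rack: {flow:.2f} m^3/s "
          f"({flow * M3S_TO_CFM:.0f} CFM) at an assumed 15 K rise")
```

Under these assumptions the 132 kW rack needs a little over three times the airflow of the 41 kW rack, which by the cube law implies roughly thirty times the fan power through the same hardware.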
This is why the boundary is a cliff, not a slope, and why it is a one-way door in a retrofit. A hall built for air has the wrong floor loading, no plenum or pipe-rack for liquid distribution, insufficient electrical headroom, and frequently no facility water provisioned at all. Crossing the cliff after the fact runs $5–10M/MW and still tends to strand capacity: power you cannot use because cooling caps first, or floor area you cannot fill because the slab cannot bear wet racks. The decision to plumb a hall for liquid is an archetype decision, not a mechanical one, and it must be made before the slab is poured. → the retrofit paths are engineered in Chapter 5.10; air pushed to its honest limit is Chapter 5.2.
The density curve, 2020–2027
The density wall would be an academic curiosity if accelerators had stayed where they were. They did not. Per-GPU thermal design power has climbed from the A100's ~300–400 W to the H100's 700 W to GB200's ~1.0–1.2 kW, with Rubin and Rubin Ultra projected near ~1.8 kW and ~2.3 kW. Multiply by the GPUs packed into a rack and the rack-level curve is steeper still, because the scale-up domain grew at the same time, concentrating more silicon behind a single liquid manifold. The result is a density ramp that crossed the air-cooling cliff somewhere in the H100-to-GB200 transition and never looked back.
| Generation (year) | Per-GPU TDP | Per-rack draw | Cooling regime forced | Relation to the air cliff |
|---|---|---|---|---|
| A100 / HGX (2020–22) | ~300–400 W | ~10–20 kW | Air; raised floor + containment | Comfortably under the wall |
| H100 / HGX (2023) | ~700 W | ~30–40 kW | Air at the limit; RDHx optional | At the wall; air still wins for many |
| GB200 NVL72 (2024–25) | ~1.0–1.2 kW | ~120–132 kW | Direct-to-chip liquid mandatory | ~3× over the wall; no air path exists |
| GB300 NVL72 (2025) | ~1.4 kW class | ~135–142 kW (up to ~155 kW peak) | DLC; residual air load on RDHx | Well over; hybrid liquid+air per rack |
| Rubin VR200 (2026) | ~1.8 kW | ~190–230 kW | DLC + 800 VDC power path | Far over; warm-water loops to free-cool |
| Rubin Ultra Kyber (2027) | ~2.3 kW | ~600 kW | DLC mandatory; in-chip microfluidics on the roadmap | An order of magnitude over the wall |
The two rightmost columns are the consequence. Once the per-rack draw crosses ~41 kW, the cooling regime is no longer a decision you get to make; the physics has made it for you. The only decisions left are when you cross (which generation your facility targets) and whether the irreversible substrate is ready when you do. A hall scoped for 40 kW air-cooled racks cannot absorb a 132 kW NVL72 generation, let alone a 600 kW Kyber-class rack: not the floor, not the power chain, not the cooling plant. The expensive mistake of the 2026 era is designing to today's density and being surprised by the ramp.
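The "relation to the air cliff" column is just a ratio, and it can be reproduced from the table. The sketch below takes the upper end of each per-rack range quoted in the table and divides by the ~41 kW reference; the dictionary layout and function name are illustrative.

```python
AIR_CEILING_KW = 41.0  # practitioner reference point quoted in the text

# Upper end of each per-rack range from the table above, in kW.
RACK_DRAW_KW = {
    "A100 / HGX": 20.0,
    "H100 / HGX": 40.0,
    "GB200 NVL72": 132.0,
    "GB300 NVL72": 142.0,
    "Rubin VR200": 230.0,
    "Rubin Ultra Kyber": 600.0,
}


def multiple_of_air_ceiling(rack_kw: float) -> float:
    """How many times over (or under) the air ceiling a rack sits."""
    return rack_kw / AIR_CEILING_KW


for name, kw in RACK_DRAW_KW.items():
    m = multiple_of_air_ceiling(kw)
    side = "under" if m <= 1.0 else "over"
    print(f"{name:<20} {kw:>5.0f} kW  {m:5.2f}x the ceiling ({side})")
```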
The cooling hierarchy and where each rung saturates
Map the density curve onto the available cooling technologies and you get a hierarchy — a ladder where each rung removes more heat at higher capital cost and integration complexity, and each rung saturates at a density that hands the load to the next. The engineering discipline is to choose the lowest rung that clears your peak density with margin, because every rung up costs money, water, and plumbing complexity you do not get back.
- Air (containment + CRAH/in-row). Saturates ~40–50 kW/rack. Cheapest, simplest, no facility water at the rack. Still the right answer for storage, networking, modest-density inference, and edge. → Chapter 5.2.
- Rear-door heat exchangers / air-assisted liquid. Bridges ~50–100 kW/rack. Captures heat at the rack exhaust with a liquid coil; the brownfield-friendly rung because it needs no chip-level plumbing and tolerates facilities without facility water. Saturates where the door coil can no longer extract a high enough fraction of the heat. → Chapter 5.3.
- Direct-to-chip liquid (single-phase DLC). The 2026 default, ~55% of the liquid-cooling market. Cold plates on the GPUs/CPUs/switches, in-rack manifolds, dripless quick-disconnects, a CDU isolating the technology-cooling loop from facility water. Clears 100 kW to 200+ kW and scales to the Kyber generation with warm-water loops. → Chapter 5.4; CDU loop in Chapter 5.6.
- Immersion (single- and two-phase). Best-in-class PUE, but niche: single-phase wins on serviceability and floor loading, two-phase stalled on the PFAS reckoning and insurability. → Chapter 5.5.
- In-chip / direct-to-silicon microfluidics. The next rung, attacking the in-package Rθ-JC resistor itself with microchannels etched into or onto the die — the only lever that touches the dominant resistance the cold plate cannot reach. Roadmap, not yet default. → Chapter 16.2.
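The "lowest rung that clears peak density with margin" rule can be sketched as a simple lookup. The saturation points below are the upper bounds quoted in the list for air and rear-door exchangers; direct-to-chip liquid is treated as unbounded here because the text gives no ceiling for it, and immersion and in-chip microfluidics are left out because the list gives them no density figures. The 20% default margin is a hypothetical parameter.

```python
import math

# (rung name, saturation density in kW/rack), ordered cheapest first.
RUNGS = [
    ("air", 50.0),
    ("rear-door heat exchanger", 100.0),
    ("direct-to-chip liquid", math.inf),  # no ceiling stated in the text
]


def select_rung(peak_rack_kw: float, margin: float = 0.20) -> str:
    """Return the lowest rung whose saturation point clears peak plus margin."""
    if peak_rack_kw <= 0 or margin < 0:
        raise ValueError("peak_rack_kw must be positive and margin non-negative")
    required_kw = peak_rack_kw * (1.0 + margin)
    for name, saturation_kw in RUNGS:
        if required_kw <= saturation_kw:
            return name
    raise RuntimeError("no rung clears the requested density")


for kw in (20.0, 45.0, 132.0):
    print(f"{kw:.0f} kW peak -> {select_rung(kw)}")
```

Note how the margin moves the boundary: a 45 kW rack sits under the air ceiling on paper, but with 20% headroom it already lands on the next rung.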
Deep dive: the chain of delta-Ts, approach temperature, and why warm water decides free cooling
Heat does not teleport from the junction to the sky; it walks down a staircase of temperature drops, and every step costs you. SemiAnalysis frames this as the Four Delta-Ts, and it is the right mental model for the entire facility. Start at the junction (~90 °C limit). Drop across the package and TIMs to the cold-plate coolant. Drop again across the loop as the coolant carries heat to the CDU. Drop a third time across the CDU heat exchanger from the technology-cooling loop to the facility-water loop. Drop a fourth time at heat rejection, where the cooling tower, dry cooler, or chiller finally hands the heat to ambient. The junction limit is fixed; ambient is fixed by your climate and season; everything in between is a budget of degrees you allocate across those four drops.
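The staircase can be treated as a subtraction: junction limit minus the sum of the four drops gives the warmest ambient the chain can reject to without mechanical help. The sketch below uses hypothetical drop sizes purely to show how a wasted degree upstream propagates to the end of the chain.

```python
def max_ambient_c(t_junction_limit_c: float, drops_k) -> float:
    """Warmest ambient that still closes the chain: limit minus all drops."""
    if any(d < 0 for d in drops_k):
        raise ValueError("temperature drops must be non-negative")
    return t_junction_limit_c - sum(drops_k)


# Hypothetical allocation of the four drops, in kelvin.
BASELINE = {
    "junction_to_coolant": 40.0,
    "loop_delta_t": 10.0,
    "cdu_approach": 4.0,
    "heat_rejection_approach": 8.0,
}

baseline_ambient = max_ambient_c(90.0, BASELINE.values())
print(f"baseline: rejects to ambient up to {baseline_ambient:.0f} C")

# Same chain with an assumed 10 K lost to a degraded thermal interface.
degraded = dict(BASELINE, junction_to_coolant=50.0)
degraded_ambient = max_ambient_c(90.0, degraded.values())
print(f"degraded TIM: rejects to ambient up to {degraded_ambient:.0f} C")
```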
The lever at each exchanger is approach temperature: the residual gap between one stream leaving a heat exchanger and the other stream entering it (for a CDU, technology-loop supply minus facility-water supply), a gap that never fully closes. A tighter approach requires a more effective (and larger, costlier) exchanger, and in return allows a warmer supply on the cold side. Formally this is captured by effectiveness and the NTU (number of transfer units) method: effectiveness is the actual heat transferred divided by the thermodynamic maximum, and it rises with NTU, which rises with exchanger surface area and overall conductance. More area buys more effectiveness buys a tighter approach buys warmer facility water for the same junction temperature.
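The link between exchanger size and approach can be made concrete with the standard ε-NTU relation for a counterflow exchanger. The sketch below assumes counterflow geometry and takes approach as hot-stream outlet minus cold-stream inlet, one common convention; the heat-capacity rates and inlet temperatures are hypothetical.

```python
import math


def effectiveness_counterflow(ntu: float, c_ratio: float) -> float:
    """Standard epsilon-NTU relation for a counterflow heat exchanger.

    c_ratio is C_min / C_max, between 0 and 1.
    """
    if ntu < 0 or not 0.0 <= c_ratio <= 1.0:
        raise ValueError("need ntu >= 0 and 0 <= c_ratio <= 1")
    if math.isclose(c_ratio, 1.0):
        return ntu / (1.0 + ntu)  # balanced-flow limit
    e = math.exp(-ntu * (1.0 - c_ratio))
    return (1.0 - e) / (1.0 - c_ratio * e)


def hot_outlet_approach_k(ntu, c_hot_w_per_k, c_cold_w_per_k,
                          t_hot_in_c, t_cold_in_c):
    """Approach taken as hot-stream outlet minus cold-stream inlet."""
    c_min = min(c_hot_w_per_k, c_cold_w_per_k)
    c_max = max(c_hot_w_per_k, c_cold_w_per_k)
    eps = effectiveness_counterflow(ntu, c_min / c_max)
    q_w = eps * c_min * (t_hot_in_c - t_cold_in_c)
    t_hot_out_c = t_hot_in_c - q_w / c_hot_w_per_k
    return t_hot_out_c - t_cold_in_c


# Hypothetical balanced exchanger: 45 C loop return against 30 C facility water.
for ntu in (1.0, 3.0, 6.0):
    approach = hot_outlet_approach_k(ntu, 20_000.0, 20_000.0, 45.0, 30.0)
    print(f"NTU {ntu:.0f}: approach {approach:.2f} K")
```

In this assumed case, going from NTU 1 to NTU 3 cuts the approach from 7.5 K to 3.75 K, while doubling again to NTU 6 recovers only about 1.6 K more, which is the diminishing return the text describes.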
Why does warmer water matter so much? Because it is the difference between free cooling and mechanical chilling. If your facility-water loop can run warm — ASHRAE's W17 through W45-plus classes key cooling water by supply temperature — a dry cooler or tower can reject heat to ambient for most or all of the year, and a PUE near 1.1 is reachable. If your delta-T budget forces cold supply water, you burn compressor energy on chillers, drive PUE up, and shrink your siting envelope to cool climates. Every degree you waste on a sloppy TIM or an under-sized exchanger upstream is a degree you cannot spend on free cooling downstream. This is why the 30 °C-coolant roadmap exists, and why warm-water design is treated as a first-class objective rather than an afterthought. → facility loops and warm-water design in Chapter 5.7; heat rejection in Chapter 5.8; the metric definitions in Chapter 15.1.
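A rough way to see the effect is to count how often ambient plus the heat-rejection approach stays at or below the required supply temperature. The sketch below does that over a tiny synthetic list of ambient temperatures; the samples, the 8 K approach, and the two supply temperatures are invented for illustration and are not climate data or ASHRAE class limits.

```python
def free_cooling_fraction(ambient_samples_c, supply_c: float,
                          rejection_approach_k: float) -> float:
    """Share of samples where ambient plus approach can meet the supply."""
    samples = list(ambient_samples_c)
    if not samples:
        raise ValueError("need at least one ambient sample")
    ok = sum(1 for t in samples if t + rejection_approach_k <= supply_c)
    return ok / len(samples)


# Synthetic ambient temperatures (C), standing in for a real annual record.
SYNTHETIC_AMBIENT_C = [5.0, 12.0, 18.0, 24.0, 30.0, 36.0]

cold = free_cooling_fraction(SYNTHETIC_AMBIENT_C, supply_c=20.0,
                             rejection_approach_k=8.0)
warm = free_cooling_fraction(SYNTHETIC_AMBIENT_C, supply_c=40.0,
                             rejection_approach_k=8.0)
print(f"cold-water design: {cold:.0%} of samples free-cooled")
print(f"warm-water design: {warm:.0%} of samples free-cooled")
```

A real study would replace the synthetic list with hourly weather data for the site and use wet-bulb rather than dry-bulb temperature where evaporative rejection is used.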
Thermal metrics used in this part
Part 5 leans on a small, consistent vocabulary of thermal metrics. Pin them down here so the engineering chapters can use them without re-deriving:
- Approach temperature: the residual gap between one stream leaving a heat exchanger and the other stream entering it. A smaller approach means a more effective and more expensive exchanger, and a warmer achievable cold-side supply. The single knob you tune at every exchanger in the chain.
- Effectiveness (ε) and NTU: effectiveness is actual heat transfer over the thermodynamic maximum; NTU is the dimensionless measure of exchanger size (overall heat-transfer coefficient × area, UA, over the minimum heat-capacity rate, C_min). The ε-NTU method is how you size CDU and facility heat exchangers without solving the full temperature field. As NTU rises, ε approaches its ceiling (ε = 1 for a counterflow exchanger) with diminishing returns.
- Delta-T (ΔT): the temperature rise a coolant carries across a load. A larger ΔT moves the same heat at lower flow (smaller pumps, smaller pipes), which is why warm-water, high-ΔT design is favored. The flow-rate rule of thumb (~1.2–2.0 L/min per kW) falls directly out of the ΔT you choose.
- Heat flux (W/cm²): power per die area, the quantity the cold plate actually fights at the hotspot, distinct from total TDP.
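The flow-rate rule of thumb can be recovered from the heat balance Q = ṁ · cp · ΔT. The sketch below assumes plain water with rounded textbook properties; glycol mixtures have a lower specific heat and would need somewhat more flow for the same ΔT.

```python
WATER_RHO_KG_PER_L = 0.998   # assumed density of plain water
WATER_CP_J_KGK = 4182.0      # assumed specific heat of plain water


def flow_l_per_min_per_kw(delta_t_k: float) -> float:
    """Coolant flow needed to carry 1 kW at a chosen loop delta-T."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    mass_flow_kg_s = 1000.0 / (WATER_CP_J_KGK * delta_t_k)
    return mass_flow_kg_s / WATER_RHO_KG_PER_L * 60.0


for dt in (7.0, 10.0, 12.0):
    print(f"delta-T {dt:.0f} K -> {flow_l_per_min_per_kw(dt):.2f} L/min per kW")
```

With these properties the quoted ~1.2–2.0 L/min per kW range corresponds to a loop ΔT of roughly 12 K down to 7 K.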
The facility-efficiency metrics — PUE, WUE, ITUE, and TUE — sit one level up, scoring the whole plant rather than a single exchanger. PUE is total facility energy over IT energy; WUE is water consumed per IT energy; ITUE and TUE extend the accounting to capture fan and pump parasitics that liquid cooling reshuffles. These are defined canonically in Chapter 15.1 and used throughout Part 5 as the scorecard for the design choices this chapter sets up; we name them here only so the cross-references resolve.
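As a placeholder until the canonical definitions in Chapter 15.1, the two simplest ratios can be written out directly from the sentence above. The function names and the annual energy and water figures are hypothetical, chosen so the PUE lands near the 1.1 mentioned earlier in the chapter.

```python
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Power usage effectiveness: total facility energy over IT energy."""
    if it_kwh <= 0:
        raise ValueError("it_kwh must be positive")
    return total_facility_kwh / it_kwh


def wue_l_per_kwh(water_litres: float, it_kwh: float) -> float:
    """Water usage effectiveness: water consumed per unit of IT energy."""
    if it_kwh <= 0:
        raise ValueError("it_kwh must be positive")
    return water_litres / it_kwh


# Hypothetical accounting period: 10,000 kWh of IT load.
print(f"PUE: {pue(11_000.0, 10_000.0):.2f}")
print(f"WUE: {wue_l_per_kwh(18_000.0, 10_000.0):.2f} L/kWh")
```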