The Definitive Guide to AI Data Centers

Appendix C

Decision Tables & Calculators

Most of the irreversible decisions in this guide reduce to a handful of arithmetic checks — cost per useful GPU-hour, cost per million tokens, the density that forces liquid, the redundancy tier the workload actually values, and whether a contracted cash flow can carry its debt. This appendix is the live calculator layer plus the static reference tables that anchor it.

What you'll decide here

  1. Use the live Calculators for the numbers that change with your inputs — TCO and $/GPU-hr, inference $/M-tokens, PUE/WUE, and the density→cooling verdict. They run entirely in your browser and nothing you type leaves the page.
  2. Use the static tables on this page for the numbers that do not change much: the density→cooling-cliff map (so you can read the whole curve at once, not one density at a time) and the redundancy-topology selector (which tier a workload justifies).
  3. Treat every default in the calculators as a placeholder, not a recommendation — replace it with your own quoted capex, contracted power price, measured throughput, and real utilization before you trust an output.
  4. Carry the outputs back to the chapter that owns the decision: the TCO/token math to Chapter 1.8, the project-finance ratios to Chapter 2.5, the cooling verdict to Chapters 5.1 and 5.4, and the redundancy choice to Chapter 12.2.
  5. Sanity-check any single calculator result against the dated figures in Appendix D before you put it in a board deck — a back-of-envelope model is only as good as the vintage of the numbers you feed it.

This appendix is realized as a live, interactive tool on the site, not a wall of worked arithmetic. The four calculators (cluster TCO and cost per GPU-hour, inference cost per million tokens, PUE/WUE from measured loads, and the density→cooling-cliff verdict) run in your browser with editable inputs, so the right way to "read" them is to open the tool and put your own numbers in. What lives on this page is the surrounding reference layer: what each calculator computes, the assumptions and caveats baked into its formula, and the static tables that are more useful seen whole than queried one value at a time.

The discipline these tools enforce is the one the rest of the guide keeps returning to: the expensive decisions are not opinions, they are checks. Is this rack density over the air-cooling cliff or under it? Does this workload value the nines a 2N plant buys, or is it checkpoint-tolerant and indifferent? Can a contracted cash flow service its debt at the coverage ratio a lender will accept? Each is a few lines of arithmetic — and getting the arithmetic wrong at scoping time is how facilities end up mismatched to the revenue they were built to earn.

What each calculator computes

The four live calculators
Calculator | Inputs | Reads out | Key lever | Owning chapter
Cluster TCO & $/GPU-hr | GPU count, all-in $/GPU, kW/GPU, PUE, $/kWh, amortization yr, utilization %, opex % | All-in cost per useful GPU-hour + capex/energy/opex split | Utilization and amortization life (the depreciation lever) | Chapter 1.8
Inference $/M-tokens | GPU $/hr, throughput (tok/s/GPU), sell price ($/M-tok) | Cost per million tokens + gross margin at your sell price | Throughput (tokens/sec/GPU): batch, precision, model size | Chapter 10.11
PUE / WUE | Total facility kW, IT/critical kW, annual water (m³) | PUE and WUE (L/kWh) from measured loads | Cooling modality (air vs liquid vs evaporative) | Chapter 15.1
Density → cooling cliff | Rack density (kW/rack) | Which cooling modality the density forces | Rack density vs the ~41 kW air ceiling | Chapter 5.1
Each runs client-side on editable inputs. The 'reads out' column is the headline number; the 'key lever' column is the input that moves it most.

The two cost calculators chain together: the $/GPU-hr that falls out of the TCO model is the natural input to the inference token model. Run the cluster TCO first to get a defensible cost basis, then feed that figure (not a rental rate) into the $/M-tokens calculator to see whether your sell price actually clears margin. The PUE/WUE tool is the facility-efficiency counterpart — it takes the same kind of measured loads and returns the two metrics every operator is now reported on. The density tool is the fast pre-check: type your target rack density and it returns, in one line, whether that density commits you to liquid.
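The chain described above can be sketched in a few lines of Python. This is a minimal illustration, not the live tool's code: the names (`ClusterInputs`, `cost_per_useful_gpu_hour`, `cost_per_million_tokens`, `gross_margin`) are hypothetical, opex is assumed to be a flat annual fraction of total capex, and power is assumed to be drawn at the full kW/GPU all year regardless of utilization. Every input value in the example is a placeholder.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass
class ClusterInputs:
    gpu_count: int
    capex_per_gpu: float       # all-in $ per GPU
    kw_per_gpu: float          # IT power per GPU, kW
    pue: float                 # facility power / IT power
    price_per_kwh: float       # $ per kWh
    amortization_years: float  # straight-line life
    utilization: float         # fraction of hours doing useful work, 0..1
    opex_fraction: float       # ASSUMPTION: annual opex as a fraction of total capex


def cost_per_useful_gpu_hour(c: ClusterInputs) -> dict:
    """Return $/useful-GPU-hr plus its capex/energy/opex split."""
    capex_total = c.gpu_count * c.capex_per_gpu
    capex_per_year = capex_total / c.amortization_years  # straight-line
    # ASSUMPTION: power is drawn at full kW/GPU all year, idle or busy.
    energy_per_year = (c.gpu_count * c.kw_per_gpu * c.pue
                       * HOURS_PER_YEAR * c.price_per_kwh)
    opex_per_year = capex_total * c.opex_fraction
    # Only utilized hours count as "useful", so idle time raises the unit cost.
    useful_hours = c.gpu_count * HOURS_PER_YEAR * c.utilization
    split = {
        "capex": capex_per_year / useful_hours,
        "energy": energy_per_year / useful_hours,
        "opex": opex_per_year / useful_hours,
    }
    split["total"] = split["capex"] + split["energy"] + split["opex"]
    return split


def cost_per_million_tokens(gpu_cost_per_hour: float,
                            tokens_per_sec_per_gpu: float) -> float:
    """Linear in throughput: $/hr divided by tokens/hr, scaled to 1M tokens."""
    tokens_per_hour = tokens_per_sec_per_gpu * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


def gross_margin(sell_price_per_m: float, cost_per_m: float) -> float:
    """Gross margin as a fraction of the sell price."""
    return (sell_price_per_m - cost_per_m) / sell_price_per_m


if __name__ == "__main__":
    # Placeholder inputs only; replace with quoted capex and measured throughput.
    cluster = ClusterInputs(gpu_count=1000, capex_per_gpu=40_000, kw_per_gpu=1.0,
                            pue=1.2, price_per_kwh=0.10, amortization_years=4,
                            utilization=0.8, opex_fraction=0.05)
    tco = cost_per_useful_gpu_hour(cluster)
    per_m = cost_per_million_tokens(tco["total"], tokens_per_sec_per_gpu=350)
    print(f"$/GPU-hr: {tco['total']:.2f}  $/M-tok: {per_m:.2f}  "
          f"margin at $2.50: {gross_margin(2.50, per_m):.1%}")
```

Feeding `tco["total"]` rather than a rental rate into the token model mirrors the ordering recommended above: the cost basis comes first, the margin check second.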

Density → cooling-cliff reference map

The single calculator answers one density at a time; this table is the whole curve, so you can see where your roadmap crosses each threshold. The cliff is a discontinuity, not a slope — air saturates near 41 kW/rack and there is no airflow trick that closes a 90 kW gap. The bands below mirror the live tool's verdict logic and the engineering in Chapter 5.1 (the density wall) and Chapter 5.4 (DLC).

Density → cooling-cliff map
Rack density | Regime | Cooling modality | What it commits you to | Example hardware
< 15 kW | Comfortable air | Hot/cold-aisle containment | Raised floor or slab; CRAH/in-row; warm ASHRAE A1–A4 supply | Legacy enterprise / general compute
15–40 kW | Air at the limit | Tuned containment, optional rear-door HX | Aggressive airflow management; RDHx as a bridge, no facility water at rack | HGX H100/B200 air-cooled (~40 kW)
40–80 kW | The transition zone | RDHx / air-assisted liquid → DLC | Rear-door or AALC now; plan the path to direct-to-chip | Dense inference; bridge retrofits
80–150 kW | Over the cliff | Direct-to-chip liquid (warm water) | Cold plates, in-rack manifolds, CDU, facility water loop, tight delta-T | GB200/GB300 NVL72 (~132–142 kW)
> 150 kW | Beyond air entirely | DLC mandatory + sidecar power | 800 VDC sidecar power; reinforced floor; immersion only in niches | Rubin VR200 (~190–230 kW); Kyber (~600 kW)
Air ceiling ~41 kW/rack per ASHRAE TC 9.9; bands match the live density calculator. GB200 NVL72 ≈132 kW; Rubin Ultra Kyber ≈600 kW (roadmap). Sources: ASHRAE TC 9.9; SemiAnalysis Datacenter Anatomy; NVIDIA roadmap, 2025–2027.
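The band lookup in the map above is simple enough to express as a short Python sketch. The function name `cooling_verdict` is hypothetical, and the boundary handling is an assumption: the table does not say which band owns an exact edge such as 40 kW, so this sketch treats upper edges as inclusive (except the strict "< 15 kW" band). The live tool may draw those edges differently.

```python
# Upper edge of each band (kW/rack), regime, and cooling modality,
# copied from the density → cooling-cliff map.
BANDS = [
    (15.0, "Comfortable air", "Hot/cold-aisle containment"),
    (40.0, "Air at the limit", "Tuned containment, optional rear-door HX"),
    (80.0, "The transition zone", "RDHx / air-assisted liquid, path to DLC"),
    (150.0, "Over the cliff", "Direct-to-chip liquid (warm water)"),
]
BEYOND = ("Beyond air entirely", "DLC mandatory + sidecar power")


def cooling_verdict(kw_per_rack: float) -> tuple[str, str]:
    """Return (regime, modality) for a rack density in kW/rack."""
    if kw_per_rack <= 0:
        raise ValueError("rack density must be positive")
    for index, (upper, regime, modality) in enumerate(BANDS):
        # ASSUMPTION: the first band is strict (< 15 kW); later bands
        # include their upper edge, so exactly 40 kW still reads as air.
        if kw_per_rack < upper or (index > 0 and kw_per_rack == upper):
            return regime, modality
    return BEYOND


if __name__ == "__main__":
    # Densities taken from the example-hardware column of the map.
    for density in (10, 40, 60, 132, 600):
        regime, modality = cooling_verdict(density)
        print(f"{density:>4} kW/rack: {regime} ({modality})")
```

Like the calculator it imitates, this is a modality gate only; it says nothing about CDU sizing, coolant flow, or delta-T.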

Redundancy-topology selector

The second decision this appendix anchors is redundancy: how many nines the workload actually values, not how many the building can be made to deliver. Interruption tolerance is the lever. A synchronous training job already restarts from a checkpoint when any node fails, so spending on 2N facility power to prevent a restart is largely wasted; an always-on inference business breaches an SLA and loses revenue on every outage, so it justifies the premium. Use this selector to pick the tier the workload earns, then carry it to Chapter 12.2, which reframes the whole question as goodput versus facility availability.

Redundancy-topology selector
Topology | Availability (~hr/yr down) | Capital posture | Best-fit workload | Why
N (no redundancy) | Lowest | Cheapest | Batch inference; spot/curtailable training | Interruption-tolerant: queue-and-retry or checkpoint-and-resume absorbs failures
N+1 | Mid | Modest premium | Pre-training; post-training/RL | Checkpoint-tolerant jobs; spend the savings on goodput, not nines
2N | High | Higher (redundant chain) | Online inference at scale; multi-tenant colo | No single point of failure; an outage is lost revenue and a breached SLO
Tier III (99.982%, ~1.6 hr) | ~1.6 hr/yr | Concurrently maintainable | Production inference; enterprise SLAs | Service any component without downtime; the common commercial baseline
Tier IV (99.995%, ~26 min) | ~26 min/yr | +20–40% over Tier III | Mission-critical, revenue-on-the-line serving | Fault-tolerant + concurrently maintainable; the maximum, rarely worth it for training
Availability percentages are the historically published Uptime Tier figures (Uptime Institute now disavows specific percentages); Tier IV carries a ~20–40% capital premium over Tier III. Source: Uptime Institute, 2025.
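The downtime figures in the selector follow directly from the availability percentages: expected downtime is the unavailable fraction of an 8,760-hour year. A minimal Python sketch, with a hypothetical function name and a non-leap year assumed:

```python
HOURS_PER_YEAR = 8760  # ASSUMPTION: non-leap year


def downtime_hours_per_year(availability_pct: float) -> float:
    """Expected annual downtime implied by an availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    tier_iii = downtime_hours_per_year(99.982)
    tier_iv = downtime_hours_per_year(99.995)
    print(f"Tier III: {tier_iii:.2f} hr/yr")       # roughly 1.6 hr
    print(f"Tier IV:  {tier_iv * 60:.1f} min/yr")  # roughly 26 min
```

The same arithmetic works in reverse when a workload's tolerable downtime is known and the question is which published figure it corresponds to.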

Project-finance model: the ratios that gate a deal

The levered-IRR / project-finance model is the companion to Chapter 2.5. Its mechanics are a standard infrastructure pro-forma — build a contracted (or merchant) revenue line, subtract opex to EBITDA, discount the unlevered free cash flow to an NPV and unlevered IRR, then layer the debt structure to solve for levered IRR, with the DSCR (cash available for debt service ÷ debt service) sizing how much leverage the cash flow can carry. The output that matters is not a single IRR point estimate but the sensitivity tornado: which assumption — utilization, power price, GPU residual, depreciation life, contract tenor — moves the answer most. In the 2026 market, the dominant fork is contracted versus merchant, because it sets both the discount rate and the achievable gearing.
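The mechanics in this paragraph can be sketched as a toy pro-forma in Python. Everything here is an illustrative assumption rather than the guide's model: the function names are hypothetical, debt is a level-payment (annuity) loan, CFADS is flat every year, and taxes, fees, reserves, and residual value are ignored. The IRR solver is a plain bisection that assumes one sign change in the cash flows.

```python
def dscr(cfads: float, debt_service: float) -> float:
    """Debt service coverage ratio: cash available for debt service / debt service."""
    return cfads / debt_service


def annuity_factor(rate: float, years: int) -> float:
    """Present value of 1 per year for `years` years at `rate`."""
    if rate == 0:
        return float(years)
    return (1 - (1 + rate) ** -years) / rate


def debt_capacity(cfads: float, min_dscr: float, rate: float, tenor_years: int) -> float:
    """Largest level-payment loan whose debt service keeps DSCR at the minimum."""
    max_debt_service = cfads / min_dscr
    return max_debt_service * annuity_factor(rate, tenor_years)


def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] is at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: list[float], lo: float = -0.99, hi: float = 1.0) -> float:
    """Bisection IRR. ASSUMPTION: one sign change, so NPV falls as rate rises."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if npv(mid, cash_flows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def levered_cash_flows(capex: float, cfads: float, debt: float, rate: float,
                       tenor_years: int, life_years: int) -> list[float]:
    """Equity cash flows: equity cheque at t=0, then CFADS less debt service."""
    service = debt / annuity_factor(rate, tenor_years)
    flows = [-(capex - debt)]
    for year in range(1, life_years + 1):
        flows.append(cfads - (service if year <= tenor_years else 0.0))
    return flows


if __name__ == "__main__":
    # Placeholder deal: capex 1000, flat CFADS 150/yr for 15 years, 7% debt, 10-yr tenor.
    capex, cfads, rate, tenor, life = 1000.0, 150.0, 0.07, 10, 15
    unlevered = irr([-capex] + [cfads] * life)
    for label, min_dscr in (("contracted", 1.25), ("merchant", 1.75)):
        debt = debt_capacity(cfads, min_dscr, rate, tenor)
        levered = irr(levered_cash_flows(capex, cfads, debt, rate, tenor, life))
        print(f"{label}: debt {debt:.0f}, unlevered IRR {unlevered:.1%}, "
              f"levered IRR {levered:.1%}")
```

Running the two DSCR floors side by side shows the mechanism behind the contracted-versus-merchant fork: a lower required coverage ratio supports more debt on the same cash flow. A sensitivity tornado would repeat this calculation while perturbing one input at a time.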

Project-finance terms and 2026-typical ranges
Term | What it measures | Contracted (offtake-backed) | Merchant (uncontracted) | Note
DSCR (min) | CFADS ÷ debt service (the coverage cushion) | ~1.20–1.45x | ~1.75–2.00x | Revenue risk sets the floor; hyperscaler-grade leases compress it toward ~1.32x
Gearing (debt %) | Debt as share of capital stack | Higher (offtake supports leverage) | Lower (equity-heavy) | Contract quality directly enables debt capacity
Discount rate / WACC | Hurdle the cash flows must clear | Lower (investment-grade backstop) | Higher (price + offtake risk) | Bifurcation of cost of capital is the 2026 market inflection
Unlevered IRR / NPV | Project return before financing | Set by capex, utilization, price | Same drivers, wider band | The sensitivity tornado lives here
Levered IRR | Equity return after debt | Amplified by cheap, deep debt | Thinner debt → less amplification | The headline equity number; sensitive to rate and depreciation
Coverage and leverage ranges are 2026 practitioner figures; data-center debt is increasingly secured against contracted capacity, not unsecured corporate balance sheets. Sources: GreenBridge Infrastructure; Percepture AI Investor Playbook; Morgan Stanley / JPMorgan issuance estimates, 2026.

  • ~$0.74/GPU-hr: TCO at 2048-GPU scale, 90% utilization; ~$1.03 for small clusters (build-up cost, not rental) (contested; single-source). Source: SemiAnalysis, 'How much do GPU clusters really cost', 2025.
  • ~$1.90/M tok: self-hosted inference (8x H100 @ ~$19.20/hr, Llama-70B FP16); market avg fell ~$10 → ~$2.50/M in a year. Source: Introl / NVIDIA synthesis, 2025.
  • ~41 kW: practical air-cooling ceiling per rack; RDHx ~50–100 kW; DLC 200+ kW. Source: ASHRAE TC 9.9; SemiAnalysis Datacenter Anatomy, 2025.
  • ~132 kW: GB200 NVL72 rack (~115 kW liquid + ~17 kW air); Kyber roadmap ~600 kW on 800 VDC. Source: NVIDIA OCP / Introl; NVIDIA roadmap, 2025–2027.
  • PUE 1.05–1.15: direct-to-chip liquid design band (legacy air 1.4–1.6; two-phase immersion 1.01–1.10). Source: SemiAnalysis / Uptime Institute, 2025.
  • ~70%: breakeven utilization for a debt-financed neocloud cluster; below it the unit economics invert (contested; single-source). Source: AM Compute / McKinsey, 2025.
  • 1.20–2.00x: minimum DSCR band, ~1.20–1.45x contracted vs ~1.75–2.00x merchant. Source: GreenBridge Infrastructure; Percepture, 2026.
  • Tier IV ~26 min/yr: published availability (99.995%); ~20–40% capital premium over Tier III (~1.6 hr/yr). Source: Uptime Institute, 2025.

Assumptions & caveats baked into the calculators

Back-of-envelope models are only as honest as their assumptions. The defaults here are deliberately simple, and a few simplifications matter:

  • TCO is straight-line. The model amortizes capex evenly over the chosen life and applies opex as a flat percentage. It does not model GPU residual value, the depreciation debate (2–3 yr economic vs 5–6 yr book life), or financing cost — those live in the project-finance model and Chapter 1.8. Utilization is the single most sensitive input.
  • Token economics assume steady-state throughput. The $/M-tokens figure is linear in tokens/sec/GPU, which is itself a function of model size, precision (FP16 vs FP8/FP4), batch size, and decode-vs-prefill mix. A reasoning model with long decode sequences will read very differently from the default. → Chapter 10.11.
  • PUE/WUE are instantaneous, not annualized. The tool computes a point ratio from the loads you enter. Real reported PUE/WUE are annualized over a full climate cycle and move with outside-air temperature and load factor. → Chapter 15.1.
  • The density verdict is a modality gate, not a full thermal design. It tells you which regime a density forces; it does not size CDUs, coolant flow, or delta-T. Those are engineered in Chapter 5.4.
  • All figures are vintage-stamped. Hardware, pricing, and rates in 2026 move fast. Cross-check any load-bearing number against the dated register in Appendix D before you rely on it.
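To make the PUE/WUE caveat above concrete, here is a minimal Python sketch of the point-ratio arithmetic. The function names are hypothetical, and the WUE calculation rests on a loud assumption: the single IT load you enter is held constant for all 8,760 hours to estimate annual IT energy, which is exactly the simplification that separates a point estimate from a properly annualized figure.

```python
HOURS_PER_YEAR = 8760
LITRES_PER_CUBIC_METRE = 1000


def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT power (point ratio)."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_kw


def wue(annual_water_m3: float, it_kw: float) -> float:
    """Water usage effectiveness in L/kWh of IT energy.

    ASSUMPTION: the entered IT load runs flat all year, so annual IT
    energy is it_kw * 8760. Real reporting uses metered annual energy.
    """
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    annual_it_kwh = it_kw * HOURS_PER_YEAR
    return annual_water_m3 * LITRES_PER_CUBIC_METRE / annual_it_kwh


if __name__ == "__main__":
    # Placeholder loads only; enter measured values.
    print(f"PUE: {pue(total_facility_kw=1200, it_kw=1000):.2f}")
    print(f"WUE: {wue(annual_water_m3=8760, it_kw=1000):.2f} L/kWh")
```

Because both ratios divide by IT load, an error in the IT/critical kW reading moves both metrics at once, which is worth checking before comparing against reported annualized values.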

The TCO, $/GPU-hr, and $/M-token economics are derived in full in Chapter 1.8, with the procurement fork they price in Chapter 1.6 and metric definitions in Chapter 0.3. The levered-IRR / project-finance model is the companion to Chapter 2.5; insurance and risk transfer sit alongside it in Chapter 2.6. The density→cooling cliff is engineered in Chapter 5.1, the RDHx/AALC bridge in Chapter 5.3, and DLC in Chapter 5.4. Fabric oversubscription (the networking-taxonomy companion) is in Chapter 8.5; inference serving economics in Chapter 10.11. The redundancy selector connects to the goodput-vs-availability rethink in Chapter 12.2 and the quantitative reliability model in Chapter 12.5. Efficiency metrics are in Chapter 15.1. Every figure on this page is dated and sourced in Appendix D.