The Definitive Guide to AI Data Centers

Chapter 1.8

Business Models, Economics & ROI

An AI data center is a depreciating capital asset whose return is decided by four numbers — capex per watt, the depreciation life you assume, the utilization you actually achieve, and the price you can still charge after the market deflates it — and getting any one of them wrong turns a 'factory' into a stranded balance-sheet liability.

GOODPUT · POWER-BOUND

What you'll decide here

  1. Which depreciation life you underwrite the asset against — the 5–6 year book life that flatters near-term earnings, or the 2–3 year frontier-economic life that the workload actually obeys — because that single assumption is the dominant lever on every TCO, $/GPU-hr, and breakeven number downstream.
  2. Which operating archetype you are (hyperscaler, neocloud, colo/build-to-suit, or self-build) and therefore whose cost-of-capital, utilization risk, and margin structure you inherit.
  3. The utilization you can credibly contract or fill — because below the ~70% debt-financed breakeven the same hardware that prints money at 85% bleeds cash, and the swing is wider than any engineering optimization can recover.
  4. How much of your revenue is contracted/take-or-pay versus merchant/spot — the split that sets your debt capacity, your exposure to the ~10x/yr token-price deflation, and whether a single non-renewal strands the asset.
  5. Which downside you are designed to survive — utilization collapse, a residual-value shock, a rate spike, or a contract non-renewal — and whether the design-for-flexibility you paid for actually hedges it.

[Figure: One assumption, a ~70% swing. Same hardware, same power; the number that decides whether it pencils out is an accounting choice. Bar chart of annual all-in cost to run a 1 GW AI fleet (~$38B deployed), by assumed GPU depreciation life: $12.7B at a 3-year life (aggressive, contested), $8.5B at a 5-year life (common base case), $7.0B at a 7-year life (optimistic).]
The depreciation-life assumption swings annual cost by ~70% on the same fleet — the single largest lever in whether a build pencils out.

Every chapter before this one spends money; this is the chapter that decides whether the money comes back. An AI data center is not, financially, a building — it is a depreciating capital asset with a very short fuse, dominated by silicon that loses value faster than almost any industrial equipment ever financed at this scale. The engineering decisions in Parts 5 through 12 set the cost stack; the market decisions in Part 16 set the demand; this chapter is where the two meet in a single objective function: does the asset earn its cost of capital before the workload, the hardware, or the price curve makes it obsolete?

We build the cost stack and the TCO denominator; we confront the depreciation debate that quietly determines whether the whole industry is profitable; we lay out the $/GPU-hr pricing ladder and the breakeven that governs it; we trace inference unit economics down to $/M-tokens and the gross-margin waterfall that the application layer lives or dies on; we score build-vs-own-vs-lease as an NPV with an explicit option value; and we close on the operating archetypes and the downside stress tests. The through-line: most of the decisive numbers in AI-infrastructure economics are contested, and the contested ones are exactly the ones the return depends on. We flag them as we go.

The asset and its cost stack

Start with the denominator. The canonical bottom-up reference is a 1 GW AI data center: roughly $38B total-program capex and ~$8.5B/yr all-in TCO once costs are annualized over their respective asset lives (Epoch AI, 2026). The $38B is the total program: the ~$27.9/W core IT-plus-power-plus-shell stack described below, plus land, the multi-year build-out, and financing carried to energization. The TCO works out to about $8.5M per MW per year. That is the single number to carry in your head when someone quotes you a lease or a colo rate, because it is the all-in cost you are implicitly benchmarking against.

The cost stack inverts the intuition of a traditional data center. In a legacy facility the building and power plant dominate; in an AI factory the silicon dominates everything. The split is roughly: IT/servers ~60–64% (about $17.50/W), power + cooling + electrical ~29–30% (about $7–10/W), and the building shell ~7% (about $1.90/W) — a core capital intensity near $27.90/W (≈ $17.50 + $8.50 + $1.90) for the IT-plus-power-plus-shell scope of an AI-optimized build, before land, multi-year build-out, and financing (Epoch AI; domain synthesis, 2026). The consequence of a server-dominated stack is profound: the asset's economic life is the GPU's economic life, not the concrete's. You can amortize a shell over thirty years; you cannot amortize a frontier accelerator over thirty years, and pretending otherwise is the original sin of AI-infrastructure accounting.
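The shares quoted above follow directly from the per-watt figures. A minimal sketch of that arithmetic, assuming the $8.50/W midpoint that the text's own sum uses for the power, cooling, and electrical layer; the dictionary keys and function name are illustrative, not taken from the source.

```python
# Core capex per watt by layer, using the figures quoted in the text.
# 8.50 is the midpoint the text uses for the ~$7-10/W power/cooling/electrical range.
CORE_CAPEX_PER_WATT = {
    "it_servers": 17.50,
    "power_cooling_electrical": 8.50,
    "building_shell": 1.90,
}
WATTS_PER_GW = 1e9


def stack_shares(per_watt):
    """Return (total $/W, {layer: share of total})."""
    total = sum(per_watt.values())
    return total, {layer: cost / total for layer, cost in per_watt.items()}


total_per_watt, shares = stack_shares(CORE_CAPEX_PER_WATT)
core_capex_usd = total_per_watt * WATTS_PER_GW

print(f"core stack: ${total_per_watt:.2f}/W, ${core_capex_usd / 1e9:.1f}B per GW")
for layer, share in shares.items():
    print(f"  {layer}: {share:.1%}")
```

The difference between this core figure and the ~$38B program total is the land, multi-year build-out, and carried financing that sit outside the core stack.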

The costing denominator matters as much as the numerator. Quote a facility in $/MW-year and you are comparing real estate; quote it in $/GPU-hour and you are comparing compute supply; quote it in $/M-tokens and you are comparing the product the customer actually buys. The three denominators are linked by utilization and by tokens-per-GPU-second, and a number that looks competitive in one can be uncompetitive in another. Naming the denominator is the first act of honesty in any AI-infrastructure pro-forma. → metric definitions in Chapter 0.3.

GPU economics, depreciation and obsolescence

This is the canonical home for the depreciation debate, because it is where the contested figures do the most damage if mis-set. Begin with the unit. An 8-GPU H100 SXM server runs ~$250k–$400k (~$27k–$40k/GPU all-in). Depreciation on a $300k server is ~$50k/yr over six years, ~$60k/yr over five, ~$75k/yr over four — and that one line is the largest single component of a self-operated cluster's cost. The depreciation schedule is therefore not a footnote; it is the cost structure.
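The per-year figures above are plain straight-line depreciation. A minimal sketch, assuming zero residual value (which is what the quoted $50k/$60k/$75k figures imply) and spreading the charge over every hour of the year as if the server were fully utilized; the function name is illustrative.

```python
HOURS_PER_YEAR = 8760
GPUS_PER_SERVER = 8
SERVER_COST_USD = 300_000  # the $300k 8-GPU server used in the text


def straight_line_depreciation(cost_usd, life_years, residual_usd=0.0):
    """Annual straight-line depreciation charge."""
    if life_years <= 0:
        raise ValueError("life_years must be positive")
    return (cost_usd - residual_usd) / life_years


annual_charge = {
    life: straight_line_depreciation(SERVER_COST_USD, life) for life in (6, 5, 4)
}

for life, charge in annual_charge.items():
    # Depreciation alone, per GPU-hour, if every hour of the year were sold.
    per_gpu_hour = charge / (GPUS_PER_SERVER * HOURS_PER_YEAR)
    print(f"{life}-year life: ${charge:,.0f}/yr, ${per_gpu_hour:.2f}/GPU-hr")
```

Passing a non-zero `residual_usd` shows how much of the schedule depends on the contested resale value discussed next.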

The bull case — the reason a 5–6 year book life is defensible — is the training-to-inference cascade. A GPU retired from frontier pre-training is not scrap; it cascades down to post-training, then to inference serving, then to batch and internal workloads, earning revenue at each step. If the cascade holds, the economic life stretches toward the book life and the accounting is honest. The bear case is that the cascade is finite (there is only so much inference demand for a two-generations-old part), that each new generation is so much more efficient per token that the old part is uneconomic to run against grid power, and that the residual market is thin. Both can be partly true at once.

The residual evidence is genuinely mixed, which is why it is CONTESTED. H100s retained ~60–83% of value at 18 months, but secondary rental rates fell 64–75% from their $8–10/hr peak, and the implied residual after three years is ~20–40% (Hashrate Index; CNBC synthesis, 2025). A residual that high underwrites the cascade defense; a residual that low validates the short-life bears. The hyperscalers themselves disagree in public: Meta extended server life from 4.0 to 5.5 years (+$2.9B income); Amazon went the other way, 6 to 5 years (−$700M), in the same window (company filings, 2025). When the largest operators move depreciation in opposite directions, no outside party should pretend the number is settled.

Deep dive: the Burry thesis and why understated depreciation is a systemic question, not a stock pick

The sharpest version of the bear case is the claim that the industry is systematically under-depreciating its AI fleet — booking long lives to flatter earnings while the assets decay on the short schedule. The most-cited estimate puts ~$176B of understated depreciation across 2026–2028 for the major operators, against an industry AI-asset D&A line approaching ~$400B/yr (Michael Burry / secondary analyses, 2025–2026). The mechanism is simple accounting: every year you extend the assumed useful life, you move cost off the current income statement, so reported operating margin rises even though nothing about the physical asset improved.

Why this matters beyond a single short position: depreciation policy is the hinge between two completely different pictures of AI-infrastructure profitability. On the long life, the build-out is a high-margin growth story. On the short life, a large share of current 'earnings' is borrowed from a future write-down. The honest engineering-economics posture is not to pick a side but to model the cash flows on the economic life and the reported earnings on the book life, and watch the gap — because the gap is where stranded-asset risk hides. This is CONTESTED and the figures bind to the dated forecast register. → Appendix D; macro framing in Chapter 16.4.
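One way to "watch the gap" is to run the same asset through both schedules and track the cumulative difference. The sketch below is illustrative only: it assumes straight-line depreciation, whole-year lives, zero residual, and the $300k server from earlier, with a 6-year book life set against a 3-year economic life. The function names are hypothetical.

```python
def annual_depreciation(cost, life_years, horizon_years):
    """Straight-line charge per year; zero once the asset is fully depreciated."""
    return [
        cost / life_years if year < life_years else 0.0
        for year in range(horizon_years)
    ]


def cumulative_gap(cost, book_life, economic_life):
    """Cumulative (economic - book) depreciation at the end of each year.

    A positive value is cost the economic schedule has already recognized
    but the book schedule has not yet expensed.
    """
    horizon = max(book_life, economic_life)
    book = annual_depreciation(cost, book_life, horizon)
    econ = annual_depreciation(cost, economic_life, horizon)
    gap, running = [], 0.0
    for b, e in zip(book, econ):
        running += e - b
        gap.append(running)
    return gap


gap = cumulative_gap(cost=300_000, book_life=6, economic_life=3)
for year, value in enumerate(gap, start=1):
    print(f"end of year {year}: ${value:,.0f} not yet expensed on the book schedule")
```

Under these assumptions the gap peaks at the end of the economic life, when half the asset's cost is still sitting on the balance sheet, and only closes when the book schedule finishes.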

Pricing, utilization and revenue management

Cost is half the equation; the other half is what you can charge and how full you keep the asset. The $/GPU-hour ladder in 2026 spans more than an order of magnitude for the same H100: a ~$1.03/hr spot floor, a neocloud median of ~$2.29–3.50/hr, AWS on-demand at ~$6.88/hr, and Azure at ~$12.29/hr (SemiAnalysis H100 Index; AM Compute, 2026). Neoclouds price 40–85% below the hyperscalers because they sell raw capacity without the managed-services envelope. The ladder is not static: the 1-year contract index rose ~40% from October 2025 to March 2026 as supply tightened, a reminder that GPU pricing is a commodity market with real cycles, not a SaaS price list.

Against that revenue ladder sits the cost the operator actually carries. A self-operated build lands near $0.74/GPU-hr at 2048-GPU scale and 90% utilization, rising to ~$1.03/hr for small clusters (SemiAnalysis, 2025). The spread between that cost and the rental ladder is the gross margin, but only if the asset is full. Utilization is the silent variable that dominates the whole pro-forma.
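To see why utilization dominates, restate the cost per sold GPU-hour at other fill levels. The sketch below makes a strong simplifying assumption that is not in the source: total cost is treated as fixed, so the cost per sold hour scales inversely with utilization. In practice energy and some opex fall with load, so this overstates the penalty somewhat. The function names are illustrative.

```python
REFERENCE_COST = 0.74  # $/GPU-hr at the reference utilization (from the text)
REFERENCE_UTIL = 0.90


def cost_per_sold_hour(utilization, ref_cost=REFERENCE_COST, ref_util=REFERENCE_UTIL):
    """Cost per *sold* GPU-hour, assuming total cost does not vary with utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    cost_per_available_hour = ref_cost * ref_util
    return cost_per_available_hour / utilization


def gross_margin(price, utilization):
    """Gross margin as a fraction of price at a given utilization."""
    return 1 - cost_per_sold_hour(utilization) / price


for util in (0.55, 0.70, 0.85, 0.90):
    cost = cost_per_sold_hour(util)
    print(
        f"util {util:.0%}: cost ${cost:.2f}/hr, "
        f"margin at $1.03 spot {gross_margin(1.03, util):+.0%}, "
        f"at $2.29 neocloud {gross_margin(2.29, util):+.0%}"
    )
```

Under this fixed-cost assumption the spot floor stops covering cost somewhere below roughly two-thirds utilization, which is the sense in which the margin exists "only if the asset is full".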

Revenue management therefore reduces to filling the asset above the cliff and tiering the fill by value. Revenue per GW of AI capacity runs ~$10–12B/GW/yr (SemiAnalysis, 2025 — a contested, single-source figure), which is why speed-to-power has direct dollar value: energizing 200 MW six months early is worth roughly $1–1.2B in incremental revenue against a depreciation clock that is already running. The revenue-per-MW you can actually realize tiers by archetype — interactive inference at a latency premium, batch at a spot discount — and the mix you contract determines whether you sit comfortably above breakeven or hope for it.
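The speed-to-power figure is simple proportional arithmetic on the revenue-per-GW estimate. A minimal sketch, assuming revenue scales linearly with capacity and time and that the early capacity is filled immediately; the function name is illustrative.

```python
def early_energization_revenue(capacity_mw, months_early, revenue_per_gw_year):
    """Incremental revenue from bringing capacity online early.

    Assumes revenue scales linearly with capacity and time and that the
    capacity is sold from the first day it is energized.
    """
    capacity_gw = capacity_mw / 1000
    years_early = months_early / 12
    return capacity_gw * years_early * revenue_per_gw_year


low = early_energization_revenue(200, 6, 10e9)   # $10B/GW/yr end of the range
high = early_energization_revenue(200, 6, 12e9)  # $12B/GW/yr end of the range
print(f"200 MW, six months early: ${low / 1e9:.1f}B to ${high / 1e9:.1f}B")
```

Because the revenue-per-GW input is itself flagged as contested and single-source, the output inherits that uncertainty.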

The $/GPU-hour ladder vs self-operated cost (H100-class, 2026)
| Supply channel | Price / cost ($/GPU-hr) | What it includes | Implied posture |
| --- | --- | --- | --- |
| Spot floor | ~$1.03 | Bare capacity, interruptible, no SLA | Below or at self-op cost; a buyer's market signal |
| Neocloud median | ~$2.29–3.50 | Reserved capacity, basic SLA, fast time-to-job | 40–85% under hyperscaler; the volume tier |
| AWS on-demand | ~$6.88 | Managed, integrated, enterprise SLA | Convenience and trust premium |
| Azure on-demand | ~$12.29 | Managed, integrated, enterprise SLA | Top of the ladder; managed-services envelope |
| Self-operated cost (2048 GPU @ 90%) | ~$0.74 | Your own all-in TCO, excludes margin | The cost you must beat to justify building |
| Self-operated cost (small cluster) | ~$1.03 | Sub-scale all-in TCO | Scale penalty erases much of the build advantage |
Rental rates: SemiAnalysis H100 Index / AM Compute, 2026. Self-operated cost: SemiAnalysis TCO build-up, 2025. Margin shown only where the operator owns the asset; rental rows are what a tenant pays, not a margin.

Inference revenue and unit economics

This is the canonical home for inference unit economics, because inference is now ~2/3 of AI compute and the workload most operators actually monetize. The build-up runs from physics to price: tokens/GPU-second → $/GPU-hour → $/M-tokens. A worked example: an 8x H100 node at ~$19.20/hr serving Llama-70B at ~2,800 tokens/sec lands near ~$1.90 per million tokens self-hosted (Introl / NVIDIA synthesis, 2025) — though the number is brutally sensitive to model size, precision, and batch efficiency. The same hardware can swing the cost several-fold depending on how well you batch and how long the decode sequences run.
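The worked example above reduces to one division. A minimal sketch that reproduces it, assuming the quoted throughput is the aggregate sustained rate for the whole node and that the node is serving tokens for the full hour; the function name is illustrative.

```python
SECONDS_PER_HOUR = 3600


def cost_per_million_tokens(node_cost_per_hour, tokens_per_second):
    """Self-hosted serving cost in $ per million tokens.

    Assumes tokens_per_second is the sustained aggregate throughput of the
    whole node and that the node is busy for the full hour.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return node_cost_per_hour / tokens_per_hour * 1_000_000


base = cost_per_million_tokens(19.20, 2800)
half_throughput = cost_per_million_tokens(19.20, 1400)
print(f"8x H100 at $19.20/hr, 2,800 tok/s: ${base:.2f} per M tokens")
print(f"same node at half the throughput:  ${half_throughput:.2f} per M tokens")
```

The second line illustrates the sensitivity the text describes: cost per token is inversely proportional to throughput, so anything that halves batch efficiency doubles the unit cost.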

The application layer that sits on top earns a gross margin that is structurally worse than software's, and this is the figure most often missed in AI business plans. AI-app gross margins run ~41% rising toward ~52% in 2026 (application-layer specifically nearer 45%), against traditional SaaS at 70–90% (ICONIQ State of AI 2026; Bessemer, 2026). Inference COGS averages ~23% of revenue at scaling-stage AI companies — for every $1M of AI product revenue, roughly $230k is consumed by inference. The gross-margin waterfall is: list price, minus token COGS, minus the inevitable free-tier and retry overhead, minus the cost of the long decode sequences that reasoning models emit. Every layer of that waterfall is under pressure from the layer below it.

Build vs own vs lease: the NPV and the option value

Chapter 1.6 framed the procurement fork qualitatively; this is its quantitative home. The benchmark to anchor against is wholesale colocation: ~$217/kW-month global average in 2025 (Ashburn ~$215 at record highs; range ~$120 in Atlanta to ~$250 in Silicon Valley, up to ~$450 in Singapore), with build-to-suit / credit-tenant leases at ~$150–220/kW-month over 15 years and vacancy near 1% (JLL / CBRE, 2025). Convert $/kW-month to $/MW-year ($217/kW-month is about $2.6M/MW-year) and compare it against the ~$8.5M/MW-year all-in TCO of a self-build, keeping scope in mind: the lease rate buys powered space, and the tenant still owns and depreciates the IT on top of it. On a like-for-like basis, leasing trades a higher steady-state unit cost for capex-light speed and, critically, optionality under demand uncertainty.
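The unit conversion is worth making explicit, since colo rates and TCO are quoted in different denominators. A minimal sketch using the benchmark rates quoted above; the dictionary and function name are illustrative, and the converted figures cover the leased space only, not the tenant's IT.

```python
KW_PER_MW = 1000
MONTHS_PER_YEAR = 12


def kw_month_to_mw_year(rate_per_kw_month):
    """Convert a lease rate in $/kW-month to $/MW-year."""
    return rate_per_kw_month * KW_PER_MW * MONTHS_PER_YEAR


# Benchmark rates quoted in the text, in $/kW-month.
BENCHMARKS = {
    "Atlanta": 120,
    "BTS low": 150,
    "Ashburn": 215,
    "Global average": 217,
    "BTS high": 220,
    "Silicon Valley": 250,
    "Singapore": 450,
}

mw_year = {name: kw_month_to_mw_year(rate) for name, rate in BENCHMARKS.items()}
for name, value in mw_year.items():
    print(f"{name}: ${value / 1e6:.2f}M per MW-year")
```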

The NPV comparison is necessary but not sufficient, because a flat DCF understates the value of being able to change your mind. When demand is uncertain — the normal state in 2026 — a lease is a real option: the right, not the obligation, to continue holding capacity, exercisable as demand resolves. A self-build forecloses that option; you own the megawatts whether or not the workload materializes. The correct comparison prices the option premium: how much extra $/MW-year is the exit/flex right worth, given your demand variance? For a durable, well-forecast workload at scale the option is nearly worthless and the build wins on unit cost. For a spiky or uncertain workload the option dominates and leasing or renting wins even at a higher headline rate. The fork is not build-vs-lease in the abstract — it is how confident is your demand forecast, expressed as an option price.
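A toy model can make "expressed as an option price" concrete. The sketch below is not a real-options valuation and none of its parameters come from the source: it assumes only two outcomes (demand materializes with probability `p_demand`, or it does not), ignores discounting and revenue, and assumes a flexible contract can be exited after `exit_after_years` while an owned asset is paid for over the full horizon. It returns the largest premium over the owned unit cost at which the flexible contract still matches owning on expected cost.

```python
def breakeven_flex_premium(p_demand, horizon_years, exit_after_years):
    """Largest fractional premium a flexible contract can carry and still
    match the expected cost of owning, in a two-outcome model.

    Owning pays the unit cost for the full horizon in both outcomes.
    The flexible contract pays for the full horizon only if demand shows up;
    otherwise it is exited after exit_after_years.
    """
    if not 0 <= p_demand <= 1:
        raise ValueError("p_demand must be in [0, 1]")
    if not 0 < exit_after_years <= horizon_years:
        raise ValueError("exit_after_years must be in (0, horizon_years]")
    expected_years_paid = (
        p_demand * horizon_years + (1 - p_demand) * exit_after_years
    )
    return horizon_years / expected_years_paid - 1


for p in (1.0, 0.9, 0.6):
    premium = breakeven_flex_premium(p, horizon_years=5, exit_after_years=1)
    print(f"P(demand) = {p:.0%}: flexibility is worth up to a {premium:.0%} premium")
```

With certain demand the premium is zero, matching the claim that the option is nearly worthless for a durable, well-forecast workload; as confidence falls, the rate premium a buyer can rationally accept rises quickly.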

Build vs own vs lease vs rent — the quantitative scorecard
| Mode | Unit cost (steady state) | Capital intensity | Time-to-power | Option value under uncertainty |
| --- | --- | --- | --- | --- |
| Self-build (own) | Lowest at scale (~$8.5M/MW-yr all-in) | Highest (full capex) | 24–36 months | Lowest: you own the MW regardless of demand |
| Build-to-suit lease | ~$150–220/kW-mo (~$1.8–2.6M/MW-yr) | Capex-light (lease) | 12–24 months | Moderate: long term limits exit |
| Wholesale colo | ~$217/kW-mo avg (~$2.6M/MW-yr) | Capex-light (lease + IT) | 6–12 months | High: shorter terms preserve exit |
| Neocloud / rental | ~$2.29–3.50/GPU-hr (opex) | Opex only | Days to weeks | Highest: pure pay-as-you-go optionality |
Self-build TCO ~$8.5M/MW-yr (Epoch AI). Colo benchmarks JLL/CBRE 2025. Neocloud rates SemiAnalysis/AM Compute 2026. 'Option value' is the real-option premium under demand uncertainty, not a dollar figure.

Financing strategy: why the capital structure shapes the asset

How you finance the asset changes what you can build and what survives a downturn — the deal mechanics live in Chapter 2.5, but the strategic logic belongs here because it feeds straight into the ROI scorecard. The defining feature of the 2026 build-out is that it has outgrown self-funding: against a multi-year build estimated near $2.9T (2025–2028) with a ~$1.5T financing gap beyond hyperscaler cash flow (Morgan Stanley, 2025), the market reached for GPU-collateralized debt, delayed-draw term loans (DDTLs), bankruptcy-remote SPVs, and asset-backed securitization. ABS issuance ran ~$27B in 2025 and is projected toward $30–40B/yr in 2026–2027.

The strategic catch is that the collateral is the very asset whose value is contested. GPU-backed lending underwrites a depreciating, deflating asset against a thin secondary market — the same residual-value uncertainty from the depreciation debate, now wired into the capital structure. CoreWeave is the visible test case: FY25 revenue $5.13B and 60% adjusted-EBITDA margin, but a −$1.17B net loss, ~$21–25B of debt, interest near 46% of EBITDA, and a ~$66.8B backlog (~13x revenue) concentrated in a few anchor tenants (company filings, 2026). The 'circular financing' critique — vendor stakes and residual backstops that let a buyer finance the purchase of the vendor's own chips — is a real structural risk, not a talking point: it couples the financing to the same demand and residual assumptions the equipment depends on, so a residual shock hits collateral, covenants, and revenue at once.
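The leverage picture in that test case can be restated as a few ratios derived from the quoted figures. The sketch below only rearranges numbers given in the text; the EBITDA, interest, and coverage values it prints are implied by those inputs, not separately reported, and the variable names are illustrative.

```python
# Inputs as quoted in the text (FY25).
revenue = 5.13e9
adjusted_ebitda_margin = 0.60
interest_share_of_ebitda = 0.46
backlog = 66.8e9
debt_range = (21e9, 25e9)

# Implied values (derived here, not reported figures).
implied_ebitda = revenue * adjusted_ebitda_margin
implied_interest = implied_ebitda * interest_share_of_ebitda
interest_coverage = implied_ebitda / implied_interest
backlog_multiple = backlog / revenue
debt_to_ebitda = tuple(debt / implied_ebitda for debt in debt_range)

print(f"implied adjusted EBITDA: ${implied_ebitda / 1e9:.2f}B")
print(f"implied interest:        ${implied_interest / 1e9:.2f}B")
print(f"interest coverage:       {interest_coverage:.2f}x")
print(f"backlog / revenue:       {backlog_multiple:.1f}x")
print(f"debt / EBITDA:           {debt_to_ebitda[0]:.1f}x to {debt_to_ebitda[1]:.1f}x")
```

An interest coverage only a little above 2x is what makes the rate-spike and residual-shock scenarios later in the chapter bite.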

~$38B / ~$8.5B/yr
1 GW AI data center: total-program capex (core stack ~$27.9/W plus land, build-out, financing) and all-in annual TCO (~$8.5M/MW-yr)
the all-in check a single gigawatt writes; the scale that decides if you can play
Source: Epoch AI, AI datacenter cost breakdown (2026)

$12B / $8.5B / $7B
1 GW annual TCO at 3-yr / 5-yr / 7-yr IT useful life; the dominant lever
your single biggest cost swings ~70% on one accounting assumption nobody agrees on
Source: Epoch AI / AM Compute synthesis (2025)

~$0.74/GPU-hr
self-operated TCO at 2048-GPU scale, 90% util; ~$1.03 small clusters (contested, single-source)
the cost floor that decides whether owning beats renting at your utilization
Source: SemiAnalysis, GPU cluster cost (2025)

~70%
breakeven utilization (debt-financed); 1,024-GPU cluster swings -$330k to +$340k/mo (contested, single-source)
below this the same hardware that prints money bleeds cash
Source: AM Compute / McKinsey (2025)

~10x/yr
LLMflation: inference cost decline at fixed quality (Epoch: ~50x/yr median)
the price you can charge collapses yearly; today's pricing won't survive a refresh
Source: a16z; Epoch AI (2024-2026)

~41% to ~52%
AI-app gross margin vs 70-90% for mature SaaS
compute cost means AI apps may never earn SaaS margins; a thinner valuation case
Source: ICONIQ State of AI 2026; Bessemer (2026)

~$217/kW-mo
wholesale colo global avg 2025; BTS/CTL ~$150-220/kW-mo over 15 yr
the lease rate that sets your fixed cost if you rent the building instead of owning it
Source: JLL / CBRE synthesis (2025)

~$176B
estimated understated AI D&A 2026-2028 (CONTESTED); industry AI D&A ~$400B/yr
if true, industry profits are overstated this much; a sector-wide earnings risk
Source: Burry / secondary analyses; filings (2026)

The four operating archetypes

The same physical asset earns very different returns depending on who operates it, because each archetype inherits a different cost-of-capital, utilization risk, and margin structure. They are four distinct business models that happen to share a bill of materials.

The hyperscaler finances from operating cash flow at the lowest cost of capital, fills the asset with its own first-party demand (search, ads, cloud, internal training), and treats the data center as cost-of-revenue for a far larger product. Utilization risk is low because demand is captive; the depreciation policy is the visible lever, which is why hyperscaler life-extensions move billions of reported income. The neocloud is the opposite: thin margins, high leverage, GPU-backed debt, and acute exposure to the ~70% breakeven and to tenant concentration — a high-beta bet on sustained GPU demand. The colo / build-to-suit operator sells powered shells and steady $/kW-month rent, carries real-estate-like risk and real-estate-like cost of capital, and is largely insulated from GPU obsolescence because the tenant owns the silicon. The self-build enterprise/lab optimizes for control and long-run unit cost on a durable, well-forecast workload, accepting the deepest capital commitment and the full obsolescence risk in exchange.

Operating archetype → economic structure
| Archetype | Cost of capital | Utilization risk | Margin structure | Obsolescence exposure |
| --- | --- | --- | --- | --- |
| Hyperscaler | Lowest (op cash flow) | Low: captive first-party demand | Cost-of-revenue for a larger product | High but absorbed; depreciation policy is the lever |
| Neocloud / GPU cloud | High (GPU-backed debt) | High: merchant demand, tenant concentration | Thin, leveraged, ~70% breakeven | Highest: owns silicon, sells hours |
| Colo / build-to-suit | Real-estate-like | Low-moderate: long leases | Steady $/kW-mo rent | Low: tenant owns the GPUs |
| Self-build (enterprise/lab) | Corporate/project finance | Self-imposed: own workload | Internal cost; lowest unit cost at scale | Full: owns and runs to economic end-of-life |
Synthesis of McKinsey neocloud, AM Compute, JLL, and company filings, 2025-2026. 'Obsolescence exposure' = who carries GPU residual-value risk.

Protecting ROI and stress-testing the downside

Protecting the return comes down to a small set of levers, each of which maps to a downside it hedges. Power cost is the largest controllable opex line (energy is ~$0.6B/yr in the 1 GW model), so a cheap, firm, long-dated PPA or on-site generation is worth more to lifetime ROI than most capex optimizations. Depreciation policy is the lever that decides whether reported margin reflects reality; the conservative choice protects against a residual shock at the cost of near-term earnings. Design-for-flexibility (reserving floor loading, water, and electrical headroom for a density ramp, and keeping procurement mode hybrid) is the lever that hedges workload and generation uncertainty. The ROI scorecard ties them together: levered IRR, DSCR, payback against economic (not book) life, and the contracted-vs-merchant revenue split that sets debt capacity.
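The scorecard metrics named above can be computed with a few lines each. The sketch below is a generic illustration, not the Appendix C calculator: the cash flows are hypothetical placeholders (in $M), the IRR solver is a plain bisection that assumes one initial outflow followed by inflows, and payback is measured in whole years.

```python
def npv(rate, cash_flows):
    """Net present value of annual cash flows, with cash_flows[0] at time zero."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-9):
    """IRR by bisection; assumes a single sign change in the cash flows."""
    if npv(lo, cash_flows) * npv(hi, cash_flows) > 0:
        raise ValueError("no IRR bracketed in [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(lo, cash_flows) * npv(mid, cash_flows) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def payback_years(cash_flows):
    """First year in which cumulative cash flow is non-negative, else None."""
    running = 0.0
    for year, cf in enumerate(cash_flows):
        running += cf
        if running >= 0:
            return year
    return None


def dscr(cash_available_for_debt_service, debt_service):
    """Debt service coverage ratio for one period."""
    return cash_available_for_debt_service / debt_service


# Hypothetical levered equity cash flows in $M: year-0 outlay, then four inflows.
equity_flows = [-100, 40, 40, 40, 40]
ECONOMIC_LIFE_YEARS = 3  # short end of the depreciation debate
BOOK_LIFE_YEARS = 6

levered_irr = irr(equity_flows)
payback = payback_years(equity_flows)
print(f"levered IRR: {levered_irr:.1%}")
print(f"payback: year {payback} (economic life {ECONOMIC_LIFE_YEARS}, book life {BOOK_LIFE_YEARS})")
print(f"DSCR at 60 available vs 40 due: {dscr(60, 40):.2f}x")
```

In this made-up case payback lands exactly at the assumed economic life, which looks comfortable against a 6-year book life and leaves no margin against a 3-year one; that is the reason to measure payback against economic rather than book life.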

Then stress the downside, because the asset's fragility lives in the tails:

  • Utilization collapse. The dominant risk. Falling from 85% to 55% utilization flips a profitable cluster into a cash-burning one (a ~$670k/month swing on 1,024 GPUs). The hedge is contracted/take-or-pay revenue; the failure mode is a merchant fleet selling into a soft GPU-rental market.
  • Residual-value shock. If three-year residuals collapse from ~40% toward the ~20% low end of the range, GPU-backed debt is under-collateralized and the short-life depreciation bears are vindicated. The hedge is conservative depreciation and limited leverage; the failure mode is circular financing against an optimistic residual.
  • Rate spike. Highly-levered builds (interest already ~46% of EBITDA at the visible neocloud) are acutely rate-sensitive; a financing-cost spike can exceed the entire margin. The hedge is fixed-rate, long-dated debt and a contracted revenue base.
  • Contract non-renewal. Backlogs concentrated in a few anchor tenants (~13x revenue at the visible case) mean one non-renewal can strand a campus. The hedge is tenant diversification and take-or-pay with real termination economics.
  • The secondary-market-depth (Burry) thesis. The whole defense of the long life rests on a deep, liquid secondary GPU market that can absorb cascaded hardware at a stable residual. If that market is thin, the cascade is a story rather than a cash flow, and a large slice of reported industry earnings is borrowed from a future write-down. This is the systemic version of the residual shock.
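The utilization-collapse case can be turned into a quick breakeven check. The sketch below rests on two assumptions that are not stated outright in the source: that the quoted monthly results for a 1,024-GPU cluster (about -$330k and +$340k) correspond to 55% and 85% utilization respectively, and that monthly P&L is linear in utilization between them. The function names are illustrative.

```python
# Assumed mapping: the two monthly P&L endpoints quoted for a 1,024-GPU
# cluster are read as the results at 55% and 85% utilization.
LOW_POINT = (0.55, -330_000)   # (utilization, monthly P&L in USD)
HIGH_POINT = (0.85, 340_000)


def pnl_slope(low=LOW_POINT, high=HIGH_POINT):
    """Monthly P&L gained per unit of utilization (1.0 = 100 points)."""
    return (high[1] - low[1]) / (high[0] - low[0])


def monthly_pnl(utilization, low=LOW_POINT, high=HIGH_POINT):
    """Linear interpolation (or extrapolation) of monthly P&L."""
    return low[1] + pnl_slope(low, high) * (utilization - low[0])


def breakeven_utilization(low=LOW_POINT, high=HIGH_POINT):
    """Utilization at which the interpolated monthly P&L is zero."""
    return low[0] - low[1] / pnl_slope(low, high)


breakeven = breakeven_utilization()
print(f"breakeven utilization: {breakeven:.1%}")
print(f"P&L per utilization point: ${pnl_slope() / 100:,.0f}/month")
for util in (0.55, 0.65, 0.75, 0.85):
    print(f"  {util:.0%}: ${monthly_pnl(util):+,.0f}/month")
```

Under those assumptions the interpolated breakeven falls just under 70%, consistent with the debt-financed breakeven cited throughout the chapter, and each point of utilization is worth a little over $22k a month on a cluster of this size.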

Deep dive: why design-for-flexibility is the cheapest downside hedge you can buy

Most of the downside cases above are expensive to hedge after the fact and cheap to hedge at scoping time — which is the entire argument for spending an option premium early. A merchant operator cannot manufacture a take-or-pay contract once utilization has already collapsed; but it can, at design time, keep its procurement mode hybrid (a colo anchor plus neocloud overflow) so that a demand miss sheds opex instead of stranding capex. An operator cannot retrofit a soft residual market; but it can choose a conservative depreciation life up front so the balance sheet already assumes the bear case. And an operator cannot re-pour a slab for a denser generation mid-life; but it can, per Chapter 1.1, reserve the floor loading, water, and electrical headroom that let the asset absorb a 5x density jump (Hopper ~40 kW → Blackwell ~130 kW → Rubin Ultra ~600 kW), capturing the ~5x revenue-per-MW of a generation step instead of being stranded one generation behind.

The unifying principle: the downside cases are correlated — a demand miss tends to arrive with a residual shock and a financing squeeze at the same time, because they share the same underlying cause (AI demand resolving lower than the build assumed). Flexibility is valuable precisely because it is the one hedge that pays off across all of the correlated tails at once: it lets you shrink, defer, or re-mix the asset rather than carry a fixed cost into a falling market. Price the flexibility premium against the joint probability of the tails, not each one in isolation. → structural scenarios in Chapter 16.5.

The procurement fork this chapter prices is framed in Chapter 1.6 and the workload archetypes that set the cost stack in Chapter 1.1. Metric definitions and costing denominators are in Chapter 0.3. The deal mechanics behind the financing strategy — SPVs, DDTLs, securitization, the underwriting model — live in Chapter 2.5. Inference serving economics are engineered in Chapter 10.11 and billed in Chapter 10.9. Refresh, depreciation execution, and decommissioning are Chapter 14.9. The sector-macro altitude is Chapter 16.4 and the 2030 scenarios Chapter 16.5. The contested figures bind to the dated forecast register in Appendix D; the levered-IRR and $/M-token calculators are in Appendix C.