Appendix B

Reference Designs & Worked Examples

A reference design is the cascade made arithmetic: pick the archetype, the accelerator generation, and the scalable unit, and the megawatts, liters-per-minute, fiber strands, switch ports, and dollars all fall out of a small set of multipliers you can carry in your head — this appendix supplies those multipliers and works three reference builds end-to-end (a scalable-unit budget, a 50 MW campus, and a 100k-GPU cluster BOM) so you can sanity-check any vendor's sizing in an afternoon.

What you'll decide here

Start from the per-archetype design-basis sheet that matches your dominant workload — it fixes density tier, cooling modality, fabric blocking, redundancy, and the GPU:CPU/storage/network ratios that every later number multiplies against.
Treat the scalable unit (SU) as the atomic costing and deployment block: size the SU once from the budget table, then multiply — campuses and clusters in this appendix are all integer counts of SUs, not bespoke arithmetic.
Use the 50 MW campus and 100k-GPU BOM as order-of-magnitude calibrators, not bids: counts are exact from the ratios; dollar figures are 2025–2026 list/street ranges that move quarterly and must be re-quoted.
When a vendor proposal disagrees with these tables by more than ~20% on a count (racks, CDUs, switches, optics), find out why before you sign — the divergence is usually a hidden oversubscription, redundancy, or generation assumption.
Re-derive, do not interpolate, when you change generation (GB200 → GB300 → VR200 → Kyber): density, flow, and busbar current step discontinuously, so the multipliers in §2 are generation-stamped on purpose.
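The ~20% divergence check on vendor counts can be scripted in a few lines. The sketch below is a minimal illustration, not a procurement tool: the function names, the dictionary layout, and the vendor figures are hypothetical, and the CDU reference of 38 is simply the midpoint of the ~36–40 range derived for the 50 MW campus in §3.

```python
def count_divergence(reference: float, proposal: float) -> float:
    """Signed fractional difference of a vendor count against the reference count."""
    if reference <= 0:
        raise ValueError("reference count must be positive")
    return (proposal - reference) / reference


def flag_divergent_counts(reference: dict, proposal: dict, threshold: float = 0.20) -> dict:
    """Return {item: divergence} for items whose absolute divergence exceeds the threshold."""
    flagged = {}
    for item, ref_count in reference.items():
        if item not in proposal:
            continue  # vendor did not quote this item, so there is nothing to compare
        divergence = count_divergence(ref_count, proposal[item])
        if abs(divergence) > threshold:
            flagged[item] = round(divergence, 3)
    return flagged


# Reference counts: 50 MW campus figures from this appendix (CDU midpoint assumed).
reference = {"racks": 384, "cdus": 38, "gpus": 27648}
# Vendor counts: invented purely for illustration.
vendor = {"racks": 384, "cdus": 28, "gpus": 27648}

flagged = flag_divergent_counts(reference, vendor)
print(flagged)
```

A flagged item is a prompt to ask about oversubscription, redundancy, or generation assumptions, not proof that the proposal is wrong.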

This appendix is the reusable arithmetic layer behind Part 1's archetype framework (Chapter 1.1) and Part 1's requirements matrix (Chapter 1.7). It does four things, in order: (1) a per-archetype design-basis sheet that freezes the inputs every later number inherits; (2) a scalable-unit (SU) budget giving the power, cooling, water, and network draw of one atomic deployment block per accelerator generation; (3) a 50 MW campus sized from the SU up, with the power chain, cooling plant, and water loop derived; and (4) a 100k-GPU cluster reference BOM with counts and rough 2025–2026 costs for GPUs, racks, CDUs, switches, optics, and storage.

The discipline throughout is multiplier-first. A reference design is a chain of ratios: GPUs per rack, racks per SU, kW per rack, L/min per kW, NICs per node, optics per NIC, GB/s per GPU of storage. Once those are pinned, every aggregate is a multiplication you can audit. Counts in the tables are exact arithmetic from the stated ratios. Dollar figures are 2025–2026 street/list ranges (sources stamped inline); they drift quarterly and are calibration aids, not quotes. Density, flow, and current figures are generation-stamped because they step discontinuously across GB200 → GB300 → Vera Rubin VR200 → Rubin Ultra Kyber; do not interpolate across a generation boundary.

1. Per-archetype design-basis sheets

The design-basis sheet is the single page that everything downstream inherits — it is the concrete instantiation of the workload-profile and design-basis artifacts named in Chapter 1.1. Read it as: choose the row that matches your dominant archetype, and the rest of the appendix is parameterized for you. The three reference builds in §3–§4 use the frontier-training column unless noted, because it is the most constraining; an inference-shaped build relaxes density, fabric, and redundancy and is cheaper on every axis.

Design-basis sheet by workload archetype (2026 reference points)

Parameter	Frontier training	Post-training / RL	Online inference	Batch inference	Edge inference
Dominant accelerator	GB200/GB300 NVL72	Disaggregated: NVL72 trainer + HGX rollout	HGX B200/B300; GB200 for MoE	HGX B200; prior-gen acceptable	L4/L40S, Jetson, single B200
Rack density (design)	120–142 kW	Mixed: 132 kW trainer / 40–60 kW rollout	40–60 kW	30–60 kW	3–30 kW per micro-site
Cooling modality	DLC mandatory, warm-water	DLC trainer + RDHx/air rollout	Air, RDHx, or DLC by density	Air often sufficient	Air / sealed modular
Scale-up domain	72 GPUs (→144, →576)	72 trainer / 8 rollout	8–72 (MoE wants 72)	8	1 (single node)
Scale-out fabric	1:1 non-blocking, 8-rail	Tight trainer / oversub rollout	2:1–3:1 oversubscribed	Heavily oversubscribed	Minimal; WAN backhaul
Fabric transport	InfiniBand XDR or Spectrum-X	IB trainer / RoCE rollout	Ethernet/RoCE common	Ethernet, cost-optimized	Standard IP
GPU:CPU ratio	2:1 (NVL72: 72G:36C)	2:1 trainer / 4–8:1 rollout	4:1–8:1	8:1+	1:1 appliance
GPU:storage (BW)	~250–400 GB/s per 1,024 GPU	Trainer like training	KV-cache + model load tier	Streaming object tier	Local NVMe only
Redundancy basis	N / N+1 (checkpointable)	N+1, staleness-tolerant	2N / Tier-IV-class + N+1 cooling	N (queue-and-retry)	N; fleet geo-redundancy
EDPp sizing factor	~1.4–1.5× TDP	~1.4× trainer	~1.3× TDP	~1.2× TDP	~1.2× TDP
Siting driver	Cheap firm MW + cold climate	Follows dominant sub-workload	Sub-50 ms to users	Cheapest / curtailable MW	Latency budget (30/50/100 ms)

Density and fabric figures are GB200/GB300 NVL72-class, 2026-current. GPU:CPU and GPU:storage are design ratios, not hard limits. See the key-number entries after §4 for sources and vintages.

2. The scalable unit (SU): power / cooling / water / network budget

The scalable unit is the atomic deployment and costing block — order it, integrate it at the factory (L11/L12), ship it, energize it, repeat. Sizing the SU once and then multiplying is what makes campus and cluster arithmetic tractable. We anchor the SU to the NVIDIA DGX SuperPOD GB200 reference: 8 × NVL72 racks = 576 GPUs per SU, with the full SuperPOD at 16 SUs (128 racks, 9,216 GPUs). The budget below gives one SU's draw across four generations; later sections count SUs, not racks.

The cross-generation columns exist because the multipliers step. A GB200 SU is ~1.06 MW of IT; the same 8-rack SU at Kyber density (~600 kW/rack) is ~4.8 MW — a 4.5× jump in the same floor footprint. That is the density-ramp trap from Chapter 1.1 rendered as a budget line: the floor, water, and busbar you reserve today must survive it.

Scalable-unit budget — 8-rack SU (576 GPUs), by accelerator generation

Metric	GB200 NVL72 (2025)	GB300 NVL72 (2025–26)	VR200 NVL144 (H2 2026)	Kyber NVL576 (H2 2027)
GPUs per SU	576	576	1,152 (144/rack)	4,608 (576/rack)
Rack density (TDP)	132 kW	142 kW	~200 kW	~600 kW
IT power per SU (TDP)	~1.06 MW	~1.14 MW	~1.60 MW	~4.80 MW
IT power per SU (EDPp ~1.4×)	~1.48 MW	~1.59 MW	~2.24 MW	~6.7 MW (smoothed ~30%)
DLC heat to liquid (~87%)	~0.92 MW	~0.99 MW	~1.60 MW (100% liquid)	~4.8 MW (100% liquid)
Residual air heat	~0.14 MW (~17 kW/rack)	~0.15 MW	~0 (100% liquid)	~0 (100% liquid)
Secondary-loop flow (~1.5 L/min·kW)	~1,580 L/min	~1,700 L/min	~2,400 L/min	~7,200 L/min
Coolant inlet / ΔT target	<25 °C / <10 °C	<25 °C / ~10 °C	~45 °C warm-water	~45 °C warm-water
Back-end NICs (8× 400G/node)	144 NICs (3.2 Tb/s/node)	144 NICs	288 (CX-9 800G)	1,152 (CX-9/CPO)
Leaf switch ports consumed (back-end)	576 (1 port/GPU rail)	576	1,152	4,608
Back-end optics (transceivers, 1:1)	~1,152 (NIC+leaf ends)	~1,152	~2,304	~9,216 (CPO shifts mix)
Storage BW attributable (~250 GB/s/1,024 GPU)	~140 GB/s	~140 GB/s	~280 GB/s	~1,125 GB/s

SU = 8 NVL72-class racks = 576 GPUs (DGX SuperPOD GB200 SU definition). Secondary-loop flow is ~1.5 L/min per kW applied to full rack TDP, which is conservative; applied to DLC heat alone, the GB200 figure is ~1,380 L/min. Facility-water make-up assumes evaporative rejection (zero for closed-loop). Optics/switch counts are the 1:1 non-blocking back-end share attributable to one SU.

Worked example: deriving one GB200 SU line-by-line

Take the GB200 column and walk it forward so the multipliers are explicit.

GPUs: 8 racks × 72 GPUs = 576.
IT power (TDP): 8 × 132 kW = 1.056 MW ≈ 1.06 MW.
EDPp: 1.056 MW × 1.4 ≈ 1.48 MW provisioned on the rack power chain.
Heat split: at ~115 kW liquid + ~17 kW air per rack, liquid carries 8 × 115 = 920 kW and air carries 8 × 17 = 136 kW.
Flow: 920 kW × ~1.5 L/min·kW ≈ 1,380 L/min on liquid heat alone; sizing on the full 1,056 kW gives ≈ 1,580 L/min, which is the figure the table carries. Either way the per-rack flow is ≈ 170–200 LPM, inside the ~130–200 LPM range used for NVL72 CDU sizing.
NICs: each NVL72 rack presents its 72 GPUs across 18 compute trays, so the SU has 8 × 18 = 144 compute nodes, which the table books as ~144 NICs/SU at 3.2 Tb/s/node (8 × 400G, rail-optimized).
Optics: the back-end optic count tracks GPU rails, not NIC bodies. Each of the 576 GPUs drives one 1:1 rail link, and each link burns two transceivers (server/NIC end + leaf end), so 576 × 2 = 1,152 rail-side optics for the SU's share of the non-blocking fabric (before spine).

These are the only base numbers; everything in §3–§4 is an integer multiple of them. → SU definition in Chapter 1.7; fabric sizing in Chapter 8.5.
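The same line-by-line derivation can be run for every generation column. The sketch below is one way to do it, under stated assumptions: the `Generation` dataclass and `su_budget` function are hypothetical names, the inputs are the rack densities and ratios quoted in the SU table, flow is sized on full rack TDP (the table's convention), and the GB300 liquid share is taken as the table's ~87%.

```python
from dataclasses import dataclass

RACKS_PER_SU = 8  # DGX SuperPOD GB200 SU definition used in this appendix


@dataclass(frozen=True)
class Generation:
    name: str
    gpus_per_rack: int
    rack_kw: float          # rack TDP in kW
    liquid_fraction: float  # share of rack heat captured by DLC


GENERATIONS = [
    Generation("GB200 NVL72", 72, 132.0, 115.0 / 132.0),  # ~115 kW liquid + ~17 kW air
    Generation("GB300 NVL72", 72, 142.0, 0.87),
    Generation("VR200 NVL144", 144, 200.0, 1.0),
    Generation("Kyber NVL576", 576, 600.0, 1.0),
]


def su_budget(gen, edpp=1.4, lpm_per_kw=1.5, storage_gbps_per_1024=250.0):
    """Power, cooling, and network budget of one 8-rack scalable unit."""
    gpus = RACKS_PER_SU * gen.gpus_per_rack
    it_kw = RACKS_PER_SU * gen.rack_kw
    liquid_kw = it_kw * gen.liquid_fraction
    return {
        "gpus": gpus,
        "it_mw": it_kw / 1000.0,
        "edpp_mw": it_kw * edpp / 1000.0,
        "liquid_mw": liquid_kw / 1000.0,
        "air_mw": (it_kw - liquid_kw) / 1000.0,
        "flow_lpm": it_kw * lpm_per_kw,   # sized on full rack TDP
        "leaf_ports": gpus,               # one back-end port per GPU rail
        "optics": 2 * gpus,               # NIC end + leaf end per rail link
        "storage_gbps": gpus / 1024.0 * storage_gbps_per_1024,
    }


for gen in GENERATIONS:
    b = su_budget(gen)
    print(f"{gen.name:13s} {b['gpus']:5d} GPUs  IT {b['it_mw']:.2f} MW  "
          f"EDPp {b['edpp_mw']:.2f} MW  flow {b['flow_lpm']:.0f} L/min  "
          f"optics {b['optics']}")
```

Changing `edpp` or `lpm_per_kw` shows how sensitive the busbar and loop sizing are to those two rules of thumb.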

3. Worked example: a 50 MW campus sized from the SU up

Now multiply. The brief: a 50 MW IT frontier-training campus on GB200/GB300 NVL72, built as integer SUs, with the power chain, cooling plant, and water loop derived. We size on TDP for the IT budget and EDPp for the electrical chain, and we reserve floor/water/busbar headroom for a GB300 → VR200 density step (the irreversible substrate from Chapter 1.1).

SU count. 50 MW IT ÷ ~1.06 MW/SU ≈ 47.3 SUs. Round up to 48 SUs (three 16-SU SuperPOD-scale halls, or six 8-SU halls). That is 48 × 8 = 384 NVL72 racks and 48 × 576 = 27,648 GPUs. Total IT at 132 kW/rack = 50.7 MW, about 1.4% over the nominal 50 MW brief and within design margin.

50 MW campus — derived from 48 SUs (384 racks, 27,648 GB200 GPUs)

Subsystem	Sizing basis	Quantity / value
Scalable units	50 MW ÷ 1.06 MW/SU	48 SUs
NVL72 racks	48 × 8	384 racks
GPUs	384 × 72	27,648 GPUs
IT power (TDP)	384 × 132 kW	50.7 MW
Facility power (PUE ≈ 1.2)	50.7 MW × 1.2	~60.8 MW
Utility interconnect (N, +margin)	~61 MW × 1.15	~70 MW POI / 2× 132 kV feeders
Main transformers	≥2 × 75 MVA (N+1 at MV)	2–3 × 75 MVA
MV distribution	33/13.8 kV ring or radial	per-hall 13.8 kV → 415 V / 800 VDC
UPS / ride-through (EDPp)	50.7 MW × 1.4 EDPp	~71 MW transient basis; BESS + rack BBU
Backup generation	N (gas RICE / turbine for island)	~65–75 MW prime/standby
DLC heat to facility water	384 × 115 kW	~44 MW thermal
CDUs (L2L, ~1.4 MW each, N+1)	44 MW ÷ ~1.3 MW useful + N+1	~36–40 CDUs (≈ 1 per 8–10 racks + spares)
Secondary-loop flow	384 × ~190 LPM	~73,000 L/min aggregate
Heat rejection	~61 MW total heat	towers/dry-coolers + adiabatic, economized
Water make-up (WUE ~0.5 L/kWh)	60.8 MW × 0.5 × 8,760 h	~266 ML/yr (≈ 730 m³/day average)
Back-end fabric (1:1)	27,648 GPUs, 8-rail fat-tree	~3,456 leaf+spine switch ASICs (see §4)
Floor area (white space)	384 racks @ ~30 m²/rack incl. aisles/CDU	~11,500 m² + plant
Floor loading reserve	wet rack ~1.36 t + VR200 headroom	design slab to ≥1,500 kg/m²

TDP basis 132 kW/rack. Facility power applies PUE ≈ 1.2 (warm-water DLC + economized rejection). Generator/UPS sized to EDPp. Water assumes hybrid rejection with adiabatic assist; closed-loop dry would trade WUE→0 for ~+0.05 PUE.
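The campus table can be reproduced from the brief with a short function. This is a sketch under the table's own assumptions (132 kW/rack, ~115 kW liquid per rack, PUE 1.2, EDPp 1.4, ~1.3 MW useful per CDU, ~190 LPM per rack, WUE 0.5 L/kWh applied to facility energy as the table does). The function name, the rounding of SUs to a multiple of 8, and the one-spare-CDU-per-hall rule are illustrative assumptions, not part of the reference design.

```python
import math


def size_campus(
    target_it_mw: float,
    rack_kw: float = 132.0,
    liquid_kw_per_rack: float = 115.0,
    racks_per_su: int = 8,
    gpus_per_rack: int = 72,
    su_multiple: int = 8,        # assumed hall granularity: halls of 8 SUs
    halls: int = 6,              # assumed: one spare CDU per hall
    pue: float = 1.2,
    edpp: float = 1.4,
    poi_margin: float = 1.15,
    cdu_useful_mw: float = 1.3,
    lpm_per_rack: float = 190.0,
    wue_l_per_kwh: float = 0.5,
) -> dict:
    """Derive campus quantities from an IT power target, in whole scalable units."""
    su_it_mw = racks_per_su * rack_kw / 1000.0
    raw_sus = target_it_mw / su_it_mw
    sus = math.ceil(raw_sus / su_multiple) * su_multiple  # round up to whole halls
    racks = sus * racks_per_su
    it_mw = racks * rack_kw / 1000.0
    facility_mw = it_mw * pue
    liquid_mw = racks * liquid_kw_per_rack / 1000.0
    duty_cdus = math.ceil(liquid_mw / cdu_useful_mw)
    water_l_per_year = facility_mw * 1000.0 * wue_l_per_kwh * 8760
    return {
        "sus": sus,
        "racks": racks,
        "gpus": racks * gpus_per_rack,
        "it_mw": it_mw,
        "facility_mw": facility_mw,
        "poi_mw": facility_mw * poi_margin,
        "edpp_mw": it_mw * edpp,
        "liquid_mw": liquid_mw,
        "duty_cdus": duty_cdus,
        "cdus_with_spares": duty_cdus + halls,
        "flow_lpm": racks * lpm_per_rack,
        "water_ml_per_year": water_l_per_year / 1e6,
        "water_m3_per_day": water_l_per_year / 1000.0 / 365.0,
    }


campus = size_campus(50.0)
for key, value in campus.items():
    print(f"{key:20s} {value:,.1f}")
```

With six halls the spare rule gives 40 CDUs and with three halls 37, both inside the table's ~36–40 range.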

4. Reference BOM: a 100k-GPU GB200/GB300 cluster

The flagship worked example: a 100,000-GPU GB200/GB300-class training cluster, costed as a bill of materials. Built from the SU: 100,000 ÷ 576 ≈ 174 SUs; round to 176 SUs = 1,408 NVL72 racks = 101,376 GPUs (≈ 100k). At 132 kW/rack that is ~186 MW IT, ~223 MW facility at PUE 1.2 — a multi-campus build in practice (scale-across over DCI, Chapter 8.8), but costed here as one logical cluster.

The cost column carries the heaviest caveat: GPU/system pricing is 2025 street/list (SemiAnalysis), networking and storage are practitioner ranges, and all of it moves quarterly. Use the counts as gospel (they are arithmetic) and the dollars as an order-of-magnitude frame. The GPU/system line dominates so heavily (~75–80% of cluster capex) that errors elsewhere barely move the total.

100k-GPU cluster reference BOM (176 SUs · 1,408 NVL72 racks · 101,376 GPUs)

BOM line	Count	Basis	Unit cost (2025–26)	Line cost (rough)
GB200/GB300 GPUs	101,376	176 SU × 576	~$60–70k effective	n/a
NVL72 racks (integrated, L11)	1,408	176 SU × 8	~$3.0–3.5M / rack	~$4.2–4.9B
System-level total (rack line includes GPUs, Grace, NVSwitch, DLC)	n/a	101,376 GPUs × ~$200k/GPU system-level (GB200)	n/a	~$20B system total
CDUs (L2L, N+1)	~150	~1 per 8–10 racks + spares	~$120–180k	~$18–27M
Back-end leaf switches (Quantum-X800/Spectrum-X)	~2,000	8-rail, ~72 GPU/leaf group	~$120k (64×800G)	~$240M
Back-end spine switches	~1,000	2-tier fat-tree, 1:1	~$120k	~$120M
Front-end / storage / mgmt switches	~600	in-band + OOB + storage net	~$30–60k	~$25M
Back-end optics / transceivers (800G)	~405,000	101,376 GPU × ~4 (NIC+leaf+spine ends)	~$1,000–1,500	~$405–610M
DAC/AEC copper (intra-rack scale-up)	in-rack	5,184 NVLink cables/rack (copper, in rack price)	incl. in rack	n/a
High-perf storage (parallel FS)	~25 PB usable	~250 GB/s per 1,024 GPU → ~25 TB/s	~$0.20–0.40/GB flash tier	~$1.0–2.0B
Capacity / object tier	~150–250 PB	data lake + checkpoints	~$0.02–0.05/GB	~$5–12M
Facility power chain (per MW)	~223 MW facility	transformers, UPS/BESS, switchgear, gen	~$10–15M / MW (AI-grade)	~$2.2–3.3B
Cooling plant + water loop (per MW)	~186 MW IT	CDUs counted above + rejection + piping	~$3–5M / MW	~$0.6–0.9B
Cluster total (compute + network + storage + facility)	n/a	GPU/system dominates ~75–80%	n/a	~$25–28B

Counts are exact from §2 ratios. Unit costs are 2025–2026 street/list ranges (SemiAnalysis AI Neocloud Playbook; vendor lists); they drift quarterly and exclude land, building shell, and soft costs. Networking assumes 1:1 non-blocking 8-rail fat-tree; storage at ~250 GB/s per 1,024 GPUs.
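Because every BOM line is a count times a unit price, the line costs can be audited mechanically. The sketch below multiplies the counts by the unit-price ranges quoted in the BOM for the lines where both are stated explicitly. The dictionary layout and names are hypothetical, the prices are the appendix's 2025–2026 ranges rather than quotes, and storage is left out because its line cost is a practitioner range rather than a pure count-times-price product.

```python
SUS = 176
RACKS = SUS * 8
GPUS = RACKS * 72
IT_MW = RACKS * 132 / 1000.0      # 132 kW/rack TDP
FACILITY_MW = IT_MW * 1.2         # PUE 1.2

# name -> (count, unit cost low, unit cost high), all in USD
LINES = {
    "nvl72_racks": (RACKS, 3.0e6, 3.5e6),
    "cdus": (150, 120e3, 180e3),
    "leaf_switches": (2000, 120e3, 120e3),
    "spine_switches": (1000, 120e3, 120e3),
    "backend_optics": (GPUS * 4, 1000.0, 1500.0),  # NIC + leaf + spine ends
    "facility_power": (FACILITY_MW, 10e6, 15e6),   # per facility MW
    "cooling_plant": (IT_MW, 3e6, 5e6),            # per IT MW
}


def line_cost(count, unit_low, unit_high):
    """Low/high line cost as count times unit price."""
    return count * unit_low, count * unit_high


costs = {name: line_cost(*spec) for name, spec in LINES.items()}

# System-level all-in figure: ~$200k per GPU (GB200-class).
SYSTEM_TOTAL = GPUS * 200e3

for name, (low, high) in costs.items():
    print(f"{name:16s} ${low / 1e6:10,.1f}M  to  ${high / 1e6:10,.1f}M")
print(f"{'system_total':16s} ${SYSTEM_TOTAL / 1e9:.2f}B")
```

Re-running this with re-quoted unit prices is the quickest way to refresh the dollar column without touching the counts.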

Reading the BOM total

The ~$25–28B headline is dominated by the GPU/system line (~$20B: 101,376 GPUs at ~$200k/GPU system-level for GB200-class, per SemiAnalysis). Networking (~$0.9B), storage (~$1–2B), and the facility power/cooling shell (~$3–4B) together are roughly 20–25% of the total. The 1:1 vs 2:1 fabric choice swings ~31% of back-end cost (Chapter 8.5), about $0.3B or ~1% of the total, so mis-scoping the fabric matters far less to capex than getting the GPU generation, utilization, and depreciation window right (Chapter 1.8). Land and building shell are excluded; add ~$8–12M/MW for a greenfield powered shell (Chapter 6.1). The single largest source of total-cost error is not any line here: it is the GPU effective life (2–3 yr economic vs 5–6 yr book), bounded in Chapter 1.8 and registered in Appendix D.

132 / 142 kW

GB200 / GB300 NVL72 rack TDP; GB300 up to ~142 kW (483k BTU/hr) per rack

2026 · Schneider Electric / Converge Digest; Supermicro & Lenovo GB300 datasheets

576 GPUs / SU

DGX SuperPOD GB200 scalable unit = 8 NVL72 racks; full SuperPOD = 16 SU / 128 racks / 9,216 GPUs

2025 · NVIDIA DGX SuperPOD GB200 Reference Architecture

~600 kW

Rubin Ultra Kyber NVL576 rack on 800 VDC; ~4.8 MW per 8-rack SU

H2 2027 (announced) · NVIDIA GTC; The Next Platform; DCD

~1.5 L/min·kW

DLC secondary-loop flow rule of thumb (PG25); ~130–200 LPM per NVL72 rack

2025 · Dober PG25; Introl NVL72 deployment

~1.4 MW / CDU

L2L CDU useful capacity; ~1 per 8–10 NVL72 racks at N+1; Google Deschutes row CDU ~2 MW

2025 · Vertiv/Eaton/nVent CDU specs; Google Cloud

8× 400 Gb/s

back-end NICs per training node = 3.2 Tb/s/node; 1:1 non-blocking 8-rail fat-tree

2025 · SemiAnalysis AI Neocloud Playbook

~$200k/GPU

GB200 system-level all-in (~$200M per 1,000 GPUs); ~$60–70k effective per GPU

2025 · SemiAnalysis; domain-research BOM

~250–400 GB/s

aggregate storage bandwidth per 1,024 training GPUs (~2 PB initial); place on front-end, not back-end

2025 · SemiAnalysis storage sizing

Sensitivity: how the three builds move when you change one input

Generation step (GB200 → VR200). Same 8-rack SU footprint, but IT power per SU goes ~1.06 → ~1.60 MW and flow ~1,580 → ~2,400 L/min. The 50 MW campus at VR200 density needs only ~31 SUs for the same 50 MW, but each rack now dissipates ~1.5× the heat (132 → ~200 kW), so the cooling plant, not the floor, becomes binding.

Oversubscription (1:1 → 2:1) for an inference cluster. Cuts back-end switches and optics by ~31%. On the 100k BOM that is ~$0.3B saved on a ~$25–28B total (≈ 1%, with the GPU/system line at 101,376 GPUs × ~$200k), which is why you never compromise training fabric to save fabric money, but always oversubscribe an inference fabric.

Closed-loop dry cooling. Drives the 50 MW campus's ~266 ML/yr water make-up toward zero, at the cost of ~+0.05 PUE (~+2.5 MW facility power) and a larger heat-rejection footprint. This is the WUE↔PUE trade from Chapter 15.4.

Effective GPU life (3 yr → 5 yr). Does not move any count or capex line, but stretches the denominator in $/GPU-hr by ~1.67×, a ~40% lower capex charge per GPU-hour. It is the dominant TCO lever, quantified in Chapter 1.8 and Appendix C.
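Each sensitivity above is a one-line calculation, sketched below so the inputs can be varied. The function names are hypothetical. The capex-per-GPU-hour figure assumes straight-line amortization, no cost of capital, and 100% utilization unless a utilization is passed in, which is a deliberate simplification of the treatment in Chapter 1.8. The back-end cost range is the sum of the leaf, spine, and optics lines at the BOM's unit prices.

```python
HOURS_PER_YEAR = 8760


def sus_for_target(target_it_mw: float, rack_kw: float, racks_per_su: int = 8) -> float:
    """Number of scalable units (fractional) needed for an IT power target."""
    return target_it_mw * 1000.0 / (racks_per_su * rack_kw)


def oversub_saving(backend_cost_usd: float, cut_fraction: float = 0.31) -> float:
    """Back-end capex saved by moving from 1:1 to 2:1 oversubscription."""
    return backend_cost_usd * cut_fraction


def pue_penalty_mw(it_mw: float, delta_pue: float) -> float:
    """Extra facility power when PUE rises by delta_pue."""
    return it_mw * delta_pue


def capex_per_gpu_hour(capex_per_gpu: float, life_years: float,
                       utilization: float = 1.0) -> float:
    """Straight-line capex charge per GPU-hour (no cost of capital)."""
    return capex_per_gpu / (life_years * HOURS_PER_YEAR * utilization)


# Generation step: SUs for 50 MW at VR200 density, and heat per rack vs GB200.
print(f"VR200 SUs for 50 MW: {sus_for_target(50.0, 200.0):.2f}")
print(f"heat per rack ratio: {200.0 / 132.0:.2f}x")

# Oversubscription: leaf + spine + optics at the BOM's unit prices.
backend_low = 240e6 + 120e6 + 405.504e6
backend_high = 240e6 + 120e6 + 608.256e6
print(f"2:1 saving: ${oversub_saving(backend_low) / 1e6:.0f}M "
      f"to ${oversub_saving(backend_high) / 1e6:.0f}M")

# Closed-loop dry cooling: +0.05 PUE on 50.688 MW of IT.
print(f"PUE penalty: {pue_penalty_mw(50.688, 0.05):.2f} MW")

# GPU effective life: 3 years vs 5 years at ~$200k per GPU system-level.
three_year = capex_per_gpu_hour(200_000, 3)
five_year = capex_per_gpu_hour(200_000, 5)
print(f"capex per GPU-hour: ${three_year:.2f} (3 yr) vs ${five_year:.2f} (5 yr)")
```

Passing a realistic utilization (for example 0.7) raises both per-hour figures by the same factor, so the 3-year versus 5-year ratio is unchanged.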

These reference designs operationalize the archetype framework in Chapter 1.1 and the requirements matrix in Chapter 1.7; the economics that score them live in Chapter 1.8 with the calculators in Appendix C. The SU and BOM inherit: rack/integration detail from Chapter 7.13 and Chapter 7.14; the 800 VDC power chain from Chapter 4.7 and transient sizing from Chapter 4.5; CDU and warm-water loop sizing from Chapter 5.6 and Chapter 5.7; fabric topology and oversubscription from Chapter 8.5 and optics from Chapter 8.10; storage sizing from Chapter 9.8; multi-campus scale-across from Chapter 8.8. Every dated figure here is registered with vintage and scenario in Appendix D.