Chapter 11.11

Compliance, Certification & Governance

Compliance is not a stamp you collect at the end. It is a design constraint that decides, at scoping time, where your data may sit, which workloads you may host, who may touch the silicon, and how much of your engineering effort is permanently diverted into producing machine-readable evidence. Choose the wrong framework portfolio and you have built a campus you cannot legally sell into.

POWER-BOUND · GOODPUT

What you'll decide here

Which compliance portfolio your campus underwrites against — the commercial baseline (SOC 2 + ISO 27001), the AI-governance layer (ISO 42001 + EU AI Act), the federal stack (FedRAMP 20x + CMMC), and any sector regimes (HIPAA, PCI DSS, FINRA, NERC CIP) — because the union of your target customers' mandates, not any single one, sets the design basis.
Whether you commit to compliance-as-code (OSCAL/KSIs, continuous control monitoring) from day one, or carry a manual-evidence program — a choice that is reversible in principle but very expensive to retrofit once controls, telemetry, and audit logging are already wired the manual way.
Which data-residency and sovereignty boundary your facility enforces — physical (data never leaves the jurisdiction) vs operational (foreign nationals and remote operators cannot administer it) vs control-of-stack (no foreign vendor or government can compel access) — because the three diverge and customers increasingly buy the strictest.
How you treat the supply-chain and firmware evidence chain (OCP S.A.F.E. reports, SBOMs, hardware root-of-trust attestation) as first-class compliance artifacts, not security afterthoughts, since auditors and federal authorizers now demand provenance you cannot reconstruct after commissioning.
Who owns the shared-responsibility boundary in a colo or neocloud arrangement — which controls you inherit, which you must implement, and which evidence the landlord will and will not produce — before a tenant audit discovers the gap.

An AI data center earns nothing until someone is allowed to put a regulated workload on it. The chapters before this one decide whether the machine is secure; this one decides whether it is sellable — whether a bank, a hospital network, a defense prime, an EU regulator, or a sovereign-AI program can lawfully place its data and its models inside your walls. Compliance is the layer that converts an engineering artifact into a commercial one, and it does so by imposing constraints that reach all the way back to siting, staffing, and the wiring of your audit logs. The recurring error is to treat it as a post-commissioning paperwork exercise. It is not. The framework portfolio you target is an upstream decision — as upstream as the workload archetype in Chapter 1.1 — because it dictates where data may reside, who may administer the silicon, and what evidence you must be generating continuously from the day the first GPU is energized.

This chapter is the compliance and governance map, taught as decisions and their downstream costs. We lay out the framework landscape (the commercial baseline, the new AI-governance layer, the federal stack, and the sector regimes) and show how the layers differ in what they certify and what they cost. We treat NERC CIP as a portfolio member here while pointing to its canonical engineering home. We make the case for compliance-as-code (OSCAL, FedRAMP's Key Security Indicators, continuous control monitoring) as the only model that survives at AI-factory scale, and we build the control crosswalk that lets one body of evidence satisfy many frameworks. Finally we confront the three problems that are genuinely hard at the physical layer: audit logging at fleet scale, data residency and sovereignty, and the sector-specific overlays that customers bring with them.

The framework landscape: four layers, not one ladder

The single most common mistake is to imagine compliance as a ladder — SOC 2, then ISO, then FedRAMP, each a higher rung. It is not a ladder; it is four orthogonal layers, and a serious AI campus needs a chosen subset of each, driven by the union of its target customers' mandates. The layers are: the commercial-trust baseline that every B2B buyer expects; the AI-governance layer that is brand new and rising fast; the federal/government stack that gates public-sector revenue; and the sector regimes that ride along with regulated tenants. Each certifies a different thing, and confusing them is how operators over-spend on a stamp no customer asked for while missing the one that would have unlocked the contract.

SOC 2 (AICPA) is an attestation, not a certification: an independent auditor opines on whether your controls operated effectively against the Trust Services Criteria, which span five categories (Security, Availability, Processing Integrity, Confidentiality, and Privacy). A Type II report covering a 6–12 month observation window is the de facto entry ticket to North American enterprise sales. ISO/IEC 27001 is a true certification against a normative information-security-management-system (ISMS) standard, issued by an accredited body and valid for three years with annual surveillance audits. It is the global lingua franca, expected by European and Asian buyers. The two overlap heavily in substance but differ in form: SOC 2 produces a detailed report a customer reads; ISO 27001 produces a certificate a customer trusts. Most operators serving a global market carry both, and the crosswalk between them is the first place compliance-as-code pays off.

The layer that did not exist three years ago is AI governance. ISO/IEC 42001:2023 — the world's first certifiable AI-management-system standard — moved through 2025 and into 2026 from novelty to procurement requirement: by mid-2026 the question "are you ISO 42001 certified or implementing it?" appears in roughly 40% of EU enterprise AI-vendor RFPs and ~25% in North America (industry RFP analyses, 2026). The EU AI Act is the regulatory teeth: GPAI-model obligations became applicable 2 August 2025, the AI Office's full enforcement powers (fines up to 7% of global turnover, model recalls) activate 2 August 2026, and most high-risk-system obligations apply from that same date (European Commission, 2025–2026). For an infrastructure operator the Act is mostly an inherited obligation — your tenants are the deployers — but residency, logging, and traceability requirements flow down to the facility, and ISO 42001 is the practical control framework operators use to demonstrate readiness.

The master fork: who is your worst-case regulator?

Your compliance portfolio is set not by your easiest customer but by your strictest one — and by the strictest jurisdiction any of them operates in. A campus that will host a US federal tenant inherits FedRAMP + CMMC and their data-residency and personnel constraints (US-persons-only administration, continuous monitoring, machine-readable evidence). A campus that will host EU healthcare or financial workloads inherits GDPR, the EU AI Act, DORA, and physical residency. These constraints are conjunctive: you must satisfy all of them simultaneously, and several are mutually awkward (a US-persons-only operations team vs an EU-data-must-be-administered-from-the-EU residency rule forces you to physically partition operations). Decide the worst-case regulator before siting and staffing, because the answer determines which country you build in, who you are allowed to hire to run it, and whether one campus can serve both books of business or you need two. → siting in Chapter 3.1; personnel security in Chapter 11.9.
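The "union of mandates" rule can be made concrete with a small sketch. The customer names, field names, and the grouping rule below are illustrative assumptions, not a prescribed method: the portfolio is computed as a set union, and each distinct combination of data region and administrator restriction is treated as needing its own operations partition.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mandate:
    """One target customer's requirements (hypothetical, simplified fields)."""
    customer: str
    frameworks: frozenset
    data_region: str         # where data must physically reside
    admin_restriction: str   # who may administer the systems

def binding_portfolio(mandates):
    """The design basis is the union of every customer's frameworks."""
    portfolio = set()
    for m in mandates:
        portfolio |= m.frameworks
    return portfolio

def operations_partitions(mandates):
    """Group customers by (region, admin restriction); each distinct pair
    implies a physically partitioned operations team in this simple model."""
    groups = {}
    for m in mandates:
        groups.setdefault((m.data_region, m.admin_restriction), []).append(m.customer)
    return groups

mandates = [
    Mandate("federal-agency", frozenset({"FedRAMP 20x", "SOC 2"}), "US", "US-persons"),
    Mandate("defense-prime", frozenset({"CMMC Level 2", "SOC 2"}), "US", "US-persons"),
    Mandate("eu-hospital", frozenset({"GDPR", "EU AI Act", "ISO 27001"}), "EU", "EU-based"),
]

portfolio = binding_portfolio(mandates)
partitions = operations_partitions(mandates)
print(sorted(portfolio))
print(len(partitions), "operations partitions:", partitions)
```

With these sample inputs the two US customers share one partition and the EU customer forces a second, which is the "one campus or two" question in miniature.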

The federal stack: FedRAMP 20x and CMMC

The US federal layer is where compliance stops being a report and becomes a machine-readable, continuously-validated obligation — and 2025–2026 is precisely the inflection. FedRAMP 20x is the modernization of the federal cloud-authorization program, and it inverts the old model. Instead of writing essays against hundreds of NIST SP 800-53 controls, providers now demonstrate a set of Key Security Indicators (KSIs) — measurable security outcomes validated through automation (56 KSIs at the Low baseline, 61 at Moderate). The pivot from describing controls to measuring them is the whole point: it replaces manual narrative with automated evidence-gathering. Critically, RFC-0024 mandates machine-readable packages, including OSCAL, for all FedRAMP providers by September 2026, and Phase 3 opens 20x to all qualifying providers in Q3 2026 (FedRAMP PMO, 2026). If you intend to serve federal AI workloads, compliance-as-code is no longer a sophistication choice — it is the entry condition.

CMMC (Cybersecurity Maturity Model Certification) governs defense-industrial-base contractors handling Controlled Unclassified Information. The long rulemaking finally closed: the 48 CFR final rule published 10 September 2025, and Phase 1 of the rollout began 10 November 2025; Level 2 third-party certification becomes mandatory for CUI-handling contracts from 10 November 2026 (Phase 2), with Level 3 in 2027 and full implementation by 2028 (DoD, 2025–2026). For an AI campus chasing defense workloads, the consequence is concrete and personnel-shaped: CMMC Level 2 demands US-persons access control, physical separation of CUI enclaves, and an audit trail that survives a C3PAO assessor's scrutiny — constraints that ripple into staffing, network segmentation (Chapter 11.7), and tenant isolation (Chapter 11.6).

The compliance framework portfolio — what each layer certifies and what it costs you

Framework	Layer	Form & cadence	What it actually certifies	Downstream design cost
SOC 2 (AICPA)	Commercial baseline	Auditor attestation; Type II over 6–12 mo window	That your stated controls operated effectively over a period	Continuous evidence + change management; mostly process, light facility burden
ISO/IEC 27001	Commercial baseline	Accredited certification; 3-yr cycle + annual surveillance	A conformant ISMS against a normative standard	Formal ISMS, risk treatment, Statement of Applicability; documentation-heavy
ISO/IEC 42001	AI governance	Accredited certification; 3-yr cycle	A conformant AI-management system (lifecycle, impact, oversight)	AI risk & impact assessments, model/data governance; flows to tenant onboarding
EU AI Act	AI governance (law)	Regulation; GPAI live Aug 2025, enforcement Aug 2026	Legal conformity of AI systems by risk tier (deployer-led)	Residency, logging, traceability flow-down; up to 7% global-turnover fines
FedRAMP 20x	Federal stack	Authorization; KSIs + OSCAL machine-readable (by Sep 2026)	Continuous, automated validation of federal security outcomes	Compliance-as-code mandatory; continuous monitoring; US-data-center siting
CMMC Level 2	Federal stack	C3PAO certification; mandatory for CUI from Nov 2026	NIST SP 800-171 protection of CUI in the defense supply chain	US-persons access, CUI enclave separation, audited 3-yr certification
NERC CIP	Sector (grid)	Mandatory standards; audited by Regional Entity	Cyber/physical protection of bulk-electric-system assets	Applies when transmission-connected/co-gen; see Chapter 4.3
HIPAA / PCI DSS / DORA	Sector (tenant-borne)	Varies; BAA, QSA assessment, regulatory	Protection of PHI / cardholder data / financial-sector resilience	Inherited from tenant; residency, encryption, BCP/DR overlays

Form, scope, and cadence are 2026-current. "Design cost" is the downstream facility/operations burden the framework imposes, not the audit fee. Dates per FedRAMP PMO, DoD 48 CFR, European Commission, and ISO (2025–2026).

Read the rightmost column as the real cost. The audit fee is rounding error against the design burden each framework imposes — and the burdens are not additive in a friendly way. A campus targeting the full portfolio inherits the union of every constraint: US-persons administration from CMMC, machine-readable continuous evidence from FedRAMP, physical residency from the EU AI Act and GDPR, and a formal ISMS from ISO 27001. The strategic move is to identify the binding portfolio your target book of business actually requires, then build one body of evidence that satisfies it through a crosswalk, rather than chasing every stamp or running parallel programs that each re-collect the same logs.

NERC CIP in the compliance portfolio

NERC CIP — the Critical Infrastructure Protection standards — is the one framework on this list whose engineering substance lives elsewhere in the guide, and we are careful here to place it rather than re-derive it. The full treatment of substation ownership, transmission interconnection, and the CIP obligations that attach when an AI campus becomes a transmission-connected load or operates on-site generation is the canonical material of Chapter 4.3. Its place in the compliance portfolio is the point here: as AI campuses cross 300 MW, 500 MW, and gigawatt scale, they stop being ordinary load and become grid actors in NERC's eyes — a shift NERC made explicit after multi-hundred-megawatt synchronized load-loss events (the 2024 Virginia event dropped roughly 1,500 MW on a single 230 kV fault, triggering NERC's rare Level 3 alert). Once you own or co-locate bulk-electric-system assets, CIP's cyber-physical asset-inventory, access-control, and incident-reporting obligations become mandatory and audited — a distinct, sometimes-surprising line item in the governance budget. The grid-interactive engineering that triggers it is in Chapter 4.10.

56 / 61: FedRAMP 20x Key Security Indicators (Low / Moderate baseline), automated and measurable outcomes replacing control-narrative essays (FedRAMP PMO RFC-0006, 2026)
Sep 2026: RFC-0024 deadline; machine-readable (OSCAL) packages mandatory for all FedRAMP providers (FedRAMP PMO RFC-0024, 2026)
10 Nov 2026: CMMC Level 2 third-party certification becomes mandatory for CUI-handling DoD contracts, Phase 2 (DoD 48 CFR final rule, 2025–2026)
2 Aug 2026: EU AI Act full enforcement powers activate; most high-risk obligations apply; fines up to 7% global turnover (European Commission, 2026)
~40% / ~25%: EU / North American enterprise AI-vendor RFPs asking for ISO 42001 certification or implementation (industry RFP analyses, 2026)
3 SRPs: OCP S.A.F.E. accredited Security Review Providers (Atredis, IOActive, NCC Group) for firmware-security conformance audits (Open Compute Project, 2025–2026)
~1,500 MW: single-event load loss that pushed NERC to treat large AI loads as grid actors subject to CIP-adjacent scrutiny (NERC Level 3 Alert / Utility Dive, 2026)
3 yr: ISO 27001 / 42001 certification validity with annual surveillance audits; SOC 2 Type II re-issued every 6–12 mo (ISO; AICPA, 2026)

Compliance-as-code: OSCAL, KSIs and continuous control monitoring

At AI-factory scale the manual-evidence model breaks. A point-in-time audit that samples controls once a year cannot describe a fleet of tens of thousands of accelerators whose firmware, configuration, and tenancy change weekly. The federal program saw this first, which is why FedRAMP 20x is built on continuous, automated validation rather than periodic attestation — and why OSCAL (the NIST Open Security Controls Assessment Language) is the connective tissue. OSCAL expresses controls, implementations, and assessment results in machine-readable JSON/XML/YAML, so a control's evidence is produced by the system that enforces it rather than transcribed by a human into a document. The decision an operator faces is whether to adopt this model from day one or to bolt it on later. It is nominally reversible — you can retrofit — but the retrofit is punishing: every control, every telemetry source, and every audit-log pipeline that was wired for human consumption must be re-instrumented for machine consumption, and the institutional habit of essay-writing is hard to unlearn.
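To illustrate what "evidence produced by the system that enforces it" can look like, here is a minimal sketch of a measured control. It assumes a hypothetical inventory that maps node IDs to reported firmware versions; the record layout and the control identifier are placeholders and are not the OSCAL schema.

```python
import json

def firmware_conformance(nodes, approved_versions):
    """Measure, rather than describe, a firmware control.

    `nodes` maps a node ID to its reported firmware version. Field names
    in the returned record are illustrative assumptions.
    """
    drifted = sorted(n for n, v in nodes.items() if v not in approved_versions)
    total = len(nodes)
    if total == 0:
        status = "no-evidence"   # nothing measured is not the same as passing
    elif drifted:
        status = "fail"
    else:
        status = "pass"
    return {
        "control_id": "gpu-firmware-approved",   # hypothetical identifier
        "nodes_total": total,
        "nodes_conformant": total - len(drifted),
        "drifted_nodes": drifted,
        "status": status,
    }

fleet = {"node-001": "1.4.2", "node-002": "1.4.2", "node-003": "1.3.9"}
record = firmware_conformance(fleet, approved_versions={"1.4.2"})
print(json.dumps(record, indent=2))
```

The point of the sketch is the shape of the output: a record a machine can aggregate and re-evaluate on every change, instead of a paragraph a human has to re-read.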

The mechanism that makes this tractable is the control crosswalk: a single mapping from one body of collected evidence to the many frameworks it satisfies. A control that proves "all administrative access is MFA-gated and logged" satisfies a SOC 2 CC6 criterion, an ISO 27001 Annex A access control, a FedRAMP KSI, and a CMMC SP 800-171 requirement simultaneously. Collect it once, in OSCAL, and the crosswalk fans it out to every report. Run separate programs and you collect it four times, with four chances to drift out of sync. The crosswalk is where compliance-as-code converts from a federal obligation into a cost-saving discipline for the entire portfolio.
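A crosswalk of this kind might be sketched as a plain mapping from each technical control to the framework requirements it is tagged with. The requirement labels for the MFA control follow the paragraph above; the firmware control's tags and all identifiers are hypothetical placeholders, not exact clause numbers.

```python
from collections import defaultdict

# Each technical control is authored once and tagged with every
# framework requirement it satisfies (labels are illustrative).
CROSSWALK = {
    "admin-access-mfa-logged": {
        "SOC 2": ["CC6 criterion"],
        "ISO 27001": ["Annex A access control"],
        "FedRAMP 20x": ["KSI"],
        "CMMC Level 2": ["SP 800-171 requirement"],
    },
    "gpu-firmware-signed-measured": {   # hypothetical tagging
        "ISO 27001": ["Annex A supplier control"],
        "FedRAMP 20x": ["KSI"],
    },
}

def fan_out(evidence_status):
    """Project one body of collected evidence onto every framework report."""
    reports = defaultdict(list)
    for control, frameworks in CROSSWALK.items():
        status = evidence_status.get(control, "no-evidence")
        for framework, requirements in frameworks.items():
            for requirement in requirements:
                reports[framework].append((requirement, control, status))
    return dict(reports)

reports = fan_out({"admin-access-mfa-logged": "pass"})
for framework, rows in sorted(reports.items()):
    print(framework, rows)
```

Because every report row points back to a single control ID, changing that control's evidence status changes all four reports at once, which is the drift protection the separate-programs approach lacks.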

Deep dive: building one control crosswalk that feeds the whole portfolio

The crosswalk is a many-to-many mapping, and getting its granularity right is the engineering. Too coarse ("we have access control") and an auditor cannot trace a finding to evidence; too fine (one mapping per control sentence) and the map becomes unmaintainable. The workable unit is the technical control — a discrete, automatable assertion with a single evidence source: all OOB management interfaces require hardware-attested identity and log to an immutable store; tenant network segments are enforced by validated microsegmentation policy; all GPU firmware is signed, measured at boot, and matches an approved RIM. Each technical control is authored once, in OSCAL, with its evidence query, and then tagged with every framework requirement it satisfies.

The payoff compounds. When FedRAMP 20x asks for a KSI, the answer is already produced by the same telemetry that feeds the SOC 2 Type II observation window and the ISO 27001 surveillance audit. When the firmware-integrity control changes — a new attestation chain, a new RIM source — you edit one node and every downstream report updates. The failure mode to avoid is the siloed pattern where each framework gets its own spreadsheet, its own evidence pull, and its own quarterly fire drill; that pattern does not scale past a single product and collapses entirely when continuous monitoring becomes mandatory. The hardware-integrity evidence that anchors several of these controls is engineered in Chapter 11.3 (supply chain) and Chapter 11.4 (root of trust); the confidential-computing attestation that proves in-use protection is in Chapter 11.5.

Supply-chain and firmware provenance as compliance artifacts

The newest demand on the compliance program is that hardware provenance is now auditable. Federal authorizers and serious enterprise buyers no longer accept "we bought it from a reputable vendor"; they want the evidence chain — and you cannot reconstruct it after commissioning. The OCP S.A.F.E. program is the emerging standard here: an Open Compute Project framework under which device vendors commission an accredited Security Review Provider (Atredis Partners, IOActive, and NCC Group are the enrolled SRPs as of 2025–2026) to produce a standardized short-form firmware-security audit that travels with the device on the OCP Marketplace. The value to an operator is de-duplication: instead of every buyer re-auditing the same BMC firmware, the conformance review is performed once and consumed many times — and from 2026-10-01, S.O.L.I.D. requirement gaps must be itemized in the long-form reports, closing the loophole where a one-time review went stale after the next patch.

The decision for an operator is whether to make these artifacts first-class in the compliance program or to treat them as security trivia. The consequence of getting it wrong is discovered late and expensively: a federal authorizer or a sovereign-AI customer asks for the firmware provenance chain, the SBOM, and the root-of-trust attestation history for the accelerators, and an operator who did not capture them at receiving and burn-in (Chapter 13.8) cannot produce them. The provenance must be captured as the hardware enters the building, bound to a hardware root of trust, and carried in the same OSCAL evidence store as everything else — because by 2026 it is no longer a security nicety, it is a line item in the authorization package.
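Capturing provenance "as the hardware enters the building" can be expressed as a gate at receiving. The sketch below is an assumption about how such a gate might look: the artifact keys and file names are hypothetical, and a real program would also verify signatures rather than only check presence.

```python
# Artifacts the text names: firmware-security audit, SBOM, root-of-trust attestation.
REQUIRED_ARTIFACTS = ("firmware_audit_report", "sbom", "rot_attestation")

def receiving_check(device):
    """Accept a device into inventory only if every provenance artifact
    was captured; otherwise report exactly what is missing."""
    missing = [key for key in REQUIRED_ARTIFACTS if not device.get(key)]
    return {"serial": device["serial"], "accepted": not missing, "missing": missing}

complete = {
    "serial": "SN-0001",
    "firmware_audit_report": "safe-short-form.pdf",
    "sbom": "sbom.json",
    "rot_attestation": "attestation.bin",
}
incomplete = {"serial": "SN-0002", "sbom": "sbom.json"}

ok = receiving_check(complete)
held = receiving_check(incomplete)
print(ok)
print(held)
```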

Audit logging at fleet scale

Every framework on the list demands an audit trail, and at AI-factory scale the audit trail is itself an engineering problem. The naive read of "log everything" is unaffordable and unsearchable: a 100,000-GPU campus generates control-plane, data-plane, OOB-management, physical-access, and environmental telemetry at a volume that overwhelms a conventional SIEM. The decision is what to log immutably, how long to retain it, and where it may physically reside — and the last of these collides directly with data residency. A US-federal tenant requires logs retained and queryable on US soil; an EU tenant requires the same logs never leave the EU; a defense tenant requires the CUI-enclave logs physically and logically separated from everything else. The single-pane SIEM fantasy fractures along the same jurisdictional lines the rest of the campus does.
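The jurisdictional fracture described above can be sketched as a routing rule that sends each audit record to the only store it is permitted to land in. Store names and record fields are hypothetical, and the rule that CUI takes precedence over region is an assumption made for illustration.

```python
# Hypothetical store names, one per jurisdictional boundary.
REGION_STORES = {"US": "us-log-store", "EU": "eu-log-store"}
CUI_STORE = "cui-enclave-log-store"

def route(record):
    """Return the store a record may be written to.

    CUI-enclave records are kept separate from everything else; all other
    records follow the tenant's region. Unknown regions are rejected
    rather than silently defaulted.
    """
    if record.get("cui"):
        return CUI_STORE
    region = record["tenant_region"]
    if region not in REGION_STORES:
        raise ValueError(f"no residency-compliant store for region {region!r}")
    return REGION_STORES[region]

print(route({"tenant_region": "EU", "event": "admin-login"}))
print(route({"tenant_region": "US", "cui": True, "event": "bmc-access"}))
```

Failing closed on an unknown region is the design choice worth noting: a default store would be exactly the kind of quiet cross-border flow an auditor finds later.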

The non-negotiable property is immutability with provable integrity: audit records that a compromised administrator cannot alter or delete, because in an insider-threat or incident scenario the logs are the evidence (Chapter 11.9). Write-once storage, cryptographic chaining, and out-of-band collection to an enclave the production operators cannot reach are the standard pattern. The forward link is that these same audit logs are the raw material for detection and incident response — the security-operations treatment is Chapter 11.12, and the unified incident-command model that consumes them is in Chapter 14.11.
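Cryptographic chaining can be illustrated with a minimal hash chain, in which each record's hash covers the previous record's hash. This is a sketch of the general technique using only the Python standard library, not a description of any particular product; the record layout is an assumption.

```python
import hashlib
import json

GENESIS = "0" * 64   # fixed starting value for the first record's "prev"

def _digest(prev_hash, event):
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_record(chain, event):
    """Append an event whose hash binds it to the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "event": event,
                  "hash": _digest(prev_hash, event)})

def verify(chain):
    """Recompute every link; any edited or removed interior record breaks it."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True

log = []
for event in ("admin-login", "firmware-update", "admin-logout"):
    append_record(log, {"action": event})

print(verify(log))                          # chain intact
log[1]["event"]["action"] = "nothing-happened"
print(verify(log))                          # tampering detected
```

A chain alone does not detect truncation of the newest records, which is one reason the latest hash is typically anchored somewhere the production operators cannot reach.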

Data residency, sovereignty and sector regimes

Residency and sovereignty are where compliance reaches back and rewrites the site plan. The critical insight — and the one most operators get wrong — is that sovereignty has three distinct meanings that diverge. Physical residency means the data is stored and processed within a defined jurisdiction; it is the easiest to satisfy and the weakest guarantee. Operational sovereignty means the people and systems that administer the data are also within the jurisdiction — no foreign-national administrator, no remote operations center in another country, no support engineer who can pull data across a border. Control-of-stack sovereignty is the strongest: no foreign vendor, government, or court can compel access to the data, regardless of where it physically sits — which can fail even when physical residency holds, if the hardware, software, or operator is subject to a foreign jurisdiction's extraterritorial reach. Empirical study of hundreds of nominally-sovereign non-US data centers found that physical residency routinely coexists with deep stack-level dependencies that defeat the sovereignty the customer thought they bought (sovereignty research, 2025).
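The divergence between the three meanings is easier to see when each is evaluated independently. The model below is a deliberate simplification with hypothetical field names: each property holds only if every relevant location or legal home is inside the customer's jurisdiction.

```python
from dataclasses import dataclass

@dataclass
class Deployment:
    jurisdiction: str          # the jurisdiction the customer is buying
    data_locations: set        # where data is stored and processed
    admin_locations: set       # where administrators and ops centers sit
    stack_jurisdictions: set   # legal homes of hardware, software, operator

def sovereignty_levels(d):
    """Evaluate the three meanings separately so their divergence is visible."""
    inside = {d.jurisdiction}
    return {
        "physical": d.data_locations <= inside,
        "operational": d.admin_locations <= inside,
        "control_of_stack": d.stack_jurisdictions <= inside,
    }

deployment = Deployment(
    jurisdiction="EU",
    data_locations={"EU"},
    admin_locations={"EU", "US"},       # remote operations center abroad
    stack_jurisdictions={"EU", "US"},   # vendor subject to foreign legal reach
)
levels = sovereignty_levels(deployment)
print(levels)
```

In this example physical residency holds while the other two fail, which matches the pattern the cited research describes.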

The consequence is a fork at siting and procurement. A customer buying physical residency can be served from a leased hall in-region. A customer buying operational sovereignty forces you to staff the campus with in-jurisdiction personnel and route all administration through in-region operations — which collides head-on with a US-persons-only CMMC requirement if you are trying to serve both books from one site. A customer buying control-of-stack sovereignty may reject your hardware vendor, your hypervisor, or your remote-management path entirely. Export controls compound this: the 2025 AI-diffusion framework tiers countries and restricts chip and model-weight movement, so the same accelerators that are routine in one jurisdiction may be export-controlled into another (export-control analyses, 2025). Sovereign-AI demand — national programs that require the entire stack be domestically controlled — is a fast-growing slice of the market, and it is the one segment where compliance, not engineering, is the binding constraint.

The shared-responsibility gap in colo and neocloud

If your capacity is leased rather than self-built (Chapter 1.6), the most dangerous compliance failure is the shared-responsibility gap: the set of controls that neither party actually owns because each assumed the other did. A colo landlord certifies the physical facility and the power/cooling envelope; you certify the IT, the workloads, and the data. But the boundary is fuzzy exactly where it matters — who owns the audit logging for the OOB management network? Who attests the firmware on the landlord-supplied switches? Whose evidence satisfies the residency requirement for the backup site? When a tenant audit or a federal authorizer probes the seam, the answer "we thought the landlord had it" is fatal. The fix is contractual and upfront: a written shared-responsibility matrix mapping every control in your portfolio to an owner, plus a commitment to the specific evidence the landlord will produce (their SOC 2, their ISO scope, their facility access logs) — negotiated before you sign, not discovered during the audit.
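A shared-responsibility matrix lends itself to a mechanical gap check before signing. The sketch below assumes a hypothetical matrix format (control ID mapped to an owner and a named evidence artifact) and flags two kinds of seam: controls nobody owns, and landlord-owned controls with no committed evidence.

```python
def find_gaps(portfolio_controls, matrix):
    """Return (unowned, unevidenced) control IDs.

    `matrix` maps control ID -> {"owner": "tenant" | "landlord", "evidence": str}.
    A control is unowned if it is absent or has no owner; it is unevidenced
    if the landlord owns it but has committed to no specific evidence.
    """
    unowned, unevidenced = [], []
    for control in portfolio_controls:
        entry = matrix.get(control, {})
        if not entry.get("owner"):
            unowned.append(control)
        elif entry["owner"] == "landlord" and not entry.get("evidence"):
            unevidenced.append(control)
    return unowned, unevidenced

controls = [
    "facility-access-logging",
    "oob-mgmt-audit-logging",
    "switch-firmware-attestation",
    "workload-encryption",
]
matrix = {
    "facility-access-logging": {"owner": "landlord", "evidence": "facility access logs"},
    "switch-firmware-attestation": {"owner": "landlord", "evidence": ""},
    "workload-encryption": {"owner": "tenant", "evidence": "tenant KMS audit"},
}

unowned, unevidenced = find_gaps(controls, matrix)
print("unowned:", unowned)
print("landlord-owned without committed evidence:", unevidenced)
```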

The sector regimes ride along with the tenant and overlay the baseline. HIPAA (US healthcare) requires a Business Associate Agreement and imposes encryption, access-control, and breach-notification obligations on PHI. PCI DSS (payment-card data) requires a QSA assessment and tight network segmentation of the cardholder-data environment. DORA (EU financial sector) adds operational-resilience, third-party-risk, and incident-reporting obligations that flow to the infrastructure provider as a critical ICT third party. FINRA/SEC records rules impose WORM-style retention. None of these is the operator's to certify in isolation — they are inherited — but each adds residency, encryption, retention, or BCP/DR constraints (Chapter 12.3) that the facility must already satisfy when the regulated tenant arrives. The disciplined operator maps the sector overlays its target tenants will bring before commissioning, so the campus is built to the strictest overlay rather than retrofitted to it.

Deep dive: governance — who owns the compliance program, and how it stays current

Frameworks are static; the governance that keeps a campus compliant is a living function, and its ownership is a real decision. The anti-pattern is a compliance team bolted onto the side of engineering, discovering at audit time that the platform drifted out of conformance six months ago. The pattern that works treats governance as a control-ownership graph: every technical control has a named engineering owner who is accountable for keeping its evidence green, and a governance function that maintains the crosswalk, tracks framework changes, and runs the continuous-monitoring dashboard. When a control goes red — a firmware RIM mismatch, a residency-violating data flow, a lapsed surveillance audit — it routes to the owner, not to a quarterly remediation scramble.

Keeping current is the harder half, because the frameworks themselves are moving fast in 2026: the EU AI Act's enforcement powers activate in August, FedRAMP's OSCAL mandate lands in September, OCP S.A.F.E.'s long-form reporting tightens in October, and CMMC Level 2 becomes mandatory in November. A governance function that is not actively tracking these dates will be non-compliant by surprise. The structural answer is to wire framework-change tracking into the same change-management process that governs the platform, so a regulatory deadline is treated like any other dependency with a delivery date. The organizational home for this, along with the workforce and incident-command structure it plugs into, is Chapter 14.11.
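Treating a regulatory deadline as a dependency with a delivery date can be as simple as a dated list checked on every planning cycle. The dates below come from this chapter, with one labeled assumption: the chapter gives only "September 2026" for the FedRAMP mandate, so the last day of the month is used as a placeholder.

```python
from datetime import date

DEADLINES = [
    ("EU AI Act enforcement powers activate", date(2026, 8, 2)),
    # Assumption: the chapter says only "September 2026"; day is a placeholder.
    ("FedRAMP machine-readable (OSCAL) packages", date(2026, 9, 30)),
    ("OCP S.A.F.E. long-form reporting change", date(2026, 10, 1)),
    ("CMMC Level 2 certification mandatory (Phase 2)", date(2026, 11, 10)),
]

def upcoming(today, horizon_days):
    """List deadlines falling within the horizon, soonest first,
    with the number of days remaining."""
    due_soon = []
    for name, due in sorted(DEADLINES, key=lambda item: item[1]):
        remaining = (due - today).days
        if 0 <= remaining <= horizon_days:
            due_soon.append((name, remaining))
    return due_soon

for name, days_left in upcoming(date(2026, 7, 1), horizon_days=100):
    print(f"{days_left:>3} days: {name}")
```

Feeding this list into the same tracker that holds platform release dates is one way to make a missed regulatory date as visible as a missed ship date.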

Anti-patterns

The same governance failures recur, because each comes from treating compliance as downstream paperwork rather than an upstream design constraint:

Certification theater — collecting stamps no customer required. Pursuing a framework because a competitor has it, while the actual binding portfolio for your book of business goes unmet. The fix is to derive the portfolio from the union of target-customer mandates, not from a marketing checklist.
Manual evidence at machine scale. Running a spreadsheet-and-screenshot compliance program against a fleet whose state changes weekly. It passes the first audit, drifts immediately, and collapses the moment continuous monitoring becomes mandatory under FedRAMP 20x.
Residency as an afterthought. Selling "sovereign" capacity that satisfies physical residency while the remote-management path, the backup site, or the vendor support channel quietly defeats operational or control-of-stack sovereignty — discovered by the customer's auditor, not yours.
Unowned shared-responsibility seams. Leasing capacity without a written control-ownership matrix, so the OOB-logging or firmware-attestation control belongs to no one until an authorizer asks who owns it.
Provenance you cannot reconstruct. Skipping firmware-security audits, SBOMs, and root-of-trust attestation capture at receiving — then being unable to produce the evidence chain when a federal or sovereign customer demands it.

This chapter places NERC CIP in the portfolio; its engineering home is Chapter 4.3, with the grid-interactive behavior that triggers it in Chapter 4.10. The hardware-integrity evidence that anchors the control crosswalk is built in Chapter 11.3 (supply chain), Chapter 11.4 (root of trust), and Chapter 11.5 (confidential computing); tenant isolation that residency and CUI separation depend on is Chapter 11.6 and Chapter 11.7; insider-threat controls behind immutable audit logging are Chapter 11.9. The logs this chapter mandates feed detection and incident response in Chapter 11.12 and the incident-command model in Chapter 14.11. Provenance capture happens at burn-in in Chapter 13.8; the DR/continuity obligations that sector regimes impose are Chapter 12.3; the compliance portfolio is set as upstream as the workload archetype in Chapter 1.1 and the procurement fork in Chapter 1.6.