The EU AI Act (Regulation (EU) 2024/1689) takes full effect for most high-risk AI systems on 2 August 2026. Providers must demonstrate compliance with the requirements of Chapter III, Section 2, covering risk management, transparency, human oversight, and conformity assessment.
Most organizations are still figuring out what this means in practice. The legislation describes what is required — but not how to implement it. This article maps specific EU AI Act articles to concrete technical implementations using epistemic verification.
What is epistemic verification? Multiple independent AI models evaluate the same decision. Their agreements, disagreements, and confidence levels are captured in a signed, immutable record called an epistemic block. This creates a verifiable audit trail of how an AI system reached a conclusion — and how hard it looked for reasons to doubt it.
| Article | Requirement | Implementation |
|---|---|---|
| Art. 9 | Risk Management System | Adversarial critique + confidence scoring |
| Art. 13 | Transparency | Multi-model dissent capture + MDI score |
| Art. 14 | Human Oversight | Calibrated confidence + structured disagreement |
| Art. 43 | Conformity Assessment | Signed epistemic block chain (audit trail) |
| Art. 11-12 | Technical Documentation & Logging | Immutable block storage with full provenance |
"A risk management system shall be established, implemented, documented and maintained in relation to high-risk AI systems." — Art. 9(1)
The system must identify and analyze known and reasonably foreseeable risks, estimate and evaluate risks that may emerge, and adopt suitable risk management measures. — Art. 9(2)(a-d)
A single model evaluating its own output cannot satisfy this. It will consistently underestimate risks it was trained to ignore. Art. 9 demands a systematic process — identification, analysis, evaluation — not a confidence score from the same model that produced the output.
Epistemic verification addresses this structurally:
import { ThoughtProof } from 'pot-sdk';
const tp = new ThoughtProof({
generators: ['gpt-4o', 'claude-sonnet', 'gemini-pro'],
critic: 'claude-opus',
mode: 'adversarial' // Critic actively looks for flaws
});
// AI system makes a medical recommendation
const aiOutput = "Based on symptoms, prescribe medication X at 200mg";
const block = await tp.verify({
claim: aiOutput,
context: "Patient: 67yo, kidney function GFR 45, current medications: Y, Z",
domain: "medical"
});
// block.perspectives[]: Each model's independent assessment
// block.critique: Adversarial analysis (found: dosage risk for GFR < 50)
// block.confidence: 0.42 (low — significant dissent)
// block.dissent: ["Model 2: 200mg contraindicated for GFR < 50"]
The resulting epistemic block documents each model's independent assessment, the adversarial critique, the calibrated confidence score, and every recorded point of dissent.
Key insight: The adversarial critic mode doesn't try to agree — it tries to find reasons the output is wrong. This is structurally equivalent to the "systematic risk identification" Art. 9 requires. A single model optimizing for helpfulness will never do this to its own output.
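To make the structural difference concrete, here is an illustrative sketch of how an adversarial critic framing differs from a helpfulness framing. The prompt wording is our assumption for illustration, not the actual pot-sdk implementation:

```typescript
// Illustrative only: one possible adversarial critic framing.
// The actual prompts used by pot-sdk may differ.
function adversarialCriticPrompt(claim: string, context: string): string {
  return [
    'You are a critic. Your only goal is to find reasons this claim is wrong.',
    'Do not balance your answer; list every flaw, risk, and missing check.',
    `Context: ${context}`,
    `Claim under review: ${claim}`,
  ].join('\n');
}
```

The point is structural: the critic's objective function is flaw-finding, not agreement, which is what makes the process a risk *identification* step rather than a self-assessment.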
"High-risk AI systems shall be designed and developed in such a manner as to ensure that their operation is sufficiently transparent to enable deployers to interpret the system's output and use it appropriately." — Art. 13(1)
Systems must provide information about the level of accuracy, robustness, and cybersecurity — and known limitations. — Art. 13(3)(b)
"Interpret the system's output" requires more than showing the final answer. It means understanding why the system reached that conclusion, how confident it is, and where it might be wrong.
An epistemic block captures exactly this:
// After verification, the block contains:
{
"claim": "Prescribe medication X at 200mg",
"mdi": 0.78, // Model Diversity Index — how much models disagreed
"confidence": 0.42, // Calibrated confidence after adversarial critique
"perspectives": [
{ "model": "gpt-4o", "position": "support", "confidence": 0.7 },
{ "model": "claude-sonnet", "position": "oppose", "confidence": 0.85 },
{ "model": "gemini-pro", "position": "partial", "confidence": 0.5 }
],
"critique": {
"model": "claude-opus",
"findings": [
"Dosage exceeds safe threshold for patients with GFR < 50",
"Drug interaction risk with medication Z not addressed"
],
"recommendation": "Reduce dosage to 100mg and monitor renal function"
},
"synthesis": "Consensus: medication appropriate, dosage disputed. 2/3 models flag renal risk.",
"hash": "sha256:a3f8e1c9...",
"timestamp": "2026-03-05T19:00:00Z"
}
This maps directly to Art. 13(3)(b): the calibrated confidence and MDI speak to accuracy and robustness, while the critique findings document known limitations.
MDI measures how much independent models diverge on a given output. An MDI of 0.0 means perfect agreement; 1.0 means complete disagreement. For deployers, this is a single number that answers: "How much should I trust this output?"
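The exact MDI formula is defined in the protocol specification (the example above also appears to factor in confidence levels). As a rough intuition, a minimal version can be sketched as the fraction of model pairs that land on different positions:

```typescript
type Position = 'support' | 'oppose' | 'partial';

// Minimal MDI sketch: fraction of model pairs that disagree.
// 0.0 = all models agree, 1.0 = every pair disagrees.
// The real protocol score also reflects confidence, so numbers differ.
function modelDiversityIndex(positions: Position[]): number {
  if (positions.length < 2) return 0;
  let pairs = 0;
  let disagreements = 0;
  for (let i = 0; i < positions.length; i++) {
    for (let j = i + 1; j < positions.length; j++) {
      pairs++;
      if (positions[i] !== positions[j]) disagreements++;
    }
  }
  return disagreements / pairs;
}
```

Even this simplified version captures the key property: the score is computed *across* independent models, so no single model can make itself look certain.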
This is transparency that a compliance officer can point to. Not "the model said 95% confident" (which is the model grading itself), but "three independent models were asked, one strongly disagreed, and here's why."
"Human oversight measures shall aim at preventing or minimising the risks [...] in particular when such risks persist notwithstanding the application of other requirements." — Art. 14(2)
Humans must be able to "fully understand the capacities and limitations of the high-risk AI system" and have the "ability to decide, in any particular situation, not to use the high-risk AI system or to otherwise disregard, override or reverse the output." — Art. 14(4)(a) and (e)
Human oversight is not a checkbox. It requires that the human has enough information to actually override the AI — not just a button that says "reject." The human needs to understand why they might want to reject it.
Epistemic blocks enable this with three mechanisms:
// 1. Calibrative mode — re-scores confidence without adding objections
const block = await tp.verify({
claim: aiOutput,
criticMode: 'calibrative' // "How confident should a human be?"
});
// 2. Confidence thresholds — automatic escalation
if (block.confidence < 0.6) {
await escalateToHuman(block);
// Human sees: the claim, each model's position, the critique,
// and a calibrated confidence score
}
// 3. The block itself is the oversight artifact
// It answers: What did each model think? Where did they disagree?
// What did the critic find? Is this within the system's capabilities?
The critical difference from traditional AI systems is what the overseer actually sees. Art. 14(4)(a) says humans must "fully understand the capacities and limitations" of the system. A single confidence number is not understanding. A structured disagreement map is.
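What would that disagreement map look like on an escalation screen? A hypothetical rendering helper (not part of pot-sdk) over the block fields shown earlier:

```typescript
// Hypothetical helper: flatten an epistemic block into the summary a
// human overseer reviews before deciding to override. Field names
// follow the block examples in this article.
interface Perspective { model: string; position: string; confidence: number; }
interface OversightBlock {
  claim: string;
  confidence: number;
  perspectives: Perspective[];
  critique: { findings: string[] };
}

function oversightSummary(block: OversightBlock): string {
  return [
    `Claim: ${block.claim}`,
    `Calibrated confidence: ${block.confidence}`,
    'Model positions:',
    ...block.perspectives.map(p => `  - ${p.model}: ${p.position} (${p.confidence})`),
    'Critic findings:',
    ...block.critique.findings.map(f => `  - ${f}`),
  ].join('\n');
}
```

A reviewer who sees "claude-sonnet: oppose (0.85)" next to a renal-risk finding has a concrete reason to override, which is exactly what Art. 14(4)(e) presupposes.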
Providers of high-risk AI systems must undergo a conformity assessment before placing their system on the market. For systems under Annex III, this may involve internal control (Annex VI) or third-party assessment by a notified body. — Art. 43(1)
The assessment must demonstrate compliance with all requirements in Chapter III, Section 2 (Articles 8-15). — Art. 43(3)
Conformity assessment requires documented evidence that a system meets every requirement of Chapter III, Section 2. Not a claim that it does — evidence that an auditor can verify.
A chain of epistemic blocks creates this evidence trail:
// Every verification produces a block with:
// - SHA-256 hash (integrity)
// - Timestamp (when)
// - Model identifiers (who verified)
// - Full reasoning chain (what was assessed)
// - Previous block hash (chain integrity)
// List all blocks for audit
const blocks = await tp.list({ domain: 'medical', from: '2026-01-01' });
// Each block is independently verifiable
for (const block of blocks) {
const valid = await tp.verifyIntegrity(block);
// Checks: hash matches content, chain is unbroken,
// timestamps are sequential, model signatures are valid
}
For a notified body conducting a conformity assessment, this means every verification event can be independently re-validated: the hash is recomputed against the content, the chain is checked for breaks, timestamps are confirmed to be sequential, and model signatures are verified.
Note on terminology: "Block chain" here refers to a chain of signed, hashed records — not blockchain/DLT technology. No distributed ledger is involved. Blocks are stored locally, on your infrastructure, under your control.
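The integrity check `tp.verifyIntegrity` performs can be sketched in a few lines, assuming each block's hash covers its canonical content and `previousHash` points at the prior block (field names follow this article's examples; the protocol specification is authoritative):

```typescript
import { createHash } from 'crypto';

// Sketch of a hash-chain integrity check over locally stored blocks.
// Assumes: block.hash = SHA-256 of the block's canonical content, and
// block.previousHash links to the preceding block (null for the first).
interface ChainBlock {
  id: string;
  content: string;            // canonical serialization of the block body
  hash: string;               // "sha256:<hex>"
  previousHash: string | null;
}

function sha256(data: string): string {
  return 'sha256:' + createHash('sha256').update(data).digest('hex');
}

function verifyChain(blocks: ChainBlock[]): boolean {
  let prev: string | null = null;
  for (const block of blocks) {
    if (block.hash !== sha256(block.content)) return false; // content tampered
    if (block.previousHash !== prev) return false;          // chain broken
    prev = block.hash;
  }
  return true;
}
```

Because each hash covers the previous one, editing any historical block invalidates every block after it — which is what makes the trail useful as audit evidence.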
Technical documentation must be drawn up before the system is placed on the market and kept up to date. — Art. 11(1)
High-risk AI systems shall technically allow for the automatic recording of events (logs) over the lifetime of the system. — Art. 12(1)
Epistemic blocks are natively logged and immutable. Every verification event is automatically recorded with full provenance: which models were used, what they said, where they disagreed, and what the final synthesis was.
// Blocks are stored locally by default
// pot-cli: ./blocks/PoT-001.json, PoT-002.json, ...
// pot-sdk: configurable storage (local, S3, database)
// Each block contains Art. 12(2) required fields:
{
"id": "PoT-047",
"timestamp": "2026-03-05T19:00:00Z",
"input": { /* the query/claim that was verified */ },
"models": ["gpt-4o", "claude-sonnet", "gemini-pro"],
"critic": "claude-opus",
"duration_ms": 12400,
"perspectives": [ /* full model outputs */ ],
"critique": { /* adversarial analysis */ },
"synthesis": "...",
"confidence": 0.42,
"mdi": 0.78,
"hash": "sha256:a3f8e1c9...",
"previousHash": "sha256:7b2d4e6f..."
}
No additional logging infrastructure required. The verification process is the log.
For high-risk AI systems processing personal data (healthcare, HR, law enforcement), data residency is non-negotiable. The EU AI Act doesn't exist in a vacuum — GDPR still applies, and many organizations operate under additional sector-specific regulations.
ThoughtProof is designed for this:
// .potrc.json — fully local configuration
{
"generators": [
{ "name": "Qwen", "model": "qwen2.5:72b", "baseUrl": "http://ollama:11434/v1/chat/completions" },
{ "name": "Llama", "model": "llama3.3:70b", "baseUrl": "http://ollama:11434/v1/chat/completions" },
{ "name": "Gemma", "model": "gemma3:27b", "baseUrl": "http://ollama:11434/v1/chat/completions" }
],
"critic": { "name": "Critic", "model": "qwen2.5:72b", "baseUrl": "http://ollama:11434/v1/chat/completions" },
"synthesizer": { "name": "Synth", "model": "llama3.3:70b", "baseUrl": "http://ollama:11434/v1/chat/completions" }
}
// Zero external API calls. Zero data exfiltration surface.
// Runs on a single server with Ollama.
This means: no GDPR data processing agreement with ThoughtProof is needed, because ThoughtProof never processes your data. The compliance burden stays where it belongs — with the organization operating the AI system.
Honesty matters more than marketing. Here's what epistemic verification does not do: it does not, by itself, make a system compliant; it does not replace legal counsel or formal conformity assessment; and it is not currently referenced by any harmonised standard or recognised by any notified body.
Our position: ThoughtProof is structurally compatible with EU AI Act verification requirements. Whether it becomes a recognized tool in the conformity assessment ecosystem depends on institutional adoption, harmonised standards, and real-world validation. We're building in the open so that process can happen transparently.
# Install
npm install pot-sdk # SDK for integration
npm install -g pot-cli # CLI for standalone verification
# Or run fully local with Ollama
brew install ollama
ollama pull qwen2.5:72b
ollama pull llama3.3:70b
ollama pull gemma3:27b
pot-cli ask "Your AI system's output here"
The protocol is open-source (MIT). The code is auditable. Every claim in this article can be verified by running the tools yourself.
pot-sdk on GitHub → pot-cli on GitHub → Protocol Specification →
Disclaimer: This article provides a technical analysis of how epistemic verification maps to EU AI Act requirements. It is not legal advice. Organizations should consult qualified legal counsel for their specific compliance obligations. ThoughtProof is not affiliated with any notified body, standardisation organisation, or regulatory authority.
ThoughtProof Protocol — Patent Pending (USPTO #63/984,669, DPMA 10 2026 000 928.6)