The EU AI Act (Regulation (EU) 2024/1689) takes full effect for most high-risk AI systems on 2 August 2026. Providers must demonstrate compliance with the requirements of Chapter III, Section 2, covering risk management, transparency, human oversight, and conformity assessment.
Most organizations are still figuring out what this means in practice. The legislation describes what is required — but not how to implement it. This article maps specific EU AI Act articles to concrete technical implementations using epistemic verification.
What is epistemic verification? Multiple independent AI models evaluate the same decision. Their agreements, disagreements, and confidence levels are captured in a signed, immutable record called an epistemic block. This creates a verifiable audit trail of how an AI system reached a conclusion — and how hard it looked for reasons to doubt it.
| Article | Requirement | Implementation |
|---|---|---|
| Art. 9 | Risk Management System | Adversarial critique + confidence scoring |
| Art. 13 | Transparency | Multi-model dissent capture + MDI score |
| Art. 14 | Human Oversight | Calibrated confidence + structured disagreement |
| Art. 43 | Conformity Assessment | Signed epistemic block chain (audit trail) |
| Art. 11-12 | Technical Documentation & Logging | Immutable block storage with full provenance |
"A risk management system shall be established, implemented, documented and maintained in relation to high-risk AI systems." — Art. 9(1)
The system must identify and analyze known and reasonably foreseeable risks, estimate and evaluate risks that may emerge, and adopt suitable risk management measures. — Art. 9(2)(a-d)
A single model evaluating its own output cannot satisfy this. It will consistently underestimate risks it was trained to ignore. Art. 9 demands a systematic process — identification, analysis, evaluation — not a confidence score from the same model that produced the output.
Epistemic verification addresses this structurally:
import { ThoughtProof } from 'pot-sdk';
const tp = new ThoughtProof({
generators: ['gpt-4o', 'claude-sonnet', 'gemini-pro'],
critic: 'claude-opus',
mode: 'adversarial' // Critic actively looks for flaws
});
// AI system makes a medical recommendation
const aiOutput = "Based on symptoms, prescribe medication X at 200mg";
const block = await tp.verify({
claim: aiOutput,
context: "Patient: 67yo, kidney function GFR 45, current medications: Y, Z",
domain: "medical"
});
// block.perspectives[]: Each model's independent assessment
// block.critique: Adversarial analysis (found: dosage risk for GFR < 50)
// block.confidence: 0.42 (low — significant dissent)
// block.dissent: ["Model 2: 200mg contraindicated for GFR < 50"]
The resulting epistemic block documents each model's independent assessment, the adversarial critique, the calibrated confidence score, and every recorded point of dissent.
Key insight: The adversarial critic mode doesn't try to agree — it tries to find reasons the output is wrong. This is structurally equivalent to the "systematic risk identification" Art. 9 requires. A single model optimizing for helpfulness will never do this to its own output.
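To make the structural difference concrete, here is an illustrative sketch of how an adversarial critic framing differs from a helpfulness framing. The prompt wording is our assumption for illustration, not the actual pot-sdk implementation:

```typescript
// Illustrative only: one possible adversarial critic framing.
// The actual prompts used by pot-sdk may differ.
function adversarialCriticPrompt(claim: string, context: string): string {
  return [
    'You are a critic. Your only goal is to find reasons this claim is wrong.',
    'Do not balance your answer; list every flaw, risk, and missing check.',
    `Context: ${context}`,
    `Claim under review: ${claim}`,
  ].join('\n');
}
```

The point is structural: the critic's objective function is flaw-finding, not agreement, which is what makes the process a risk *identification* step rather than a self-assessment.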
"High-risk AI systems shall be designed and developed in such a manner as to ensure that their operation is sufficiently transparent to enable deployers to interpret the system's output and use it appropriately." — Art. 13(1)
Systems must provide information about the level of accuracy, robustness, and cybersecurity — and known limitations. — Art. 13(3)(b)
"Interpret the system's output" requires more than showing the final answer. It means understanding why the system reached that conclusion, how confident it is, and where it might be wrong.
An epistemic block captures exactly this:
// After verification, the block contains:
{
"claim": "Prescribe medication X at 200mg",
"mdi": 0.78, // Model Diversity Index — how much models disagreed
"confidence": 0.42, // Calibrated confidence after adversarial critique
"perspectives": [
{ "model": "gpt-4o", "position": "support", "confidence": 0.7 },
{ "model": "claude-sonnet", "position": "oppose", "confidence": 0.85 },
{ "model": "gemini-pro", "position": "partial", "confidence": 0.5 }
],
"critique": {
"model": "claude-opus",
"findings": [
"Dosage exceeds safe threshold for patients with GFR < 50",
"Drug interaction risk with medication Z not addressed"
],
"recommendation": "Reduce dosage to 100mg and monitor renal function"
},
"synthesis": "Consensus: medication appropriate, dosage disputed. 2/3 models flag renal risk.",
"hash": "sha256:a3f8e1c9...",
"timestamp": "2026-03-05T19:00:00Z"
}
This maps directly to Art. 13(3)(b): the calibrated confidence and MDI speak to accuracy and robustness, while the critique findings document known limitations.
MDI measures how much independent models diverge on a given output. An MDI of 0.0 means perfect agreement; 1.0 means complete disagreement. For deployers, this is a single number that answers: "How much should I trust this output?"
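The exact MDI formula is defined in the protocol specification (the example above also appears to factor in confidence levels). As a rough intuition, a minimal version can be sketched as the fraction of model pairs that land on different positions:

```typescript
type Position = 'support' | 'oppose' | 'partial';

// Minimal MDI sketch: fraction of model pairs that disagree.
// 0.0 = all models agree, 1.0 = every pair disagrees.
// The real protocol score also reflects confidence, so numbers differ.
function modelDiversityIndex(positions: Position[]): number {
  if (positions.length < 2) return 0;
  let pairs = 0;
  let disagreements = 0;
  for (let i = 0; i < positions.length; i++) {
    for (let j = i + 1; j < positions.length; j++) {
      pairs++;
      if (positions[i] !== positions[j]) disagreements++;
    }
  }
  return disagreements / pairs;
}
```

Even this simplified version captures the key property: the score is computed *across* independent models, so no single model can make itself look certain.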
This is transparency that a compliance officer can point to. Not "the model said 95% confident" (which is the model grading itself), but "three independent models were asked, one strongly disagreed, and here's why."
"Human oversight measures shall aim at preventing or minimising the risks [...] in particular when such risks persist notwithstanding the application of other requirements." — Art. 14(2)
Humans must be able to "fully understand the capacities and limitations of the high-risk AI system" and have the "ability to decide, in any particular situation, not to use the high-risk AI system or to otherwise disregard, override or reverse the output." — Art. 14(4)(a) and (e)
Human oversight is not a checkbox. It requires that the human has enough information to actually override the AI — not just a button that says "reject." The human needs to understand why they might want to reject it.
Epistemic blocks enable this with three mechanisms:
// 1. Calibrative mode — re-scores confidence without adding objections
const block = await tp.verify({
claim: aiOutput,
criticMode: 'calibrative' // "How confident should a human be?"
});
// 2. Confidence thresholds — automatic escalation
if (block.confidence < 0.6) {
await escalateToHuman(block);
// Human sees: the claim, each model's position, the critique,
// and a calibrated confidence score
}
// 3. The block itself is the oversight artifact
// It answers: What did each model think? Where did they disagree?
// What did the critic find? Is this within the system's capabilities?
The critical difference from traditional AI systems is what the overseer actually sees. Art. 14(4)(a) says humans must "fully understand the capacities and limitations" of the system. A single confidence number is not understanding. A structured disagreement map is.
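What would that disagreement map look like on an escalation screen? A hypothetical rendering helper (not part of pot-sdk) over the block fields shown earlier:

```typescript
// Hypothetical helper: flatten an epistemic block into the summary a
// human overseer reviews before deciding to override. Field names
// follow the block examples in this article.
interface Perspective { model: string; position: string; confidence: number; }
interface OversightBlock {
  claim: string;
  confidence: number;
  perspectives: Perspective[];
  critique: { findings: string[] };
}

function oversightSummary(block: OversightBlock): string {
  return [
    `Claim: ${block.claim}`,
    `Calibrated confidence: ${block.confidence}`,
    'Model positions:',
    ...block.perspectives.map(p => `  - ${p.model}: ${p.position} (${p.confidence})`),
    'Critic findings:',
    ...block.critique.findings.map(f => `  - ${f}`),
  ].join('\n');
}
```

A reviewer who sees "claude-sonnet: oppose (0.85)" next to a renal-risk finding has a concrete reason to override, which is exactly what Art. 14(4)(e) presupposes.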
Providers of high-risk AI systems must undergo a conformity assessment before placing their system on the market. For systems under Annex III, this may involve internal control (Annex VI) or third-party assessment by a notified body. — Art. 43(1)
The assessment must demonstrate compliance with all requirements in Chapter III, Section 2 (Articles 8-15). — Art. 43(3)
Conformity assessment requires documented evidence that a system meets every requirement of Chapter III, Section 2. Not a claim that it does — evidence that an auditor can verify.
A chain of epistemic blocks creates this evidence trail:
// Every verification produces a block with:
// - SHA-256 hash (integrity)
// - Timestamp (when)
// - Model identifiers (who verified)
// - Full reasoning chain (what was assessed)
// - Previous block hash (chain integrity)
// List all blocks for audit
const blocks = await tp.list({ domain: 'medical', from: '2026-01-01' });
// Each block is independently verifiable
for (const block of blocks) {
const valid = await tp.verifyIntegrity(block);
// Checks: hash matches content, chain is unbroken,
// timestamps are sequential, model signatures are valid
}
For a notified body conducting a conformity assessment, this means every verification event can be independently re-validated: the hash is recomputed against the content, the chain is checked for breaks, timestamps are confirmed to be sequential, and model signatures are verified.
Note on terminology: "Block chain" here refers to a chain of signed, hashed records — not blockchain/DLT technology. No distributed ledger is involved. Blocks are stored locally, on your infrastructure, under your control.
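The integrity check `tp.verifyIntegrity` performs can be sketched in a few lines, assuming each block's hash covers its canonical content and `previousHash` points at the prior block (field names follow this article's examples; the protocol specification is authoritative):

```typescript
import { createHash } from 'crypto';

// Sketch of a hash-chain integrity check over locally stored blocks.
// Assumes: block.hash = SHA-256 of the block's canonical content, and
// block.previousHash links to the preceding block (null for the first).
interface ChainBlock {
  id: string;
  content: string;            // canonical serialization of the block body
  hash: string;               // "sha256:<hex>"
  previousHash: string | null;
}

function sha256(data: string): string {
  return 'sha256:' + createHash('sha256').update(data).digest('hex');
}

function verifyChain(blocks: ChainBlock[]): boolean {
  let prev: string | null = null;
  for (const block of blocks) {
    if (block.hash !== sha256(block.content)) return false; // content tampered
    if (block.previousHash !== prev) return false;          // chain broken
    prev = block.hash;
  }
  return true;
}
```

Because each hash covers the previous one, editing any historical block invalidates every block after it — which is what makes the trail useful as audit evidence.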
Technical documentation must be drawn up before the system is placed on the market and kept up to date. — Art. 11(1)
High-risk AI systems shall technically allow for the automatic recording of events (logs) over the lifetime of the system. — Art. 12(1)
Epistemic blocks are natively logged and immutable. Every verification event is automatically recorded with full provenance: which models were used, what they said, where they disagreed, and what the final synthesis was.
// Blocks are stored locally by default
// pot-cli: ./blocks/PoT-001.json, PoT-002.json, ...
// pot-sdk: configurable storage (local, S3, database)
// Each block contains Art. 12(2) required fields:
{
"id": "PoT-047",
"timestamp": "2026-03-05T19:00:00Z",
"input": { /* the query/claim that was verified */ },
"models": ["gpt-4o", "claude-sonnet", "gemini-pro"],
"critic": "claude-opus",
"duration_ms": 12400,
"perspectives": [ /* full model outputs */ ],
"critique": { /* adversarial analysis */ },
"synthesis": "...",
"confidence": 0.42,
"mdi": 0.78,
"hash": "sha256:a3f8e1c9...",
"previousHash": "sha256:7b2d4e6f..."
}
No additional logging infrastructure required. The verification process is the log.
For high-risk AI systems processing personal data (healthcare, HR, law enforcement), data residency is non-negotiable. The EU AI Act doesn't exist in a vacuum — GDPR still applies, and many organizations operate under additional sector-specific regulations.
ThoughtProof is designed for this:
// .potrc.json — fully local configuration
{
"generators": [
{ "name": "Qwen", "model": "qwen2.5:72b", "baseUrl": "http://ollama:11434/v1/chat/completions" },
{ "name": "Llama", "model": "llama3.3:70b", "baseUrl": "http://ollama:11434/v1/chat/completions" },
{ "name": "Gemma", "model": "gemma3:27b", "baseUrl": "http://ollama:11434/v1/chat/completions" }
],
"critic": { "name": "Critic", "model": "qwen2.5:72b", "baseUrl": "http://ollama:11434/v1/chat/completions" },
"synthesizer": { "name": "Synth", "model": "llama3.3:70b", "baseUrl": "http://ollama:11434/v1/chat/completions" }
}
// Zero external API calls. Zero data exfiltration surface.
// Runs on a single server with Ollama.
This means: no GDPR data processing agreement with ThoughtProof is needed, because ThoughtProof never processes your data. The compliance burden stays where it belongs — with the organization operating the AI system.
Honesty matters more than marketing. Here's what epistemic verification does not do: it does not, by itself, make a system compliant; it does not replace legal counsel or formal conformity assessment; and it is not currently referenced by any harmonised standard or recognised by any notified body.
Our position: ThoughtProof is structurally compatible with EU AI Act verification requirements. Whether it becomes a recognized tool in the conformity assessment ecosystem depends on institutional adoption, harmonised standards, and real-world validation. We're building in the open so that process can happen transparently.
# Install
npm install pot-sdk # SDK for integration
npm install -g pot-cli # CLI for standalone verification
# Or run fully local with Ollama
brew install ollama
ollama pull qwen2.5:72b
ollama pull llama3.3:70b
ollama pull gemma3:27b
pot-cli ask "Your AI system's output here"
The protocol is open-source (MIT). The code is auditable. Every claim in this article can be verified by running the tools yourself.
pot-sdk on GitHub → pot-cli on GitHub → Protocol Specification →
Disclaimer: This article provides a technical analysis of how epistemic verification maps to EU AI Act requirements. It is not legal advice. Organizations should consult qualified legal counsel for their specific compliance obligations. ThoughtProof is not affiliated with any notified body, standardisation organisation, or regulatory authority.
ThoughtProof Protocol — Patent Pending (USPTO #63/984,669, DPMA 10 2026 000 928.6)