
Deep Dive: Solving the Agentic Compliance Gap

In April 2026, researchers published an in-depth analysis on the intersection of AI agents and the European Union Artificial Intelligence Act. The core finding was stark: Current agent frameworks lack the "fourth tier" of runtime observability and governance necessary to pass a conformity assessment.

Lár was explicitly re-engineered to solve the unique, agent-specific legal challenges identified in this research. This document provides a direct, technical mapping of how Lár's compliance primitives tackle the exact problems raised by the regulation.


1. Action-Chain Causal Auditability (Art. 12)

The Problem: The EU AI Act's logging requirements (Article 12, operationalized via prEN ISO/IEC 24970) demand sufficient traceability. The paper notes: "logging must capture not only each individual step but the causal relationships between them: why did the agent select this tool rather than that one? ... none produce audit trails that meet Article 12’s requirement." Furthermore, because agents can hallucinate or misreport their actions, the audit must independently verify the system state.

The Lár Solution: Causal Trace Logging

  • Independent State Verification: Instead of asking the LLM what it did, Lár's GraphExecutor computes an exact, mathematical state_diff (additions, modifications, deletions) after every single node executes.
  • Causal Linking: Lár captures explicit __reasoning_trace variables directly from the node (e.g., catching DeepSeek R1's <think> tags or OpenAI o1's reasoning logs), along with the exact prompt and system_instruction used, and binds them cryptographically to the exact state diff in the AuditLogger's JSON output, creating a verifiable causal link between the agent's reasoning and the resulting state change.
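
The state_diff named in the first bullet is an exact three-way comparison of the state before and after a node runs. A minimal sketch of that computation follows; compute_state_diff is illustrative, not Lár's internal API, but it mirrors the added/removed/updated keys visible in the real output below:

def compute_state_diff(before: dict, after: dict) -> dict:
    """Three-way diff of state snapshots (illustrative, not Lár's internal API)."""
    return {
        "added": {k: v for k, v in after.items() if k not in before},
        "removed": {k: v for k, v in before.items() if k not in after},
        "updated": {k: v for k, v in after.items()
                    if k in before and before[k] != v},
    }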

Real output — run.py FINANCE, Run ID 037c96e8, Step 1 (LLMNode):

{
  "step": 1,
  "node": "LLMNode",
  "prompt": "You are a credit risk analyst. Assess the following loan/credit application.\nApplication: Credit application from business client. Requested limit: €500,000. Current D/E ratio: 4.2. Three missed payments in the last 18 months...\n\nReply with ONLY a single JSON object: risk_level (LOW/MEDIUM/HIGH/CRITICAL), recommendation (max 2 sentences), confidence (float 0.0-1.0). No prose.",
  "state_diff": {
    "added": {
      "ai_output": "{\n  \"risk_level\": \"CRITICAL\",\n  \"recommendation\": \"Do not approve the loan application due to high debt-equity ratio and missed payments.\",\n  \"confidence\": 0.95\n}"
    },
    "removed": {},
    "updated": {}
  },
  "run_metadata": {
    "prompt_tokens": 100,
    "output_tokens": 70,
    "total_tokens": 170,
    "model": "ollama/phi4:latest"
  },
  "outcome": "success"
}
HMAC-SHA256 signature of full log: 55931245a2c8117f1c1dc4f6b4499b866f272d99bd9273cd01d313e435a658a5

2. Privilege Minimisation Outside the Model (Art. 15(4))

The Problem: Agents interact with the world via "tools". Under the AI Cyber Resilience standard (prEN 18282), systems must enforce the principle of least privilege. Providing an agent with static, high-level API keys creates a catastrophic attack surface for prompt injection or autonomous drift.

The Lár Solution: The Credential Vault

Lár implements a CredentialVault directly inside the ToolNode.

  • Just-in-Time (JIT) Provisioning: Tools do not hold static global credentials. When a tool is invoked, the vault provisions a time-scoped, scope-restricted token strictly bound to the immediate action.
  • NHI Governance: This creates an explicit Non-Human Identity (NHI) boundary, isolating compromised nodes from escalating privileges.
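
The JIT flow described above can be sketched as follows. This is a minimal illustration only: the JITVault class, its issue method, and the scope string are assumptions for the sketch, not the CredentialVault's actual interface:

import secrets
import time

class JITVault:
    """Illustrative stand-in for the CredentialVault; not Lár's actual API."""
    def issue(self, scope: str, ttl_seconds: int = 60) -> dict:
        return {
            "token": secrets.token_urlsafe(32),       # fresh secret per invocation
            "scope": scope,                           # bound to one action only
            "expires_at": time.time() + ttl_seconds,  # time-scoped
        }

vault = JITVault()
credential = vault.issue(scope="crm:read_contact", ttl_seconds=30)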

3. The "Automation Boundary" (Art. 14)

The Problem: Article 14 requires human oversight for high-risk systems to "override or reverse the system’s output". The paper notes that current architectures lack infrastructure to safely pause, await human review, and selectively resume an autonomous process without breaking the session.

The Lár Solution: Risk-Scored Routing & Human Juries

  • The PolicyRegistry: Every tool and action is mapped to an ontology defining its regulatory domain and risk tier.
  • The RiskScorerNode: This node evaluates the runtime state against the policy. If the action exceeds a predefined risk threshold (e.g., executing a financial transaction or making a medical assessment), it halts the autonomous loop (see the wiring sketch after this list).
  • The HumanJuryNode: The system yields control to a human stakeholder securely over the CLI or via API webhook. The human can approve, reject, or modify the state manually before the graph is allowed to resume.
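
A minimal wiring sketch of this routing is below. The HumanJuryNode arguments mirror the API shown in Section 10; the route function and RISK_THRESHOLD are illustrative stand-ins for the RiskScorerNode's internal policy evaluation, whose exact constructor is not shown in this document:

from lar import HumanJuryNode

RISK_THRESHOLD = 0.7   # assumed policy threshold, not a Lár constant

jury = HumanJuryNode(
    prompt="High-risk action proposed. Approve?",
    choices=["approve", "reject"],
    output_key="jury_decision",
    context_keys=["ai_output"],
    next_node=None,        # wired to the graph's resumption point in practice
)

def route(state: dict):
    """Halt the autonomous loop and yield to the jury above the threshold."""
    if state.get("risk_score", 0.0) >= RISK_THRESHOLD:
        return jury        # the graph pauses here until the human decides
    return "continue"      # low-risk actions proceed autonomously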

4. Runtime Behavioral Drift (Art. 3(23))

The Problem: If an agent dynamically discovers or invents new tools during runtime, it fundamentally alters its capabilities. Article 3(23) classifies this as a "Substantial Modification", which instantly voids the system's CE marking and requires a brand new conformity assessment.

The Lár Solution: The Runtime State Versioner

Lár uses a RuntimeStateVersioner plugged directly into the GraphExecutor's main loop.

  • It takes cryptographic snapshots of the active tool_catalogue, schema boundaries, and policy bindings every N steps.
  • If the agent attempts to load an unauthorized tool (e.g., a newly discovered bash executor), the DriftDetector triggers an immediate alert and can automatically halt the graph, ensuring the system never drifts outside its legally assessed envelope.
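
The snapshot-and-compare mechanism reduces to hashing a canonical serialization of the tool catalogue and comparing fingerprints across steps. A minimal sketch, with function names that are illustrative rather than the RuntimeStateVersioner's actual interface:

import hashlib
import json

def catalogue_fingerprint(tool_catalogue: dict) -> str:
    """Hash a canonical serialization of the catalogue (illustrative helper)."""
    canonical = json.dumps(tool_catalogue, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

baseline = catalogue_fingerprint({"fetch_drug_db": "v1", "write_prescription": "v1"})
current = catalogue_fingerprint({
    "fetch_drug_db": "v1",
    "write_prescription": "v1",
    "bash_executor": "v1",   # unauthorized runtime addition
})

if current != baseline:
    # In Lár this is where the DriftDetector would alert and halt the graph
    raise RuntimeError("Drift detected: tool catalogue left the assessed envelope")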

5. Transparency & Content Marking (Art. 13 & 50)

The Problem: Article 50 mandates that affected third parties must be informed when interacting with an AI, and any synthetic content (text, image, audio) must be marked as artificially generated.

The Lár Solution: Transparency Engine & Synthetic Marking

  • TransparencyEngine: Attached to ToolNodes, this primitive evaluates whether a tool (e.g., a send_email function) affects external natural persons, automatically appending regulatory disclosure notices to the outgoing payload.
  • SyntheticMarkerNode: A dedicated node that safely modifies final generation strings or state payloads by appending visible AI disclaimers or injecting simulated machine-readable metadata (e.g., C2PA manifests).
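
The visible-disclaimer behaviour reduces to an idempotent append on the outgoing payload, roughly as sketched below. The disclosure text and helper name are illustrative; the SyntheticMarkerNode's actual configuration options are not shown in this document:

AI_DISCLOSURE = "\n\n[Notice: this content was generated by an AI system.]"

def mark_synthetic(payload: str) -> str:
    """Append a visible AI disclosure (illustrative, idempotent sketch)."""
    if payload.endswith(AI_DISCLOSURE):
        return payload          # never double-stamp an already-marked payload
    return payload + AI_DISCLOSURE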

6. Data Minimization & Right to Erasure (GDPR Art. 5, 17)

The Problem: Cryptographically immutable audit logs are a regulatory requirement (Art 12), but they inherently conflict with GDPR's Article 17 "Right to Erasure". If an immutable log contains PII, you cannot delete the PII without destroying the mathematical integrity of the entire audit trail.

The Lár Solution: Pre-Hash PII Redaction

Lár's PIIRedactionEngine acts as a middleware layer inside the AuditLogger.

  • Before the state payload is serialized and signed via HMAC-SHA256, the redactor recursively cleans specified sensitive keys (e.g., email, ssn, health_data).
  • The log is then cryptographically signed after redaction, preserving the integrity of the audit trail while ensuring the signed record never contains the PII that Article 17 may later require to be erased.
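
The ordering is the whole trick: redaction happens strictly before signing. A minimal sketch of that redact-then-sign pipeline, with key names and helpers that are illustrative rather than the PIIRedactionEngine's API:

import hashlib
import hmac
import json

SENSITIVE_KEYS = {"email", "ssn", "health_data"}   # illustrative key set

def redact(obj):
    """Recursively replace values under sensitive keys."""
    if isinstance(obj, dict):
        return {k: "[REDACTED]" if k in SENSITIVE_KEYS else redact(v)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [redact(v) for v in obj]
    return obj

def sign_log_entry(entry: dict, secret: bytes) -> str:
    clean = redact(entry)                   # 1. strip PII first
    payload = json.dumps(clean, sort_keys=True)
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()  # 2. then sign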

7. Systemic Risks in Complex Topologies (BatchNode & DynamicNode)

The Problem: The paper identifies two specific failure modes in complex agentic topologies that no current framework handles:

  1. State Poisoning in Parallel Execution — When multiple sub-agents run concurrently, a hallucinating agent can overwrite shared state, corrupting downstream decisions before any audit trail captures the contamination.
  2. Runtime Self-Modification (Art. 3(23)) — When an agent rewrites its own tool catalogue or spawns new sub-graphs at runtime, it constitutes a "Substantial Modification" that legally voids the system's CE marking and requires a full new conformity assessment.

BatchNode — Parallel Execution with State Isolation

The Lár Solution:

BatchNode executes multiple nodes in parallel threads with deep-copied, perfectly isolated GraphState objects. Each thread gets its own snapshot of state; it cannot read or corrupt the other threads' memory.

from lar import BatchNode, LLMNode

# Each analyser runs in its own isolated state clone
clinical_analyser    = LLMNode(model_name="ollama/phi4", prompt_template="Analyse BP: {case_summary}", output_key="bp_analysis")
drug_interaction_bot = LLMNode(model_name="ollama/phi4", prompt_template="Check interactions: {case_summary}", output_key="drug_check")

parallel_review = BatchNode(
    nodes=[clinical_analyser, drug_interaction_bot],
    next_node=merge_node   # merge_node: downstream merge step, defined elsewhere
)

What this provides for compliance:

| Compliance Concern | BatchNode Behaviour |
| --- | --- |
| State poisoning | Each thread receives a copy.deepcopy() of state at fork time. One thread's hallucination cannot affect another. |
| Merge auditability | Only keys that differ from the baseline are written back. The executor captures every merged key as a state_diff entry in the causal trace. |
| Token budget reconciliation | Each thread's token spend is summed and deducted from the shared budget atomically, preventing budget overruns from parallel spend. |
| Loop protection | Each thread has a MAX_STEPS=50 internal brake. If any branch exceeds it, a WARN is written to state and the thread exits cleanly. |
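
The atomic reconciliation row above can be pictured as lock-guarded accounting. This is an illustrative sketch, not BatchNode's actual internals:

import threading

class TokenBudget:
    """Illustrative lock-guarded budget; not BatchNode's actual implementation."""
    def __init__(self, total: int):
        self.remaining = total
        self._lock = threading.Lock()

    def deduct(self, spent: int) -> None:
        with self._lock:                      # one branch reconciles at a time
            if spent > self.remaining:
                raise RuntimeError("Token budget exceeded by parallel spend")
            self.remaining -= spent

budget = TokenBudget(total=1000)
budget.deduct(150)   # e.g. one parallel branch's spend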

What the Causal Trace shows for a BatchNode step:

{
  "step": 2,
  "node": "BatchNode",
  "state_diff": {
    "added": {
      "bp_analysis": "Blood pressure is hypertensive stage II...",
      "drug_check": "No critical interactions found with metformin..."
    },
    "updated": {
      "token_budget": 850
    }
  },
  "outcome": "success"
}
An auditor can see exactly which branch wrote which key and that the token budget was updated — all in one step entry.


DynamicNode — Runtime Self-Modification with Art. 3(23) Safety Rails

The Lár Solution:

DynamicNode is a metacognitive primitive that asks an LLM to design a subgraph at runtime using a JSON spec. Before executing a single node from that spec, it runs the TopologyValidator — a static analysis layer that enforces three hard constraints:

  1. Cycle detection — DFS traversal of the generated graph to block infinite loops.
  2. Tool allowlist enforcement — Any ToolNode in the spec must be in a pre-approved allowlist. The LLM cannot inject an unapproved tool.
  3. Structural integrity — Every next pointer must reference an existing node in the spec. Dangling references are blocked.

from lar.dynamic import DynamicNode, TopologyValidator

# Only these tools can appear in LLM-generated subgraphs
validator = TopologyValidator(allowed_tools=[fetch_drug_db, write_prescription])

dynamic_step = DynamicNode(
    llm_model="ollama/phi4:latest",
    prompt_template="Design a subgraph to assess {case_summary}",
    validator=validator,          # Enforces the allowlist + cycle check
    next_node=audit_node          # Returns here after subgraph finishes
)
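
Constraint 1's cycle check is a standard depth-first search with grey/black colouring. The sketch below is illustrative rather than the TopologyValidator's actual source, and it assumes a spec format where each node lists its outgoing targets under a next key:

def has_cycle(spec: dict) -> bool:
    """DFS cycle detection over a generated graph spec (illustrative sketch)."""
    edges = {n["id"]: n.get("next", []) for n in spec["nodes"]}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node_id: WHITE for node_id in edges}

    def visit(node_id: str) -> bool:
        colour[node_id] = GREY                    # on the current DFS path
        for target in edges.get(node_id, []):
            if colour.get(target) == GREY:        # back-edge found: cycle
                return True
            if colour.get(target, BLACK) == WHITE and visit(target):
                return True
        colour[node_id] = BLACK                   # fully explored
        return False

    return any(colour[nid] == WHITE and visit(nid) for nid in edges)

spec = {"nodes": [{"id": "n1", "next": ["n2"]}, {"id": "n2", "next": ["n1"]}]}
assert has_cycle(spec)   # n1 -> n2 -> n1 would be rejected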

What this provides for compliance:

| Compliance Concern | DynamicNode Behaviour |
| --- | --- |
| Art. 3(23) CE-marking | TopologyValidator blocks any tool not in the pre-declared allowlist. The legal boundary of the system cannot expand at runtime without a code change (which triggers a new conformity assessment). |
| Audit of self-modification | The __graph_spec_json__ key is written to state (and captured by state_diff) before the subgraph executes. An auditor can read the exact JSON the LLM proposed and which nodes were actually wired. |
| Fallback on rejection | If TopologyValidator rejects the spec, DynamicNode falls through to next_node and writes a REJECTED reason to state — never silently skipping. |
| Fractal safety inheritance | Recursive DynamicNode children inherit the same TopologyValidator instance as the parent. Safety rails cannot be escaped through nested metacognition. |

What the Causal Trace shows for a DynamicNode step:

{
  "step": 1,
  "node": "DynamicNode",
  "state_diff": {
    "added": {
      "__graph_spec_json__": "{\"nodes\": [{\"id\": \"n1\", \"type\": \"LLMNode\", ...}], \"entry_point\": \"n1\"}"
    }
  },
  "outcome": "success"
}
The proposed topology is permanently embedded in the causal trace before it executes. Regulators can verify the agent never operated outside its declared legal envelope.

ComplianceManifestGenerator note: Static pre-execution traversal flags every DynamicNode with a HIGH severity warning — reminding providers that this node requires explicit CE-marking documentation before deployment in a high-risk system.


8. The Foundational Task: Exhaustive Action Inventory (Step 9)

The Problem: The paper's central conclusion states plainly: "The regulatory trigger for each legislative instrument is determined by the agent's external actions, not its internal architecture. The provider's foundational compliance task is therefore an exhaustive inventory of the agent's actions, data flows, connected systems, and affected persons. That inventory is the regulatory map."

Every framework requires this inventory, yet no framework provides tooling to generate it. Developers are left to manually maintain compliance spreadsheets that instantly go stale whenever a node is added or a tool is changed.

The Lár Solution: ComplianceManifestGenerator

Lár provides a built-in static graph analysis utility that automatically generates the regulatory map required by Step 9 of the 12-step compliance sequence.

from lar.compliance import ComplianceManifestGenerator

# Point it at your graph's entry node — no execution required
manifest = ComplianceManifestGenerator(
    start_node=entry_node,
    system_name="Customer Service Agent v1.0"
)

# Save a machine-readable JSON for your notified body
manifest.save("compliance_manifest.json")

# Print a human-readable Markdown report for your compliance officer
print(manifest.as_markdown())

The generator statically traverses the entire graph and produces a structured report containing:

  • Every ToolNode (external action): function name, module, input/output keys, whether it affects third parties, and whether a CredentialVault is attached.
  • Every LLMNode: model name, prompt template preview, fallback configuration.
  • Every RouterNode: branching logic and routes, flagging if none lead to a HumanJuryNode.
  • Every BatchNode: number of parallel branches and their types.
  • Every DynamicNode: flagged as a potential Art. 3(23) substantial modification risk.
  • Automated Risk Flags: HIGH/MEDIUM severity flags for missing CredentialVaults, unguarded DynamicNodes, and unacknowledged Art. 50 third-party obligations.
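
For illustration, a single ToolNode entry in the saved JSON might look like the excerpt below. This is a hypothetical rendering that mirrors the bullet list above; the generator's exact field names may differ:

{
  "node_type": "ToolNode",
  "function": "send_email",
  "module": "tools.email",
  "input_keys": ["draft_body"],
  "output_keys": ["send_receipt"],
  "affects_third_parties": true,
  "credential_vault_attached": false,
  "risk_flags": ["HIGH: ToolNode without a CredentialVault"]
}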

See the full example: examples/compliance/20_compliance_manifest.py


9. The Lethal Trifecta (AEPD Rule of 2)

The Problem: In February 2026, the Spanish Data Protection Authority (AEPD) published the first EU supervisory authority guidance treating the agentic architecture itself — not just its outputs — as the primary object of data protection analysis. The paper identifies the core ruling (Section 7.2): "An agent should not simultaneously combine all three of the following without human oversight — (1) processing untrusted input, (2) accessing sensitive data, and (3) taking autonomous action affecting individuals."

This "lethal trifecta" from Simon Willison / Meta's security framework is now a GDPR-grounded governance criterion. No prior framework has a runtime check for it.

The Lár Solution: LethalTrifectaGuard

A runtime pre-execution guard that evaluates all three legs of the trifecta against the current GraphState before any ToolNode fires.

from lar.compliance import LethalTrifectaGuard, LethalTrifectaError

guard = LethalTrifectaGuard(
    untrusted_input_fn=lambda s: s.get("user_query") is not None,
    sensitive_data_fn=lambda s: s.get("patient_health_data") is not None,
    autonomous_action_fn=lambda s: True,  # ToolNode always modifies external state
    human_approval_state_key="jury_decision",
)

# Raises LethalTrifectaError if all 3 are active without prior HumanJury approval
guard.check(state, action_label="update_patient_record")  # state: the current GraphState

If all three conditions are active and no HumanJuryNode has run upstream (checked via human_approval_state_key), the guard raises LethalTrifectaError with an explicit remediation message and writes the full trifecta evaluation to the state for the Causal Audit Trail.

See the full example: examples/compliance/21_authority_and_trifecta.py


10. The Fourth Tier (Action-Level Authority Records)

The Problem: Section 9, Finding (10) of the paper is the harshest compliance critique of the entire AI governance tooling market:

"The governance tooling market currently comprises three functional tiers: governance platforms, runtime enforcement, and information governance. A fourth tier is absent: infrastructure governing human-agent interaction at runtime... maintaining an immutable oversight record. The absence of this tier is not only a market gap: it is a compliance gap. The essential requirements in Articles 12–14 impose obligations that can only be demonstrated through action-level records of human authority exercise, not through system-level documentation of oversight design."

On its own, the HumanJuryNode was a blocking interrupt: a tier-3 primitive performing a binary policy check. To close the gap, it must also capture who exercised authority, in what role, with what rationale, at what risk score, and at what timestamp — and make that record cryptographically tamper-evident.

The Lár Solution: AuthorityLedger + Upgraded HumanJuryNode

from lar.compliance import AuthorityLedger
from lar import HumanJuryNode

ledger = AuthorityLedger(hmac_secret="your-secret")

jury_node = HumanJuryNode(
    prompt="AI proposes: Hypertension diagnosis. Approve?",
    choices=["approve", "reject"],
    output_key="jury_decision",
    context_keys=["ai_diagnosis", "patient_id"],
    next_node=next_step,
    # --- Fourth Tier fields ---
    authority_ledger=ledger,
    stakeholder_id="[email protected]",
    stakeholder_role="Attending Physician",
    action_description="AI-proposed diagnosis — update patient record",
    risk_score_key="risk_score",  # Pulls from RiskScorerNode output
)

Every human decision now produces an AuthorityRecord containing:

  • Stakeholder identity and role
  • The exact AI-proposed action under review
  • The risk score from the upstream RiskScorerNode
  • The human's decision and their stated rationale (prompted at runtime)
  • A UTC timestamp
  • A snapshot of the relevant context keys
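
A single record in the signed ledger might look like the following. This is a hypothetical excerpt: the values echo the configuration above, and the field names are assumptions rather than the ledger's exact schema:

{
  "stakeholder_id": "[email protected]",
  "stakeholder_role": "Attending Physician",
  "action_description": "AI-proposed diagnosis — update patient record",
  "risk_score": 0.82,
  "decision": "approve",
  "rationale": "Diagnosis consistent with chart review.",
  "timestamp_utc": "2026-04-12T09:31:07Z",
  "context_snapshot": {"ai_diagnosis": "Hypertension", "patient_id": "P-1042"}
}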

The ledger is then saved as an HMAC-SHA256-signed JSON file alongside the AuditLogger's Causal Trace — completing the full evidence chain from action proposal → risk assessment → human determination → execution outcome required by Articles 12–14.

See the full example: examples/compliance/21_authority_and_trifecta.py