Dynamics Blindness: When AI Is Locally Correct and Globally Non-Compliant

A single change, three hops

One permission change. A breach the agent never saw.

An AI agent inside an investment bank gets a routine request from the Head of Compliance: widen a senior trader's access so she can review her own desk's exposure. One flag changes in the access-management system. The agent reads the request, makes the change, and returns a clean confirmation. Locally, the action is correct.

Then the cascade runs.

Action (locally correct): Permission widened. A senior trader's access is extended so she can review her desk's exposure. One flag changes.

→ Hop 1: Data-bus exposure. The risk module shares a data bus with order management. She can now see the pre-trade limits her own orders are checked against.

→ Hop 2: Control breach. Execution authority plus visibility into the risk parameters that govern it violates segregation of duties.

→ Hop 3 (breach): Conflict-of-interest duties. The conflict engages MiFID II duties (Art. 16(3), Art. 23): management, escalation, and record-keeping the firm cannot discharge for a conflict it never detected.

Each hop is correct in isolation. The breach is three systems away, in dependencies the agent has no model of.

The agent understood the instruction. It performed the action. It wrote a grammatical confirmation. It was blind to every consequence that followed.

Change the industry and the shape holds. In insurance, a claims approval trips a reserve adjustment that breaches a reinsurance treaty threshold, and the agent never sees the 48-hour notice it just triggered two systems away.

An architectural failure mode, not a bug

LLMs process tokens. Organisations run on causation.

Dynamics blindness

The inability to trace how a change propagates through interconnected systems, and to reason about what happens when an action meets the rules already in force. A property of the model itself; training alone does not remove it.

A language model predicts the next token from the sequence in front of it. It does that well enough to identify the user, parse the request, and update the permission correctly. What it cannot do is hold a model of the bank's dependencies and trace a chain from permission granted, through data-bus exposure, through role conflict, to the regulatory duties it triggers. That chain is causal reasoning - A enables B, B under condition C produces D - and token prediction does not perform it.
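
As a minimal sketch of what tracing that chain involves, the Python below encodes the trading-floor cascade as conditional cause-and-effect edges and walks them breadth-first. The rule names, the `RULES` table, and the `has_execution_authority` flag are illustrative assumptions, not a description of any real access-management system or of the paper's formalism.

```python
from collections import deque

# Hypothetical dependency edges: (cause, effect, condition).
# A condition is a predicate over the current state; None means "always holds".
RULES = [
    ("permission_widened", "risk_params_visible", None),
    ("risk_params_visible", "segregation_of_duties_breach",
     lambda state: state.get("has_execution_authority", False)),
    ("segregation_of_duties_breach", "mifid_conflict_duties_engaged", None),
]


def trace_cascade(action, state):
    """Breadth-first trace of every consequence reachable from an action.

    Returns a list of (hop_number, consequence) in the order discovered.
    """
    seen = {action}
    queue = deque([(action, 0)])
    hops = []
    while queue:
        node, depth = queue.popleft()
        for cause, effect, condition in RULES:
            if cause != node or effect in seen:
                continue
            # "B under condition C produces D": skip edges whose condition fails.
            if condition is not None and not condition(state):
                continue
            seen.add(effect)
            hops.append((depth + 1, effect))
            queue.append((effect, depth + 1))
    return hops


trader = {"has_execution_authority": True}
for hop, consequence in trace_cascade("permission_widened", trader):
    print(f"hop {hop}: {consequence}")
```

With execution authority set, the trace reports all three hops. Without it, the walk stops at the data-bus exposure, because the condition on the second edge is not met. The sketch only works because the dependencies exist as explicit, traversable state, which is exactly what the agent in the example did not have.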

Dynamics blindness is not hallucination. Hallucination is being wrong about a fact, and better grounding reduces it. Dynamics blindness sits on a different axis: the agent can be factually correct at every step and still miss the cascade. Put the segregation-of-duties policy in its context window and the model can recite it on request. The model has the policy. What it lacks is any mechanism for tracing the live chain that makes the policy bite.

This is a structural blind spot, not a performance gap, and it does not close with the next model release. Every weld passes inspection; the bridge still comes down, because no one holds the drawing of the whole structure.

What it costs

In regulated environments, locally correct is not good enough.

The cost is already visible at industry scale, even if no single figure isolates this failure mode. In 2025, 42% of companies scrapped most of their AI initiatives, up from 17% a year earlier (S&P Global Market Intelligence). Gartner expects more than 40% of agentic AI projects to be cancelled by the end of 2027. About 6% of organisations qualify as AI high performers (McKinsey). These outcomes have many causes - cost, data quality, unclear return - but in regulated settings, where a locally correct action can still breach a rule, dynamics blindness is a major and under-diagnosed one. The exposure concentrates where decisions carry legal weight: financial services, healthcare, insurance, critical infrastructure.

In a consumer app, a locally-correct, globally-wrong answer is a bad experience; the user asks again. In a regulated institution, the same answer is a recordable event. A decision that looks right in its immediate context violates a rule operating three or four hops away, and the violation is invisible to the system that caused it. Regulators do not grade on averages. A system that produces the right causal chain 90% of the time is still non-compliant, because no one can say in advance which decisions fall in the other 10%.
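
A rough worked example shows why averages do not help. Assume, purely for illustration, that each decision independently gets the causal chain right with probability 0.90. The chance that at least one decision in a batch of n misses is then 1 - 0.90^n. The independence assumption and the function name are illustrative, not taken from the paper.

```python
def chance_of_at_least_one_miss(per_decision_accuracy, decisions):
    """Probability that at least one of `decisions` independent decisions is wrong."""
    return 1 - per_decision_accuracy ** decisions


for n in (1, 10, 50):
    print(n, round(chance_of_at_least_one_miss(0.90, n), 3))
```

Under these assumptions the chance of at least one miss passes 65% by ten decisions and is close to certainty by fifty, and nothing in the figure says which decisions are the ones that missed.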

Locally correct. Globally non-compliant. The agent executed perfectly, and never saw the breach.

Why the usual fixes miss

More scale, more retrieval, more agents - none of it adds causal infrastructure.

The reflex is to reach for the next upgrade. None of the common ones close the gap.

Scale and emergence. Every capability that has emerged at scale is pattern recognition over the training distribution. Your dependency graph at this moment - who holds which position, what the exposure is, which feeds are live - is not a pattern in any corpus. It is the state of a live system. Scaling does not hand a model facts it never saw.

Retrieval and GraphRAG. Better retrieval gives the model better inputs. It does not change the mechanism underneath, which is still token prediction. An agent that traces the chain correctly 95% of the time still cannot tell you which 5% it missed - and in a regulated decision, that is the whole question.

More agents. If each agent is dynamics-blind, the swarm is too. One agent changes the permission, another watches compliance, a third files the report, and none of them holds the graph that connects their domains. More agents multiply action without adding awareness - faster cascades through the same unmapped dependencies. That is one concrete route to the cancellations Gartner forecasts.

Underneath all three is a distinction regulators already enforce: the difference between a plausible-looking answer and a chain of reasoning an independent reviewer can reproduce. The Federal Reserve's SR 11-7 guidance on model risk calls it conceptual soundness. The EU AI Act calls it traceability. Probabilistic correctness and guaranteed auditability come from different architectures.

The resolution is architectural

Govern what the agent may act on, before it acts.

If the blindness is architectural, so is the resolution. The agent needs a causal model it can consult as a service - infrastructure beneath it that traces consequences through the organisation's dependencies and tests whether an action is admissible before it executes: deterministically where the rules are determinate, and escalating where the evidence is incomplete.

That substrate is the Knowledge Layer - the layer between the data governance below and the agent action above. It holds organisational knowledge as claims, each carrying provenance, decay, and a computed standing on a 0-100 spectrum rather than a binary true-or-false. Four jobs maintain it: Capture and Structure build the claim graph; Govern and Serve make it admissible. Govern runs across the other three as a continuous layer.
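
To make the unit concrete, here is a minimal sketch of a claim that carries provenance, decay, and a computed standing on the 0-100 scale. The field names, the exponential half-life decay, and the example values are assumptions chosen for illustration; the source does not specify how standing is computed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Claim:
    statement: str
    source: str             # provenance: where the claim came from
    asserted_at: datetime   # when it was last confirmed
    base_standing: float    # 0-100 at the moment of assertion (assumed input)
    half_life_days: float   # assumed decay parameter

    def standing(self, now):
        """Standing on the 0-100 scale after decay (hypothetical formula)."""
        age_days = max((now - self.asserted_at).total_seconds() / 86400, 0)
        decayed = self.base_standing * 0.5 ** (age_days / self.half_life_days)
        return max(0.0, min(100.0, decayed))


claim = Claim(
    statement="Risk module and order management share a data bus",
    source="architecture review (hypothetical)",
    asserted_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
    base_standing=90.0,
    half_life_days=30.0,
)
print(claim.standing(datetime(2026, 1, 31, tzinfo=timezone.utc)))  # one half-life later
```

A claim confirmed thirty days ago with a thirty-day half-life has half its original standing. Whatever the real formula, the design point is that standing is computed at the time of use rather than stored as a fixed true-or-false.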

The mechanism that meets the trading-floor case is the epistemic circuit breaker. It sits at the boundary between assembled context and the agent's action. Before the agent acts, it checks whether the claims in the reasoning chain clear the minimum standing for that action's consequence tier, under the required authority - a regulatory filing demands more than an internal lookup. If the chain falls short, the action is blocked and the system returns a structured escalation: which claims caused the halt, what is missing, what was stopped. It fails loud. The permission change never reaches the data bus.
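
The check described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the tier names, the thresholds in `MIN_STANDING`, the claim identifiers, and the `Decision` structure are all hypothetical, and the authority check is left out for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical minimum standing (0-100) per consequence tier.
MIN_STANDING = {"internal_lookup": 40, "access_change": 70, "regulatory_filing": 90}


@dataclass
class Decision:
    allowed: bool
    action: str
    tier: str
    failing_claims: list = field(default_factory=list)  # present but too weak
    missing: list = field(default_factory=list)         # required but absent


def circuit_breaker(action, tier, chain, required):
    """Gate an action on the standing of the claims behind it.

    chain: dict mapping claim id -> current standing (0-100).
    required: claim ids the action needs in its reasoning chain.
    """
    threshold = MIN_STANDING[tier]
    missing = [c for c in required if c not in chain]
    failing = [c for c, standing in chain.items() if standing < threshold]
    allowed = not missing and not failing
    return Decision(allowed, action, tier, failing, missing)


decision = circuit_breaker(
    action="widen_trader_access",
    tier="access_change",
    chain={"trader_role_verified": 95, "bus_isolation_confirmed": 35},
    required=["trader_role_verified", "bus_isolation_confirmed", "sod_check_passed"],
)
print(decision)
```

Here the action is blocked, and the returned structure names the claim that fell below the threshold and the claim that was never supplied. Returning a structured result instead of a bare refusal is what lets the escalation say which claims caused the halt, what is missing, and what was stopped.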

The layers, top to bottom:

Agent action: players acting under supervision, where the consequences land.

Epistemic circuit breaker: standing vs consequence, before the act.

The Knowledge Layer: governed claims · provenance · decay · 0-100 standing · four jobs (Capture, Structure, Govern, Serve).

Data governance: deterministic records · lineage · controls; the substrate the firm already runs.

The governed middle the diagnosis points to. The circuit breaker sits at its top edge: before an agent acts, it checks that the claims behind the action clear the standing the consequence demands.

None of these pieces is exotic. A decision point that gates an enforcement point is how access control has worked for two decades; a claim graph is a knowledge graph held to a higher standard. What is new is the unit and the test - claims that carry provenance, decay, and a computed standing, and a check that runs before the action and asks whether the evidence behind it is strong enough for the consequence at stake. Retrieval asks what is relevant; this asks what is admissible.

None of it works unless the dependencies are written down - the data-bus link, the segregation rule, the conflict trigger have to exist as governed claims the substrate can traverse. Regulated firms already hold the raw material: the data lineage BCBS 239 demands, the dependency maps DORA mandates, the role models behind segregation of duties. But they hold it as static audit documentation - fragmented, point-in-time, built to be read rather than traversed. Making it computable, and keeping it current, is the core work - a programme in its own right. That is the work the Knowledge Layer organises, and where dynamics blindness stops being a diagnosis and becomes the entry point to Governed Intelligence.

Where to start

Read the diagnosis, then the architecture.

The full case for dynamics blindness, the architecture it points to, and the programme behind both.

Paper A · The diagnosis in full

Dynamics Blindness: When AI Is Locally Correct and Globally Non-Compliant

The working paper this page distils. It names and formalises the failure mode, reviews the empirical evidence from neuro-symbolic and process-mining research, and sets out the architectural properties any fix must have.

Reichhart, W. & Gelas, A. (2026)

Available on SSRN.

The architecture

Governed Intelligence

The architecture the diagnosis points to, and the staged journey from operational visibility to governed agent operations - where the Knowledge Layer and the epistemic circuit breaker live.

governedintelligence.eu →

The programme

Five papers, five manifestos

How the diagnosis resolves - architecture, theoretical foundations, methodology, and the move from autonomy to initiative. Co-authored with Arnaud Gelas. Open access on SSRN.
