The Regulatory Shift Banks Underestimate
SS1/23 is not a guidance document. It is a supervisory statement, which means the Bank of England expects regulated banks to operate consistently with it. The first principle requires firms to have an established definition of a model, an accurate model inventory, and a tiering of model risk. AI and machine learning models are explicitly in scope.
The shift most banks underestimate is that an AI agent is not a single model. A production agent stack typically contains the licensed LLM, a planning layer, a memory store, a tool-calling layer, and sub-agents that themselves call further tools. Under SS1/23 each of these is a model element that has to be inventoried, tiered, validated, and monitored. The inventory line a CRO submits to the model risk committee that says "we use a large language model for credit memo drafting" describes one layer of a stack that may have five or six.
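One way to picture the inventory problem is as a tree walk. The sketch below is illustrative only: the element names, the three-level tiering, and the `flatten` helper are assumptions made for the example, not terms taken from SS1/23.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    # Hypothetical three-level tiering; a firm's own scheme may differ.
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass
class ModelElement:
    name: str   # hypothetical element name
    kind: str   # e.g. "llm", "planner", "memory", "tool", "sub-agent"
    tier: Tier
    children: list["ModelElement"] = field(default_factory=list)


def flatten(element: ModelElement) -> list[ModelElement]:
    """Walk the stack depth-first so nested sub-agents and tools are not missed."""
    found = [element]
    for child in element.children:
        found.extend(flatten(child))
    return found


stack = ModelElement("credit-memo-agent", "agent", Tier.HIGH, [
    ModelElement("licensed-llm", "llm", Tier.HIGH),
    ModelElement("planner", "planner", Tier.MEDIUM),
    ModelElement("memory-store", "memory", Tier.MEDIUM),
    ModelElement("tool-router", "tool", Tier.MEDIUM, [
        ModelElement("ratio-calculator", "tool", Tier.LOW),
    ]),
    ModelElement("covenant-checker", "sub-agent", Tier.HIGH, [
        ModelElement("document-search", "tool", Tier.MEDIUM),
    ]),
])

inventory = flatten(stack)
for item in inventory:
    print(f"{item.tier.name:<6} {item.kind:<10} {item.name}")
```

The single line "we use a large language model" corresponds to one of the eight entries this walk produces for the hypothetical stack.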
By 2026 supervisors will not accept abstract attestations that AI risks are managed. They will ask for the inventory, the validation evidence, the monitoring data, the human oversight records, and the change log. Each will need to be tied to the live runtime behaviour of the agent that produced the decision under review.
Four Runtime Primitives Validation Demands
SS1/23 distinguishes between attestation-grade evidence (a dashboard, a documented policy) and validation-grade evidence (independently verifiable records of what actually happened at runtime). Most AI governance platforms produce the former. The four primitives below produce the latter.
Cryptographically signed agent action records
Every action an agent takes (prompt, tool call, output, refusal) is signed, sequenced, and anchored to a public ledger. The chain cannot be altered after the fact and a supervisor can verify the record independently of the vendor that produced it. This is the difference between trusting a vendor log and verifying a mathematical proof.
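A minimal sketch of the signed, sequenced record idea, under loud assumptions: it uses an HMAC from the Python standard library as a stand-in for a real signature, and it does not anchor anything to a ledger. A real deployment would presumably use an asymmetric scheme so that a supervisor needs only a public key to verify. The function names and record fields are hypothetical.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-production"  # assumption: stand-in for a private key
GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: dict) -> bytes:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).digest()


def append_record(chain: list[dict], action: str, payload: dict) -> dict:
    """Append a record that commits to its sequence number and its predecessor."""
    body = {
        "seq": len(chain),
        "action": action,
        "payload": payload,
        "prev": chain[-1]["hash"] if chain else GENESIS,
    }
    digest = _digest(body)
    record = {**body, "hash": digest.hex(),
              "sig": hmac.new(SIGNING_KEY, digest, hashlib.sha256).hexdigest()}
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash and signature and check each link to its predecessor."""
    prev = GENESIS
    for seq, record in enumerate(chain):
        body = {k: record[k] for k in ("seq", "action", "payload", "prev")}
        digest = _digest(body)
        expected_sig = hmac.new(SIGNING_KEY, digest, hashlib.sha256).hexdigest()
        if record["seq"] != seq or record["prev"] != prev:
            return False
        if record["hash"] != digest.hex():
            return False
        if not hmac.compare_digest(record["sig"], expected_sig):
            return False
        prev = record["hash"]
    return True


chain: list[dict] = []
append_record(chain, "prompt", {"text": "Draft credit memo"})
append_record(chain, "tool_call", {"tool": "ratio-calculator"})
append_record(chain, "output", {"chars": 1840})
print(verify_chain(chain))  # True

chain[1]["payload"]["tool"] = "something-else"  # tamper with a past record
print(verify_chain(chain))  # False
```

Because each record's hash covers the previous record's hash, editing any earlier entry invalidates that entry and breaks the link from its successor.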
Human delegation provenance
When a human authorises an agent to take an action on their behalf, the delegation is captured as a signed token: what authority, to which agent, for which task, under what policy, with what expiry. When the agent sub-delegates to a tool or another agent, the chain is captured recursively. This is the SS1/23 human oversight record at validation grade.
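The recursive chain can be sketched as a linked list of delegations with an attenuation rule: a sub-delegation may narrow authority and shorten expiry, never widen either. This is an illustrative model only. The `Delegation` fields mirror the list in the paragraph above, but the class, the `chain_is_valid` check, and the identifiers are hypothetical, and signing of each token is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    issuer: str                  # who grants the authority
    subject: str                 # which agent or tool receives it
    task: str
    authority: frozenset[str]    # hypothetical permission strings
    policy: str
    expires: datetime
    parent: Optional["Delegation"] = None  # link back towards the human


def chain_is_valid(d: Optional[Delegation], now: datetime) -> bool:
    """Each link must be unexpired, issued by the holder of its parent,
    and must not widen the parent's authority or outlive it."""
    while d is not None:
        if now >= d.expires:
            return False
        p = d.parent
        if p is not None:
            if d.issuer != p.subject:
                return False
            if not d.authority <= p.authority:
                return False
            if d.expires > p.expires:
                return False
        d = p
    return True


now = datetime(2026, 1, 15, 9, 0, tzinfo=timezone.utc)
root = Delegation("human:analyst-17", "agent:credit-memo", "draft-memo",
                  frozenset({"read:financials", "draft:memo"}),
                  "policy-v1", now + timedelta(hours=8))
child = Delegation("agent:credit-memo", "tool:ratio-calculator", "compute-ratios",
                   frozenset({"read:financials"}),
                   "policy-v1", now + timedelta(hours=1), parent=root)
widened = Delegation("agent:credit-memo", "tool:mailer", "send-memo",
                     frozenset({"read:financials", "send:email"}),
                     "policy-v1", now + timedelta(hours=1), parent=root)

print(chain_is_valid(child, now))    # True
print(chain_is_valid(widened, now))  # False: "send:email" was never delegated
```

Walking from any leaf back to the root answers the oversight question directly: which human authorised this action, and through which intermediate agents.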
Knowledge source controls and access logs
Every knowledge source an agent could access (internal documents, regulatory standards, web search, customer data) is logged at the policy layer with what was permitted and what was actually accessed. SS1/23 paragraph 3.3 expects data provenance for model outputs. Knowledge access logs make that provenance auditable.
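As a rough illustration of logging "what was permitted and what was actually accessed", the sketch below routes every request through a policy object that records both allow and deny decisions. The `KnowledgePolicy` class, its method names, and the source identifiers are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class KnowledgePolicy:
    permitted: set[str]                       # sources the agent may use
    log: list[dict] = field(default_factory=list)

    def request(self, agent: str, source: str) -> bool:
        """Decide at the policy layer and log the decision either way."""
        allowed = source in self.permitted
        self.log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "source": source,
            "decision": "allow" if allowed else "deny",
        })
        return allowed

    def accessed(self) -> set[str]:
        """Sources that were actually used, as opposed to merely permitted."""
        return {e["source"] for e in self.log if e["decision"] == "allow"}

    def unused(self) -> set[str]:
        """Permitted sources the agent never touched."""
        return self.permitted - self.accessed()


policy = KnowledgePolicy({"internal-docs", "regulatory-standards"})
policy.request("agent:credit-memo", "internal-docs")  # allowed
policy.request("agent:credit-memo", "web-search")     # denied, still logged

print(sorted(policy.accessed()))  # ['internal-docs']
print(sorted(policy.unused()))    # ['regulatory-standards']
```

Keeping denied requests in the same log is a deliberate choice here: it lets a reviewer show that a disabled source stayed disabled when the agent tried to reach it.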
Selective disclosure for supervisor sharing
Supervisors do not need raw customer data. They need cryptographic proof that controls operated correctly. BBS+ selective disclosure produces evidence packs that prove specific facts (this PII field was redacted, this human authorised this action, this knowledge source was disabled at the time) without exposing the underlying data. The supervisor verifies the proof against the public ledger.
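The idea of proving selected facts without revealing the rest can be illustrated with a much simpler technique than BBS+: salted hash commitments. The sketch below is explicitly not BBS+ and lacks its properties (for example, these digests are linkable across presentations). It also uses an HMAC as a stand-in for the issuer's signature and does not touch a ledger. All names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

ISSUER_KEY = b"demo-key-not-for-production"  # assumption: stand-in for a signing key


def _commit(name: str, value, salt: str) -> str:
    # The random salt stops a verifier from guessing undisclosed values.
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def issue(claims: dict) -> tuple[dict, dict]:
    """Sign a list of per-claim commitments instead of the claims themselves."""
    salts = {name: secrets.token_hex(16) for name in claims}
    digests = sorted(_commit(n, v, salts[n]) for n, v in claims.items())
    sig = hmac.new(ISSUER_KEY, json.dumps(digests).encode(), hashlib.sha256).hexdigest()
    return {"digests": digests, "sig": sig}, salts


def disclose(claims: dict, salts: dict, names: list[str]) -> list[dict]:
    """Reveal only the chosen claims, each with the salt needed to check it."""
    return [{"name": n, "value": claims[n], "salt": salts[n]} for n in names]


def verify(credential: dict, disclosed: list[dict]) -> bool:
    expected = hmac.new(ISSUER_KEY, json.dumps(credential["digests"]).encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(credential["sig"], expected):
        return False
    return all(_commit(d["name"], d["value"], d["salt"]) in credential["digests"]
               for d in disclosed)


claims = {
    "pii_redacted": True,
    "authorised_by": "human:analyst-17",
    "customer_name": "never-shown-to-verifier",
}
credential, salts = issue(claims)
proof = disclose(claims, salts, ["pii_redacted", "authorised_by"])
print(verify(credential, proof))  # True

proof[0]["value"] = False  # a forged claim no longer matches any commitment
print(verify(credential, proof))  # False
```

The verifier sees the two disclosed facts and an opaque digest for `customer_name`, which is the basic shape of a selective disclosure evidence pack.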
Mapping the Five Principles to Runtime Evidence
SS1/23 organises model risk management around five principles. Each one has a runtime evidence requirement that the four primitives above can satisfy.
Principle 1: Model identification and inventory
A complete inventory of every model element in the agent stack, including sub-agents and tool calls, with tiering by risk. Runtime primitives provide the signed identity for each agent and the signed record of every action it produced.
Principle 2: Governance
Clear roles, responsibilities, and accountability across the model lifecycle. Human delegation provenance captures who authorised what, to which agent, under what policy. This is the governance record at validation grade.
Principle 3: Model development, implementation, and use
Data quality, model design choices, and use restrictions documented and enforced at runtime. Knowledge source controls produce the data provenance evidence supervisors expect. Use restrictions are enforced by the policy layer and logged per request.
Principle 4: Independent model validation
Independent challenge that does not rely on the vendor or the development team. Cryptographic signatures on the public ledger let validators verify the record without trusting our logs. Selective disclosure means they get the evidence they need without raw data exposure.
Principle 5: Model risk mitigants
Documented mitigants for residual risk, with monitoring that detects drift or breach. The runtime audit trail produces the change log automatically. When mitigants are challenged, selective disclosure proof packs provide evidence that they operated correctly.
UK Patent Application GB2604344.8, filed 27 February 2026, covers the cryptographic disclosure provenance architecture underlying these primitives. The same architecture supports EU AI Act Article 12 logging and Article 14 human oversight obligations, giving a single deployment dual coverage.
Where to Start
The implementation pattern that works is narrow scope first, then scale. Pick one high-risk workflow (credit memo drafting, customer correspondence, AI-assisted underwriting) and instrument it end-to-end with the four primitives. Produce a working SS1/23 proof pack from the runtime data within weeks.
Once the model risk committee accepts the proof pack on a narrow scope, the same instrumentation scales across the agent estate. The cost of getting started is measured in weeks of focused work, not 6-month consulting projects. SS1/23 evidence is a feature of the runtime architecture, not a deliverable from a separate compliance programme.
What is not viable is to wait. The Bank of England has indicated that supervisory reviews of AI in 2026 will distinguish between banks that can demonstrate validation-grade evidence and those that cannot. The expectation has moved from "we are working on it" to "we have it running today on this workflow".
Frequently Asked Questions
What is PRA SS1/23?
PRA SS1/23 is the supervisory statement on model risk management principles for banks, issued by the Bank of England's Prudential Regulation Authority. It elevates model risk to a primary risk discipline alongside credit and market risk and explicitly brings AI and machine learning models into scope. The five principles cover identification, governance, development and use, validation, and mitigants. The statement took effect on 17 May 2024, with full implementation expected through 2026.
Does SS1/23 apply to AI agents specifically?
Yes. SS1/23 paragraph 1.5 defines a model broadly to include any quantitative method, system, or approach that uses statistical, economic, financial, or mathematical theories. AI and machine learning models are explicitly in scope. As banks deploy autonomous AI agents that chain reasoning steps, call tools, and act on customer data, those agents become models under SS1/23 and inherit the full set of model risk obligations. Sub-agents and tool calls add layers of risk that supervisors expect to be inventoried, documented, and monitored.
What evidence does a supervisor actually want?
Validation-grade evidence, not attestation. Attestation grade is a dashboard saying compliance is green. Validation grade is signed, timestamped, tamper-evident records that an auditor can verify independently. SS1/23 supervisory teams want to see model inventories that are complete, model development and validation documentation that ties to live runtime behaviour, evidence of human oversight at the moments it mattered, and the data lineage that produced a given output. Most AI governance platforms produce attestation; few produce validation.
What primitives close the gap between dashboards and validation?
Four primitives, applied at the runtime layer. First, cryptographic signatures on every agent action, sequenced and anchored to a public ledger so the chain cannot be altered after the fact. Second, human delegation provenance that captures who authorised what authority, to which agent, for which task, under what policy. Third, knowledge source controls that log what an agent could access and what it actually accessed. Fourth, selective disclosure so banks can share targeted evidence with supervisors without exposing customer data. Together these turn model risk reports into mathematically verifiable artefacts.
What about the EU AI Act overlap?
The EU AI Act and SS1/23 overlap meaningfully for credit-related and customer-decisioning AI. EU AI Act Article 6 and Annex III classify credit scoring as high-risk. Articles 8 to 15 then prescribe a risk management system, data governance, technical documentation, logging, transparency, human oversight, accuracy, robustness, and security. The runtime primitives that produce SS1/23 evidence also support Article 12 logging and Article 14 human oversight. A bank that builds for SS1/23 with the right primitives gets substantial EU AI Act coverage in the same architecture.
Where do regulated banks start?
Start with a single high-risk workflow, instrument it end-to-end with the four primitives, and produce a working SS1/23 proof pack from real runtime data. Credit memo drafting, AI-assisted underwriting, and customer correspondence are typical first workflows. The aim is to demonstrate validation-grade evidence to your model risk committee on a narrow scope, then scale the same instrumentation across the agent estate. Self-service deployment matters because SS1/23 evidence cannot wait on 6-month consulting projects.