ADR-0089: Deterministic compliance scoring + entity-risk one-way ratchet + fail-closed re-screen
Date: 2026-07-03
Status: Accepted
Deciders: Adrian (Soft4U), Claude Opus 4.8
Related: ADR-0020 (EBA weighted-max risk matrix), ADR-0066 (live risk-paced UBO re-screening / entity baselines), ADR-0067 (fail-closed / never report clear — the "not assessed" contract), ADR-0070 (maker-checker / four-eyes), ADR-0075 (document-content adverse analysis — docs can RAISE risk), ADR-0082 (post-validation calibration & integrity hardening — the recalc never-suppress guard, the same class of defect at the case layer), PR #176 (display-risk one-way ratchet reconcile_display_risk)
ADR-number note: the design (docs/superpowers/specs/2026-07-03-determinism-risk-ratchet-design.md §7) and plan reserved "ADR-0083" as a placeholder; between authoring and implementation the AMLA monitoring-framework work landed and consumed ADR-0083 through ADR-0088. Per the spec's explicit instruction ("reserve next free number at authoring time"), this decision is recorded as ADR-0089, the next free number, to keep the register append-only and non-colliding.
Context
A live OB Holding 1 OÜ (Estonian PSP, registrikood 14975047) re-run assessed the same
legal entity on identical inputs as CRITICAL/90 on one run (case_d9d389cf06bb, 4
criminal/EPPO findings surfaced → composite floored to 90) and medium/51 on another
(case_a69b80f57d51, 1 finding, too weak to floor). For a compliance engine that is
unacceptable on its own; worse, the lower assessment silently overwrote the entity's
persisted baseline.
Root causes (evidence-audited, systematic-debugging Phase 1–2):
- Baseline upsert blindly overwrites. entity_baseline_service.upsert_baseline used an on_conflict_do_update that set latest_risk_score/latest_risk_tier unconditionally: no floor, no comparison to the established value. A lower-recall re-screen silently downgraded an entity's persistent risk. This directly violates the project's cardinal principle (mission memory, ADR-0067): the system may ADD scrutiny but must NEVER suppress a risk signal.
- Scoring agents sample. synthesis_agent ran at temperature=0.1 (mcc=0.1, case_intelligence=0.2, memo=0.3, finding_debugger=0.1, task_generator=0.1, belgian=0.1). Even on identical evidence the score could drift across runs.
- No fail-closed on a retrieval gap. When adverse-media/SERP retrieval hit a data gap (the criminal article was not retrieved), the score dropped without a hard floor or a prominent "not assessed" signal: a false negative on a criminal finding presented as a confident medium.
The inherent limit is honest: raw live-web retrieval recall (BrightData/Tavily/Google) cannot be made deterministic. This decision does not pretend otherwise — it removes the two axes that have no excuse (LLM sampling, blind persistence) and makes retrieval gaps fail closed so a recall miss can never lower an entity's risk.
Decision
Enforce one invariant:
Per-entity risk is monotonic absent an explicit, audited downgrade. A re-screen may raise or maintain an entity's risk; it may never silently lower it. The only path down is an officer decision through maker-checker (ADR-0070), recorded in audit_events.
Five components enforce it:
A — Deterministic scoring (temperature=0). Pin every agent whose output is persisted or
shown to a regulator to temperature=0 (greedy decoding): synthesis_agent, mcc_classifier,
case_intelligence_agent, memo_justification_agent, finding_debugger, task_generator,
belgian_agent. dashboard_agent stays at 0.7 (interactive chat UX, no persisted verdict).
A guard unit test asserts each compliance-output agent's ModelSettings.temperature == 0 so a
future edit cannot silently re-introduce sampling. Anthropic/OpenAI honour 0 as greedy
decoding; that is sufficient — the residual is documented, not hidden.
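The guard test described in Component A could look roughly like the sketch below. It assumes a hypothetical registry, COMPLIANCE_AGENT_SETTINGS, mapping each agent name to a plain settings dict; the real project presumably reads the value from each agent's ModelSettings, so the registry and the helper name sampling_agents are illustrative stand-ins, not the actual code.

```python
# Hypothetical registry: agent name -> model settings. In the real codebase
# these values would come from each agent's ModelSettings; this dict is a
# stand-in so the sketch runs on its own.
COMPLIANCE_AGENT_SETTINGS = {
    "synthesis_agent": {"temperature": 0},
    "mcc_classifier": {"temperature": 0},
    "case_intelligence_agent": {"temperature": 0},
    "memo_justification_agent": {"temperature": 0},
    "finding_debugger": {"temperature": 0},
    "task_generator": {"temperature": 0},
    "belgian_agent": {"temperature": 0},
}

# Interactive chat agent, deliberately excluded from the guard.
EXEMPT_AGENT_SETTINGS = {"dashboard_agent": {"temperature": 0.7}}


def sampling_agents(settings_by_agent):
    """Return the names of agents whose temperature is missing or non-zero."""
    return sorted(
        name
        for name, settings in settings_by_agent.items()
        if settings.get("temperature") != 0
    )


def test_compliance_agents_are_greedy():
    # Fails, naming the offending agents, if sampling is re-introduced.
    assert sampling_agents(COMPLIANCE_AGENT_SETTINGS) == []
```

Treating a missing temperature as a failure is a deliberate choice in this sketch: an agent that falls back to a provider default would otherwise pass the guard while still sampling.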
B — Entity-risk one-way ratchet (core). Mirror reconcile_display_risk (PR #176, which
ratchets a single case's display risk up) at the persistent baseline layer. A new pure
function reconcile_baseline_risk(established, incoming) -> BaselineDecision compares tier
rank (critical>high>medium>low>clear) then score: on raise-or-maintain the incoming value
becomes effective; on a would-be downgrade the established value is held, the incoming raw
run is recorded in last_run_risk_*, and a divergence_state payload is stored (pending
audited downgrade). upsert_baseline calls this before writing; the on_conflict_do_update
set_ applies the reconciled decision, never the raw incoming values. next_review is
computed from the effective tier — a downgrade can never loosen review cadence either.
The existing latest_risk_score/latest_risk_tier columns are the effective/established
value (one source of truth, no reader migration).
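A minimal sketch of the ratchet comparison in Component B follows. The function name and the BaselineDecision return type come from this ADR; the Risk dataclass, the BaselineDecision field names, and the shape of the divergence_state payload are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Tier ordering from the ADR: critical > high > medium > low > clear.
TIER_RANK = {"clear": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass(frozen=True)
class Risk:  # hypothetical value object, not a project type
    tier: str
    score: int


@dataclass(frozen=True)
class BaselineDecision:  # field names are assumptions
    effective: Risk                   # goes to latest_risk_tier / latest_risk_score
    last_run: Risk                    # raw incoming run, goes to last_run_risk_*
    divergence_state: Optional[dict]  # set only when a downgrade was held


def _key(risk: Risk) -> tuple:
    # Compare tier rank first, then score.
    return (TIER_RANK[risk.tier], risk.score)


def reconcile_baseline_risk(
    established: Optional[Risk], incoming: Risk
) -> BaselineDecision:
    """Hold the established risk on a would-be downgrade; otherwise accept incoming."""
    if established is None or _key(incoming) >= _key(established):
        return BaselineDecision(incoming, incoming, None)
    divergence = {
        "status": "pending_audited_downgrade",
        "held": {"tier": established.tier, "score": established.score},
        "incoming": {"tier": incoming.tier, "score": incoming.score},
    }
    return BaselineDecision(established, incoming, divergence)
```

Applied to the two runs from the Context section, an established critical/90 followed by an incoming medium/51 keeps critical/90 as the effective value, records medium/51 as the last run, and produces a divergence payload. The reverse order simply raises the baseline with no divergence.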
C — Material-findings persistence & re-injection. (Component; scheduled for a later task
in this plan.) Material findings (criminal/enforcement/sanctions/adverse_media/freeze/
regulatory_action) are fingerprinted (stable hash over type + subject + normalized claim),
persisted on the baseline (established_findings, union-only, never subtract) and re-injected
into a re-screen that misses them, so the established floor re-applies.
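The fingerprinting and union-only persistence in Component C might be sketched as below. The ADR specifies only "stable hash over type + subject + normalized claim"; the choice of SHA-256, the normalisation rules, the dict shape of a finding, and all function names here are assumptions.

```python
import hashlib
import re
import unicodedata


def normalize_claim(text: str) -> str:
    """Assumed normalisation: Unicode-fold, lowercase, drop punctuation, collapse spaces."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    folded = re.sub(r"[^\w\s]", " ", folded)
    return " ".join(folded.split())


def fingerprint(finding_type: str, subject: str, claim: str) -> str:
    """Stable SHA-256 over type + subject + normalized claim."""
    parts = [
        finding_type.strip().casefold(),
        normalize_claim(subject),
        normalize_claim(claim),
    ]
    # Unit-separator join so field boundaries cannot blur into each other.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def merge_established(established: list, incoming: list) -> list:
    """Union-only merge keyed by fingerprint: entries are added, never removed."""
    merged = {f["fingerprint"]: f for f in established}
    for finding in incoming:
        merged.setdefault(finding["fingerprint"], finding)
    return list(merged.values())


def missing_from_run(established: list, run_findings: list) -> list:
    """Established findings the current run did not surface: re-injection candidates."""
    seen = {f["fingerprint"] for f in run_findings}
    return [f for f in established if f["fingerprint"] not in seen]
```

A re-screen that returns no findings leaves the established set unchanged, and missing_from_run returns every established finding for re-injection, so the floor those findings justified applies again.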
D — Fail-closed on material-check data-gap. (Component; later task.) A data-gapped
material check sets a structured material_check_incomplete flag, penalises confidence, renders
the ADR-0067 "not assessed" banner, and hard-floors the reconciled risk — a data-gap can never
enable a downgrade.
E — Audited downgrade path. (Component; later task.) The only way risk goes down: a
MonitoringAlert(trigger_type=risk_divergence, priority=high) is raised on Component-B
divergence (append-only enum member, monitoring_alerts.trigger_type is String(50) — no DB
migration for the member). Lowering the baseline requires an explicit officer decision routed
through maker-checker (ADR-0070): a second, different approver; audit_events records
risk_downgrade_approved with maker, checker, and reason.
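The four-eyes rule in Component E can be illustrated with the sketch below. Only the event name risk_downgrade_approved and the maker/checker/reason fields come from this ADR; the AuditEvent dataclass, the MakerCheckerError exception, and the function name are hypothetical and do not describe the existing maker_checker module.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


class MakerCheckerError(ValueError):
    """Raised when a downgrade approval violates the four-eyes rule."""


@dataclass(frozen=True)
class AuditEvent:  # hypothetical shape of an audit_events entry
    event_type: str
    maker: str
    checker: str
    reason: str
    recorded_at: datetime


def approve_risk_downgrade(maker: str, checker: str, reason: str) -> AuditEvent:
    """Build the audit record for an approved downgrade; require two distinct approvers."""
    maker, checker, reason = maker.strip(), checker.strip(), reason.strip()
    if not maker or not checker:
        raise MakerCheckerError("maker and checker are both required")
    if maker.casefold() == checker.casefold():
        raise MakerCheckerError("checker must be a different person from the maker")
    if not reason:
        raise MakerCheckerError("a downgrade needs a recorded reason")
    return AuditEvent(
        event_type="risk_downgrade_approved",
        maker=maker,
        checker=checker,
        reason=reason,
        recorded_at=datetime.now(timezone.utc),
    )
```

Comparing identities case-insensitively is an assumption of this sketch; a real implementation would more likely compare stable user IDs rather than display names.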
Schema (Alembic migration 080, additive, down_revision 079). Five columns on
entity_baselines, all nullable/defaulted (safe online, no backfill — latest_* already holds
the effective value): last_run_risk_score (Integer), last_run_risk_tier (String),
established_findings (JSONB, default []), divergence_state (JSONB, nullable),
material_check_incomplete (Boolean, default false).
Decision context:
- Latency: temperature=0 changes decoding, not call cost, so there is no measurable p50/p95 change. The ratchet adds one indexed SELECT on the baseline identity per upsert_baseline (already inside the same tenant session): sub-millisecond, off the request hot path (runs in the Temporal activity).
- Dependency surface: zero new packages. Reuses pydantic-ai ModelSettings, SQLAlchemy async, the existing MonitoringAlertService and maker_checker modules.
- Debuggability: a downgrade attempt leaves a divergence_state payload + a RISK_DIVERGENCE alert + (on approval) an immutable audit_events entry: a full forensic trail of every attempted and every approved downgrade, at 3am, from SQL alone. The last_run_* columns show the raw run that diverged from the held floor.
- Reversibility: Component A is a one-line-per-agent config flip. Component B is additive columns + a pure function wired into one write path; the migration downgrade() drops the 5 columns. No data is destroyed by rollback (the effective value was always in latest_*).
- Blast radius: additive. Every existing reader of latest_risk_score/tier keeps working unchanged and simply gains the ratchet guarantee. The only behavioural change is that a re-screen can no longer lower a persisted baseline without audit.
- Alternative considered: score-floor-only (no findings persistence). Rejected because the actual miss is a dropped finding, not just a lower number; a coarse floor would hold the tier but lose the specific EPPO finding an officer must see (Component C addresses this).
Consequences
Positive
- The CRITICAL→medium flip on identical inputs becomes structurally impossible: a re-screen can only surface same-or-higher risk, or raise an auditable divergence for a four-eyes human downgrade (ADR-0070).
- Scoring is deterministic (temp 0): identical evidence → identical score/tier, so calibration regressions are reproducible instead of intermittent.
- One source of truth for "the entity's risk" (latest_* = effective): no reader migration, no second established_* column to keep in sync.
- Every downgrade is attributable (maker + checker + reason in immutable audit_events), satisfying AMLR 5-yr audit-trail and EU AI Act Art. 12 logging obligations.
Negative
- Over-scrutiny bias. A held floor means a genuinely improved entity keeps its higher tier until an officer clears the divergence — deliberate (over-scrutiny is safe; silent under-scrutiny is the regulatory failure) but it adds officer workload (a RISK_DIVERGENCE task per downward re-screen).
- Downgrade friction. Lowering risk now requires two distinct approvers; a single officer can no longer correct a stale-high baseline alone.
- Extra columns + write complexity. Five new columns and a read-before-write in the upsert path; upsert_baseline is no longer a pure blind insert.
- temperature=0 is not bit-level determinism across provider infra/model-version changes: it removes sampling, not every source of drift. Documented as a residual, not eliminated.
Neutral
- dashboard_agent is intentionally excluded (interactive chat, no persisted/regulator-facing verdict) and stays at 0.7.
- Components C/D/E are recorded here as the committed design but land in later tasks of the same plan; A and B land first.
Alternatives Considered
Alternative 1: Do nothing (accept the divergence as retrieval noise)
- Treat the CRITICAL↔medium flip as an inherent property of non-deterministic web retrieval and document it as a known limitation.
- Why rejected: two of the three root causes (LLM sampling, blind baseline overwrite) are not retrieval noise — they are fixable engine defects. Leaving a lower-recall run to silently overwrite a CRITICAL baseline is a direct, repeatable violation of the never-suppress principle and an AML false-negative.
Alternative 2: Make retrieval deterministic (cache/pin the web corpus)
- Snapshot every source and replay it so re-screens are byte-identical.
- Why rejected: impossible in practice and wrong for perpetual KYC — the whole point of a re-screen is to observe new real-world signals. A frozen corpus would defeat monitoring. The honest fix is persistence + fail-closed, not fake determinism.
Alternative 3: Score-floor only (ratchet the number, drop finding persistence)
- Hold latest_risk_score at its max but re-derive findings fresh each run.
- Why rejected: the floor holds the tier but the specific material finding (the EPPO criminal investigation) still vanishes from the re-screen's finding list, so an officer reading the latest run sees a medium-looking finding set behind a critical number, which is an internal contradiction. Component C (persist + re-inject the finding itself) is required for a coherent, regulator-defensible record.