
ADR-0083: Event-trigger taxonomy, routing, and the periodic-review engine

Date: 2026-07-03
Status: Accepted
Deciders: Adrian (Soft4U), Claude Fable 5 (AMLA remediation design session)

Decision context:

  • Latency: the trigger router adds one classification pass + one monitoring_alerts INSERT per detection inside the already-scheduled ContinuousMonitoringWorkflow run — no new hot path. The sweep_due_reviews activity is one indexed query on entity_baselines.next_review_due per tenant run. Review-case spawn reuses the existing case-creation path (seconds, async, off any user request). Not measured because the monitoring workflow is batch/scheduled, not request-serving.
  • Dependency surface: no new packages. Two new services (trigger_router_service.py, periodic_review_service.py), two new enums in packages/trustrelay-models/src/trustrelay_models/monitoring.py, one canonical cadence module in the same file, three new columns on cases (origin, review_of_case_id, trigger_alert_id) + writer columns on monitoring_alerts (one Alembic revision, next sequential number at implementation time), two new activities on the existing per-tenant Temporal schedule.
  • Debuggability: every routing decision persists a documented-reasoning record on the alert row plus detected_at/routed_at/review_opened_at timestamps (the "without undue delay" evidence). A spawned review case carries origin and review_of_case_id, so lineage is a single column read. All transitions land in immutable audit_events (ADR-0064).
  • Reversibility: additive — new enum members, new columns (nullable / defaulted), new activities appended to an existing workflow. Reverting removes writers; the monitoring_alerts read surfaces degrade back to today's honest empty state. The cadence unification (§2.2 of the architecture doc) is a refactor with value-preserving re-export shims; reverting restores the local tables. No workflow.patched() guards (zero customers).
  • Blast radius: ContinuousMonitoringWorkflow gains post-check activities; ComplianceCaseWorkflow gains three start parameters but no new states; decision_memorandum_service loses its private cadence table in favour of the canonical one (the memo's promised review date changes for HIGH/MEDIUM tiers — deliberately, that is the bug). Onboarding cases are untouched.
  • Alternatives considered: extend MonitoringCheckType instead of a new trigger enum; resume the original Temporal workflow instead of spawning a review case; one Temporal cron per relationship. All rejected below.

Context

AMLA's draft guidelines on ongoing monitoring (public hearing 2026-07-02; consultation to 2026-09-03; final Q4 2026; NCA comply-or-explain Q1 2027) elaborate AMLR (EU) 2024/1624 Art. 26: periodic reviews at risk-based cadence within the Art. 26(2) ceilings, event-driven reviews on the Art. 26(3) trigger events "without undue delay", and an explicit rejection of "passive confirmation". The evidence-audited gap register (docs/research/2026-07-03-amla-ongoing-monitoring-gap-analysis.md, §3) verified, with file:line anchors, that Trust Relay has a production-grade detection spine (ADR-0066) but no response layer:

  1. Triggers exist only as a config constant. MANDATORY_EVENT_TRIGGERS (backend/app/services/monitoring_schedule_service.py:29-36) names five Art. 26(3) triggers — sanctions_list_update, ownership_change_above_25pct, pep_status_change, jurisdiction_change, adverse_media_critical — but the only code that reads it is validate_monitoring_config (:72-110), which stops tenants disabling them. Nothing detects them: there is no ownership diff, PEP re-screens always pass previous_state=None so new-vs-reconfirmed is indistinguishable (monitoring_check_service.py:506-643 — every MonitoringEvent constructed in check_ubo_screening hardcodes previous_state=None), there is no jurisdiction-change comparison, and check_company_status is a self-inflicted stub returning an indeterminate WARNING even for BE, where registry clients exist (monitoring_check_service.py:445-473).
  2. Detections die in a list. The material-change diff is real and fail-closed (monitoring_check_service.py:766-948: taxonomy-distinct entity-status/sanctions/PEP/adverse-media classification, indeterminate ⇒ WARNING never benign), but its output is a MonitoringEvent row with a bare boolean acknowledge. The monitoring_alerts table (db/models.py:3102) has zero writers; nothing reopens a case, routes to SAR, or notifies anyone. A CRITICAL sanctions hit on a monitored UBO becomes an unread list row.
  3. next_review_due is stored but inert. EntityBaseline.next_review_due is computed on every workflow completion (compliance_case.py:1053 area — upsert_entity_baseline fires before requirements review; entity_baseline_service.py:74-124) and ceiling-enforced (compute_next_review_date, entity_baseline_service.py:24-45), but nothing fires when it elapses. Its only consumers are an ORDER BY and a static memo field. There is no periodic full KYC review at all — only re-screening. This is exactly the passive confirmation AMLA rejects, and KBC gap #8 ("perpetual KYC").
  4. Three conflicting cadence tables (gap doc §3.2.4): decision_memorandum_service._NEXT_REVIEW_MONTHS_BY_TIER (:49, 6/12/36 months keyed HIGH/MEDIUM/LOW), entity_baseline_service._DEFAULT_CADENCE (:21, 12/24/36 months keyed EDD/CDD/SDD), and risk_matrix._CADENCE_DAYS_BY_TIER (packages/trustrelay-models/src/trustrelay_models/risk_matrix.py:22, 90/180/365 days re-screen). The memo promises the customer a review date computed from a different vocabulary and different values than what the (inert) baseline stores — the same dual-representation defect class as the risk-display bug fixed in PR #176.

This ADR is the "pKYC trigger model" the world-class roadmap's M4 sketch anticipated. It covers architecture doc (docs/superpowers/specs/2026-07-03-amla-remediation-architecture.md) §2.1 (trigger/response enums), §2.2 (canonical cadence), and wave W1; the cadence unification itself ships in W0 as an integrity fix.

Decision

1. A trigger taxonomy separate from the check taxonomy

Add MonitoringTriggerType and TriggerResponse to packages/trustrelay-models/src/trustrelay_models/monitoring.py, verbatim per architecture doc §2.1. MonitoringTriggerType has eleven members; the first five string values MUST stay byte-identical to MANDATORY_EVENT_TRIGGERS (monitoring_schedule_service.py:29-36) so existing tenant event_triggers config keys keep validating unchanged. The remaining six (company_status_change, document_expired, profile_deviation, verification_stale, review_due, cdd_nonresponse) name triggers this and later waves (W3–W5) detect. TriggerResponse is the three-way routing outcome: full_kyc_refresh (new review case, full 12-step loop including customer documents), targeted_update (new review case, investigation-only scope), record_only (alert row only).

Checks (MonitoringCheckType, i.e. how we look: ubo_screening, material_change, …, trustrelay_models/monitoring.py:20-31) and triggers (what happened in Art. 26(3) vocabulary) are distinct axes: one check run can yield several triggers (a material-change diff can surface both company_status_change and adverse_media_critical), and one trigger can be fed by several checks. See Alternative 1 for why they are not merged.
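As an illustration of the two enums, here is a minimal sketch built only from the string values named in this ADR. The member names, the `str, Enum` base, the tuple form of the constant, and the `missing_mandatory_triggers` helper are assumptions; the authoritative definitions are in architecture doc §2.1.

```python
from enum import Enum

# Copy of the five mandatory Art. 26(3) trigger names listed in the Context section.
# Assumption: the real constant may be a list, set, or tuple; a tuple is used here.
MANDATORY_EVENT_TRIGGERS = (
    "sanctions_list_update",
    "ownership_change_above_25pct",
    "pep_status_change",
    "jurisdiction_change",
    "adverse_media_critical",
)


class MonitoringTriggerType(str, Enum):
    # First five values: byte-identical to MANDATORY_EVENT_TRIGGERS.
    SANCTIONS_LIST_UPDATE = "sanctions_list_update"
    OWNERSHIP_CHANGE_ABOVE_25PCT = "ownership_change_above_25pct"
    PEP_STATUS_CHANGE = "pep_status_change"
    JURISDICTION_CHANGE = "jurisdiction_change"
    ADVERSE_MEDIA_CRITICAL = "adverse_media_critical"
    # Remaining six: detected by this and later waves.
    COMPANY_STATUS_CHANGE = "company_status_change"
    DOCUMENT_EXPIRED = "document_expired"
    PROFILE_DEVIATION = "profile_deviation"
    VERIFICATION_STALE = "verification_stale"
    REVIEW_DUE = "review_due"
    CDD_NONRESPONSE = "cdd_nonresponse"


class TriggerResponse(str, Enum):
    FULL_KYC_REFRESH = "full_kyc_refresh"  # new review case, full loop
    TARGETED_UPDATE = "targeted_update"    # new review case, investigation-only
    RECORD_ONLY = "record_only"            # alert row only


def missing_mandatory_triggers() -> list[str]:
    """Hypothetical guard: mandatory config keys that no enum value matches."""
    values = {member.value for member in MonitoringTriggerType}
    return [name for name in MANDATORY_EVENT_TRIGGERS if name not in values]
```

A unit test asserting that `missing_mandatory_triggers()` returns an empty list is one way to keep the byte-compatibility contract from drifting silently.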

2. One canonical cadence table (kills gap doc §3.2.4)

packages/trustrelay-models/src/trustrelay_models/monitoring.py becomes the single home:

REVIEW_CADENCE_MONTHS_BY_TIER = {"EDD": 12, "CDD": 24, "SDD": 36} # periodic full review
RESCREEN_CADENCE_DAYS_BY_TIER = {"EDD": 90, "CDD": 180, "SDD": 365} # moved from risk_matrix.py (re-export shim kept)
RISK_LEVEL_TO_TIER = {"CRITICAL": "EDD", "HIGH": "EDD", "MEDIUM": "CDD", "LOW": "SDD"}

Consumers converge (delete local tables): entity_baseline_service._DEFAULT_CADENCE (values already agree — becomes an import); decision_memorandum_service._NEXT_REVIEW_MONTHS_BY_TIER is deleted — the memo reads EntityBaseline.next_review_due for the case's entity, falling back to computing from the canonical table via RISK_LEVEL_TO_TIER; risk_matrix._CADENCE_DAYS_BY_TIER keeps its values but moves home, with a re-export shim in risk_matrix.py so cadence_days_for_tier callers are untouched. The AMLR Art. 26(2) ceilings (AMLR_MAX_CADENCE, monitoring_schedule_service.py:27, duplicated as _AMLR_MAX_MONTHS in entity_baseline_service.py:18) are unchanged and remain enforced in compute_next_review_date and validate_monitoring_config; the duplicate constant also converges on one import. Ships in W0.
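The memo's fallback path (risk level → tier → review months) can be sketched as below. The two tables are copied from this section; `add_months`, `next_review_from_risk_level`, and the choice to treat an unknown risk level as the strictest tier are assumptions for illustration. The Art. 26(2) ceiling enforcement in compute_next_review_date is not modelled.

```python
import calendar
from datetime import date

# Copied from the canonical tables above.
REVIEW_CADENCE_MONTHS_BY_TIER = {"EDD": 12, "CDD": 24, "SDD": 36}
RISK_LEVEL_TO_TIER = {"CRITICAL": "EDD", "HIGH": "EDD", "MEDIUM": "CDD", "LOW": "SDD"}


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_review_from_risk_level(risk_level: str, completed_on: date) -> date:
    """Hypothetical fallback used when no EntityBaseline.next_review_due exists."""
    # Assumption: an unrecognised risk level falls back to the strictest tier.
    tier = RISK_LEVEL_TO_TIER.get(risk_level.upper(), "EDD")
    return add_months(completed_on, REVIEW_CADENCE_MONTHS_BY_TIER[tier])
```

With these tables a HIGH-risk case completed on 2026-07-03 gets a review date of 2027-07-03 (12 months under EDD), where the deleted memo table would have promised 6 months.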

3. Detection: turn the five mandatory triggers into code (W1)

  • Ownership diff — re-run the deterministic UBO engine against current registry data and compare with EntityBaseline.latest_investigation_result: a ≥25-percentage-point path-sum change for any beneficial owner, or a change in the UBO set, emits ownership_change_above_25pct.
  • PEP new-vs-reconfirmed — derive previous_state from the append-only screening_results history instead of the hardcoded None (monitoring_check_service.py:506/520/630): a PEP hit for a person whose prior complete screen was clean emits pep_status_change; a reconfirmed hit does not re-trigger (it is already dispositioned evidence, ADR-0063).
  • Jurisdiction change — compare the entity's current country/geography signals against FATF / EU high-risk-third-country reference data; a movement into (or listing change of) a high-risk jurisdiction emits jurisdiction_change.
  • Company status — wire check_company_status to the existing registry clients (app/services/registries), BE first, killing the self-inflicted stub at monitoring_check_service.py:445-473. Other countries remain honest capability-gaps per ADR-0068 (indeterminate WARNING, never a benign "no change" — the ADR-0067 contract the stub already honours). A transition into a terminal status emits company_status_change at CRITICAL (reusing is_terminal_status, ADR-0065).
  • sanctions_list_update and adverse_media_critical are fed by the existing check_ubo_screening re-screen and detect_material_changes diff (monitoring_check_service.py:479-643, 766-948) — detection exists; only the routing below is new.
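Two of the detectors above reduce to small comparisons, sketched here under stated assumptions. The `ScreeningRecord` shape, the function names, and the oldest-first ordering of the history are hypothetical; treating a hit with no prior complete screen as new is also an assumption, chosen to err on the side of triggering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScreeningRecord:
    """Hypothetical shape of one row in the append-only screening history."""
    person_id: str
    complete: bool  # False when the screen errored or was partial
    pep_hit: bool


def ownership_trigger(
    baseline_pct: dict[str, float],
    current_pct: dict[str, float],
    threshold_pp: float = 25.0,
) -> Optional[str]:
    """Compare path-sum ownership percentages per beneficial owner."""
    # Any change in the UBO set triggers on its own.
    if set(baseline_pct) != set(current_pct):
        return "ownership_change_above_25pct"
    # Otherwise look for a change of at least threshold_pp percentage points.
    for owner, before in baseline_pct.items():
        if abs(current_pct[owner] - before) >= threshold_pp:
            return "ownership_change_above_25pct"
    return None


def pep_trigger(history: list[ScreeningRecord], latest: ScreeningRecord) -> Optional[str]:
    """Emit pep_status_change only for a new hit, not a reconfirmed one."""
    if not latest.pep_hit:
        return None
    # Assumption: history is ordered oldest-first; only complete screens count.
    prior = [r for r in history if r.person_id == latest.person_id and r.complete]
    if prior and prior[-1].pep_hit:
        return None  # reconfirmed hit, already dispositioned evidence
    return "pep_status_change"
```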

4. Routing: trigger_router_service.py (W1)

Every detection flows detection-diff → MonitoringTriggerType → TriggerResponse. The default mapping is severity-based and tenant-overridable upward only: mandatory triggers stay non-disableable (enforced by the existing validate_monitoring_config fail-closed check, monitoring_schedule_service.py:89-103), and the fail-closed floor cannot be lowered. The router writes a monitoring_alerts row (the first-ever writer for the table, so the dashboard badge stops polling a permanently empty table) carrying trigger_type, response_required, source_event_id, a documented-reasoning record for the routing decision, and the detected_at/routed_at timestamps; review_opened_at is stamped when a case spawns. For full_kyc_refresh / targeted_update on an EDD-tier relationship the router auto-opens a review case; on CDD/SDD it stops at the alert queue for officer disposition (settled in architecture doc §5.5: EDD auto-open is a fail-closed floor and cannot be disabled). record_only writes the alert row and nothing else. Alert disposition (triage/close/SAR link) is ADR-0084 / W2; this ADR only guarantees no detection dies unrouted.
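The routing rules in this section can be sketched as a pure function. The severity-to-response table, the response ranking, the INFO level, the strictest-response fallback for an unknown severity, and every name below are assumptions; the ADR states only that the default is severity-based, that overrides go upward only, and that EDD auto-opens.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumed ordering used to enforce "upward only" tenant overrides.
_RESPONSE_RANK = {"record_only": 0, "targeted_update": 1, "full_kyc_refresh": 2}

# Hypothetical default table; the real severity mapping is not given in this ADR.
_DEFAULT_BY_SEVERITY = {
    "INFO": "record_only",
    "WARNING": "targeted_update",
    "CRITICAL": "full_kyc_refresh",
}


@dataclass(frozen=True)
class RoutedAlert:
    trigger_type: Optional[str]
    response_required: str
    reasoning: str
    detected_at: datetime
    routed_at: datetime
    auto_open_review: bool


def route(
    trigger_type: Optional[str],
    severity: str,
    tier: str,
    detected_at: datetime,
    tenant_override: Optional[str] = None,
    now: Optional[datetime] = None,
) -> RoutedAlert:
    routed_at = now or datetime.now(timezone.utc)
    if trigger_type is None:
        # Unmapped detection: still produce an alert row, never nothing.
        reasoning = "unmapped detection: fail-closed to record_only"
        return RoutedAlert(None, "record_only", reasoning, detected_at, routed_at, False)

    # Assumption: an unknown severity falls back to the strictest response.
    default = _DEFAULT_BY_SEVERITY.get(severity, "full_kyc_refresh")
    response = default
    reasoning = f"default for severity {severity}: {default}"
    if tenant_override is not None:
        if _RESPONSE_RANK[tenant_override] > _RESPONSE_RANK[default]:
            response = tenant_override
            reasoning += f"; tenant override raised to {tenant_override}"
        else:
            reasoning += f"; tenant override {tenant_override} ignored (not upward)"

    # EDD relationships auto-open a review case; CDD/SDD stop at the alert queue.
    auto_open = tier == "EDD" and response != "record_only"
    return RoutedAlert(trigger_type, response, reasoning, detected_at, routed_at, auto_open)
```

Keeping the function pure (no database writes) would let the routing table be tested exhaustively, with the monitoring_alerts INSERT and audit event handled by the caller.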

5. Periodic reviews: periodic_review_service.py + sweep_due_reviews (W1)

A new sweep_due_reviews activity is appended to ContinuousMonitoringWorkflow (continuous_monitoring.py:49-161) after the per-case checks — same per-tenant Temporal schedule, no new schedules. It selects baselines with next_review_due < now and an ACTIVE relationship, emits a review_due alert per due entity, and (per §5.5) auto-opens the review case for EDD, alert-first for CDD/SDD. GET /api/monitoring/reviews-due (extending the existing baselines endpoint, ordered by next_review_due) plus a dashboard ReviewsDueQueue widget (the API client at frontend/src/lib/api.ts:2407 is currently unconsumed) make the queue visible.
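The sweep predicate and the tier split can be illustrated over in-memory rows. The `Baseline` fields `entity_id`, `tier`, and `relationship_status` and the returned dict shape are hypothetical; a real activity would run the indexed query per tenant, and de-duplication against already-open review alerts is not modelled here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Baseline:
    """Hypothetical in-memory stand-in for an entity_baselines row."""
    entity_id: str
    tier: str                 # EDD | CDD | SDD
    relationship_status: str  # only ACTIVE relationships are swept
    next_review_due: datetime


def sweep_due_reviews(baselines: list[Baseline], now: datetime) -> list[dict]:
    """Return one review_due action per due, active baseline."""
    actions = []
    for b in baselines:
        if b.relationship_status != "ACTIVE":
            continue
        if not b.next_review_due < now:
            continue  # not yet due (strict comparison, as in the predicate above)
        actions.append(
            {
                "entity_id": b.entity_id,
                "trigger_type": "review_due",
                # EDD auto-opens the review case; CDD/SDD stay alert-first.
                "auto_open_review": b.tier == "EDD",
            }
        )
    return actions
```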

6. Review cases are new ComplianceCaseWorkflow instances

A review — periodic or trigger-driven — starts a fresh ComplianceCaseWorkflow with three new case columns (Alembic, W1): origin (onboarding|periodic_review|trigger, default onboarding), review_of_case_id (lineage to the approved case), trigger_alert_id (back-ref to the routing alert). Scope follows the response: full_kyc_refresh runs the whole 12-step loop including the portal/document phase; targeted_update runs investigation-only and skips AWAITING_DOCUMENTS unless gap analysis demands documents. The loop closes itself: upsert_entity_baseline already recomputes next_review_due on workflow completion (compliance_case.py:1053), so a completed review re-arms the next cycle with zero extra machinery. POST /api/cases/{id}/review lets an officer open one manually. Settled question (architecture doc §5.2, do not re-litigate): review = new case instance, not a resumed workflow — see Alternative 2.
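A rough sketch of how the three lineage columns and the scope rule might be assembled when a review case is spawned. The `ReviewCaseStart` type, the `build_review_start` helper, and the rule that a review_due trigger maps to origin periodic_review are assumptions; in practice the gap-analysis outcome is only known while the workflow runs, so the boolean parameter here is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewCaseStart:
    """Hypothetical start parameters for a review ComplianceCaseWorkflow."""
    origin: str              # onboarding | periodic_review | trigger
    review_of_case_id: str   # lineage to the approved case
    trigger_alert_id: str    # back-reference to the routing alert
    skip_document_phase: bool


def build_review_start(
    response: str,
    approved_case_id: str,
    alert_id: str,
    trigger_type: str,
    gap_analysis_needs_documents: bool = False,
) -> ReviewCaseStart:
    if response == "record_only":
        # record_only writes the alert row and never spawns a case.
        raise ValueError("record_only does not open a review case")
    # Assumption: the periodic sweep is identified by the review_due trigger.
    origin = "periodic_review" if trigger_type == "review_due" else "trigger"
    # targeted_update skips AWAITING_DOCUMENTS unless gap analysis demands documents.
    skip = response == "targeted_update" and not gap_analysis_needs_documents
    return ReviewCaseStart(origin, approved_case_id, alert_id, skip)
```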

7. Non-continuous relationship classification (W1)

entity_baselines gains relationship_activity_class (continuous|non_continuous, default continuous) plus activity_class_factors JSONB documenting the risk-factor basis (the AMLA draft guidelines permit lighter review for genuinely non-continuous relationships only with a documented classification). The class affects review depth, never the Art. 26(2) ceilings.

Cross-cutting: every monitoring_alerts / cases / entity_baselines INSERT sets tenant_id explicitly (RLS WITH CHECK, the PR #177 lesson); every routing decision and case spawn writes an immutable audit_events row (ADR-0064); asyncpg JSONB params use CAST(:p AS jsonb); the OB Holding fixture is the standard trigger-test subject — a synthetic sanctions-list delta on its UBO must produce alert → route → review case end-to-end under testcontainers + Temporal time-skipping.

Consequences

Positive

  • Art. 26(3) triggers become detection code with typed routing, and Art. 26(2) periodic reviews actually fire — the two Guideline-1 blockers ("passive confirmation", "triggers as config constants") close, and KBC gap #8 (perpetual KYC) gets its engine.
  • monitoring_alerts gains its first writer, un-orphaning the existing read surfaces and dashboard badge (gap doc §3.2.2).
  • One cadence vocabulary ends the memo-vs-baseline contradiction (§3.2.4): the customer-facing review date and the machine-enforced review date are computed from the same table.
  • Full workflow reuse: review cases inherit every existing gate (four-eyes ADR-0070, SAR-first ADR-0071, UBO discrepancy ADR-0059) and the baseline self-re-arms at compliance_case.py:1053 with no new state machinery.
  • detected_at/routed_at/review_opened_at gives auditable "without undue delay" evidence per trigger.

Negative

  • Two taxonomies now coexist (MonitoringCheckType and MonitoringTriggerType) with a many-to-many mapping maintained in trigger_router_service; a new check added without a router mapping is a silent-drop hazard — mitigated by a fail-closed router default (unmapped detection ⇒ record_only alert + WARNING, never nothing) and a test_no_false_reassurance.py invariant, but the hazard is structural.
  • EDD auto-open creates officer workload without officer intent: a noisy detector (e.g. adverse-media churn) can spawn review cases in bulk. The floor is deliberately non-disableable, so the only relief valves are detector precision and CDD/SDD alert-first routing — tenants with many EDD relationships will feel this first.
  • The memo's promised next-review date changes for existing tenants (HIGH 6→12 months under EDD, MEDIUM 12→24 under CDD): previously issued memos state dates the unified engine will not honour. Accepted: the old values were never enforced by anything, and the new ones are the ones the baseline actually stores; W6's framework record documents the calibration basis.
  • Byte-identical enum values freeze the five original trigger names (including the unwieldy ownership_change_above_25pct) into the canonical vocabulary forever, because tenant config keys reference them.
  • Review cases are real cases rows: entity history multiplies rows per relationship (dashboard filters must default to origin='onboarding'-plus-open-reviews or officers drown in lineage).

Neutral

  • CDD/SDD review depth and the alert-disposition lifecycle are deliberately deferred to ADR-0084/W2; until W2 ships, routed alerts sit in the queue with only the pre-existing acknowledge (made rationale-bearing and audited in W0).
  • Tenant override of routing is upward-only; there is no per-tenant "less monitoring" dial by design.
  • The cdd_nonresponse, document_expired, profile_deviation, verification_stale members are declared now (one shared vocabulary, architecture doc §1.3) but gain detectors only in W3–W5.

Alternatives Considered

Alternative 1: extend MonitoringCheckType instead of a separate trigger enum

Add the Art. 26(3) event names as new check types and route on check type + severity. Rejected: checks and triggers are different axes with a many-to-many relationship — detect_material_changes (one check) already surfaces status, sanctions, PEP and adverse-media changes in a single event (monitoring_check_service.py:884-948), and pep_status_change is fed by both the re-screen and the material diff. Folding triggers into MonitoringCheckType would force either duplicate check runs per trigger or severity-string parsing to recover the trigger, and would break the byte-compat contract with MANDATORY_EVENT_TRIGGERS config keys, which name events, not check mechanics. It would also make TriggerResponse routing conditional on check internals rather than a typed input.

Alternative 2: resume the original Temporal workflow instead of spawning a review case

Keep the onboarding ComplianceCaseWorkflow alive after approval (or signal-revive it) and loop it back to AWAITING_DOCUMENTS on a trigger. Rejected: the workflow completes at terminal status by design and the decision endpoint 409s outside REVIEW_PENDING (gap doc §3.1 "an APPROVED relationship is frozen forever" — that terminality is load-bearing for the audit story); keeping workflows open for the multi-year life of a relationship means unbounded Temporal histories, continue-as-new plumbing, and versioning exposure across every future workflow change, precisely what ADR-0070 avoided by keeping gates out of workflow history. A fresh instance re-runs the current 12-step pipeline (current detectors, current gates, current calibration) rather than replaying a years-old workflow definition, and lineage is two columns. This is settled as architecture doc §5.1–5.2.

Alternative 3: one Temporal cron/schedule per relationship

Give every approved relationship its own schedule firing at its next_review_due. Rejected: the schedule count becomes O(relationships) per tenant (Temporal schedules are a control-plane resource, not a data-plane row), every cadence recalculation (risk-tier change, config change) becomes a schedule mutation with drift risk between DB truth and Temporal state, and the existing architecture already has exactly one idempotent per-tenant schedule (monitoring-{tenant_id}, monitoring_schedule_service.py:113-118) whose workflow can sweep a DB predicate (next_review_due < now) in one query. The sweep keeps the database as the single source of truth for due-ness.

Alternative 4: do nothing (ship detection improvements only)

Keep improving detectors without routing or periodic reviews. Not viable: the gap register's headline finding is that detection is already production-grade and the response layer is the blocker — AMLA's own effectiveness framing is "what happens after an alert is generated". Under comply-or-explain from Q1 2027, a tenant cannot demonstrate Art. 26(2)/(3) compliance with an events list and a boolean acknowledge, and the inert next_review_due would remain the exact "passive confirmation" the guidelines reject.