
ADR-0085: Post-approval relationship lifecycle — AMLR Art. 21 suspension/restriction/offboarding and the CDD-refresh outreach ladder

Date: 2026-07-03
Status: Proposed
Deciders: Adrian (Soft4U), Claude Fable 5 (AMLA remediation design session)

Decision context:

  • Latency: no change to the onboarding hot path — the decision endpoint and the Temporal onboarding workflow are untouched. New cost is one indexed read of cases.relationship_status in the monitoring population query (activities.py:3504-3510) and, per lifecycle action, one transition insert + one audit_events insert + (for suspend/offboard) the existing maker-checker round-trip. Timer/reminder sweeps ride the already-scheduled ContinuousMonitoringWorkflow — no new Temporal schedules.
  • Dependency surface: no new packages. One enum in trustrelay-models, two new RLS tables (one Alembic migration, next sequential revision at implementation time), two new service files, endpoint wiring, two new Permission members. Reuses MakerCheckerService (ADR-0070), the portal-token mint pattern (case_crud.py:61, :1373), the SLA policy windows (sla_service.py), and customer_contact_gate (ADR-0071).
  • Debuggability: every transition is one relationship_transitions row (from/to, Art. 21 safeguards JSONB, rationale, maker/checker, review_due_at) plus one immutable audit_events row (ADR-0064) — a regulator can replay the relationship's entire post-approval history from two append-only spines. CDD-refresh state (reminders sent, deadline, fulfilment) is a first-class row, not log archaeology.
  • Reversibility: additive — new column, new tables, new endpoints. Rolling back leaves relationship_status NULL-tolerated everywhere (NULL on a terminal-approved case reads as ACTIVE for population purposes during backfill); no CaseStatus member is added or removed, so no workflow-history or dashboard state-machine regression. Not measured as a live rollback because nothing ships behind this ADR yet (Proposed, wave W3).
  • Blast radius: the onboarding state machine is untouched by design (§5.1 of the architecture doc). Surfaces that change: monitoring population filter (must exclude OFFBOARDED), dashboard case list/detail (new relationship badge + actions), portal (post-approval fulfilment path), SLA overdue predicate (must NOT treat SUSPENDED as clock-stopping). A bug here cannot corrupt an in-flight onboarding workflow; the worst case is a wrong relationship badge or a missed sweep, both recoverable from the transitions table.
  • Alternatives considered: extend CaseStatus with post-approval states; a per-relationship long-running Temporal workflow; offboard-only without intermediary states; do nothing (see Alternatives Considered).

Context

AMLR (EU) 2024/1624 Article 21 obliges an institution that cannot comply with CDD to refuse or terminate the business relationship; the AMLA draft ongoing-monitoring guidelines (2026, public hearing 2026-07-02) elaborate a temporary suspension/restriction step before termination, guarded by a documented safeguard assessment (risk level, mitigation effectiveness, sufficiency of the customer file) and an explicit temporariness requirement — a suspension must carry a review date and may not become a silent parking state. AMLR Art. 26 makes the relationship a living object after approval; AMLD Art. 39 forbids any customer-facing surface from revealing that suspicion is the reason for an outreach or a service limitation.

Trust Relay today is structurally incapable of any of this. The gap analysis (docs/research/2026-07-03-amla-ongoing-monitoring-gap-analysis.md §3.1 "Suspension/offboarding", evidence audited to file:line) found:

  • An approved relationship is frozen forever. CaseStatus.terminal_statuses() (packages/trustrelay-models/src/trustrelay_models/case.py:31) deliberately includes APPROVED and APPROVED_WITH_RESTRICTIONS; the Temporal onboarding workflow completes at that point and the decision endpoint 409-rejects any decision on a case not in REVIEW_PENDING (backend/app/api/case_decisions.py:75-91). There is no state for suspended, restricted, or offboarded, and no endpoint that could set one.
  • The only unresponsiveness handling is an onboarding-time 60-day cliff. If the customer never submits documents, compliance_case.py:1316-1324 times out the wait_condition and hard-fails the case to FAILED — there is no reminder, no escalation ladder, and nothing at all post-approval.
  • The reminder knob is dead. settings.reminder_interval_days = 7 (backend/app/config.py:319) has zero consumers (gap doc §3.2.5).
  • Post-approval outreach has no channel. Portal tokens expire (backend/app/api/portal.py:267-302 returns 410 Gone past portal_token_expires_at); the mint-with-TTL pattern exists (case_crud.py:61 _generate_portal_token, :1373 portal_token_expires_at = now + settings.portal_token_ttl_days) but nothing re-issues a token after approval, so an approved customer cannot be asked for anything.
  • The SLA clock stops exactly where AMLA says it must not. is_overdue (backend/app/services/sla_service.py:108-127) returns False for any terminal CaseStatus — correct for onboarding, but it means any timer attached to an approved case is inert. A suspension parked on an APPROVED case would never age.
  • The restrictions catalogue exists but has no lifecycle. MerchantRestrictions (case.py:79-92: blocked_mcc[], max_ticket_eur, max_monthly_volume_eur, requires_secondary_review, mandatory restriction_reason, evidence_refs) is captured at decision time and then never reviewed, never expires, and has no relationship-state counterpart — the unreviewed half of KBC gap #4 (conditional approval, memory/roadmap-acquiring-gaps.md).

The canonical vocabulary and wave plan for the fix are fixed in docs/superpowers/specs/2026-07-03-amla-remediation-architecture.md (§2.1, §2.3, §2.4, §2.5, wave W3). §5.1 of that document settles, and this ADR does not re-litigate, that the relationship lives on the cases row — the approved case is the relationship record, and the monitoring population already selects approved cases (backend/app/workflows/activities.py:3504-3510, Case.status.in_(["APPROVED", "APPROVED_WITH_RESTRICTIONS"])).

Decision

Implement the post-approval relationship lifecycle as architecture-doc wave W3, using the §2 vocabulary verbatim.

1. RelationshipStatus enum on the case row (not a new entity). Add to packages/trustrelay-models/src/trustrelay_models/case.py exactly as §2.1 defines it: ACTIVE / UNDER_REVIEW / SUSPENDED / RESTRICTED / OFFBOARDED. Persist as cases.relationship_status (nullable enum-string; Alembic, next sequential revision — no hardcoded revision id). The workflow's decision handler sets ACTIVE when a case enters APPROVED or APPROVED_WITH_RESTRICTIONS (RESTRICTED when a typed restrictions payload is active); CaseStatus itself is not extended — the onboarding machine stays terminal by design. NULL (pre-backfill approved cases) is treated as ACTIVE by readers so the fix never silently drops an existing relationship from monitoring.
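
A minimal sketch of the enum and the NULL-reads-as-ACTIVE rule described above. The member names come from this ADR; the helper names effective_relationship_status and initial_status_on_approval are hypothetical and only illustrate the reader and decision-handler behaviour, not the actual code in case.py.

```python
from enum import Enum
from typing import Optional


class RelationshipStatus(str, Enum):
    """Post-approval relationship states named in this ADR (section 2.1)."""

    ACTIVE = "ACTIVE"
    UNDER_REVIEW = "UNDER_REVIEW"
    SUSPENDED = "SUSPENDED"
    RESTRICTED = "RESTRICTED"
    OFFBOARDED = "OFFBOARDED"


def effective_relationship_status(raw: Optional[str]) -> RelationshipStatus:
    """Hypothetical reader helper: a NULL column value (pre-backfill
    approved case) is read as ACTIVE so the case stays in monitoring."""
    if raw is None:
        return RelationshipStatus.ACTIVE
    return RelationshipStatus(raw)


def initial_status_on_approval(has_active_restrictions: bool) -> RelationshipStatus:
    """Hypothetical decision-handler helper: RESTRICTED when a typed
    restrictions payload is active at approval time, otherwise ACTIVE."""
    if has_active_restrictions:
        return RelationshipStatus.RESTRICTED
    return RelationshipStatus.ACTIVE
```

Routing every reader through one helper of this kind keeps the NULL rule in a single place instead of repeating it in each query or view.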

2. relationship_transitions table (append-only, RLS). Per §2.3: id, tenant_id (set explicitly on every INSERT, never via server_default; the PR #177 lesson), case_id, from_status, to_status, reason (enum-string), safeguards JSONB carrying the Art. 21 safeguard checklist (risk_level, mitigation_effectiveness, file_sufficiency), rationale text NOT NULL, review_due_at (temporariness), maker_id, checker_id, created_at. One row per suspend/restrict/reinstate/offboard; every transition additionally emits an immutable audit_events row (ADR-0064). A suspend/restrict request whose safeguards JSONB is missing any of the three checklist keys is rejected with 422: the safeguard assessment is the point, not decoration. Any JSONB bind parameters use CAST(:param AS jsonb), never ::jsonb.
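
The safeguard check and the insert parameters could look roughly like the sketch below. The three checklist keys and the column names come from this ADR; the function names, the choice to treat None or an empty string as missing, and the assumption that id and created_at are filled by column defaults are illustrative only. The SQL is shown as plain text, so the sketch does not depend on any database driver.

```python
import json
from typing import Any, Dict, List, Mapping, Optional

REQUIRED_SAFEGUARD_KEYS = ("risk_level", "mitigation_effectiveness", "file_sufficiency")

# JSONB bind parameter written as CAST(:param AS jsonb), as this ADR requires.
# Assumption: id and created_at are populated by column defaults.
INSERT_TRANSITION_SQL = """
INSERT INTO relationship_transitions
    (tenant_id, case_id, from_status, to_status, reason,
     safeguards, rationale, review_due_at, maker_id, checker_id)
VALUES
    (:tenant_id, :case_id, :from_status, :to_status, :reason,
     CAST(:safeguards AS jsonb), :rationale, :review_due_at, :maker_id, :checker_id)
"""


def missing_safeguard_keys(safeguards: Mapping[str, Any]) -> List[str]:
    """Return the checklist keys that are absent or blank.
    Treating None and '' as missing is an assumption of this sketch."""
    return [
        key
        for key in REQUIRED_SAFEGUARD_KEYS
        if key not in safeguards or safeguards[key] in (None, "")
    ]


def transition_params(
    tenant_id: str,
    case_id: str,
    from_status: str,
    to_status: str,
    reason: str,
    safeguards: Mapping[str, Any],
    rationale: str,
    review_due_at: Optional[str],
    maker_id: str,
    checker_id: Optional[str],
) -> Dict[str, Any]:
    """Validate and build bind parameters; a ValueError here is what the
    API layer would translate into a 422 (translation not shown)."""
    missing = missing_safeguard_keys(safeguards)
    if missing:
        raise ValueError("safeguards missing: " + ", ".join(missing))
    if not rationale.strip():
        raise ValueError("rationale is mandatory")
    return {
        "tenant_id": tenant_id,  # always passed explicitly, never defaulted
        "case_id": case_id,
        "from_status": from_status,
        "to_status": to_status,
        "reason": reason,
        "safeguards": json.dumps(dict(safeguards)),
        "rationale": rationale,
        "review_due_at": review_due_at,
        "maker_id": maker_id,
        "checker_id": checker_id,
    }
```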

3. relationship_lifecycle_service.py (new, backend/app/services/) owns the transition machine:

  • Legal transitions: ACTIVE|UNDER_REVIEW → SUSPENDED|RESTRICTED, SUSPENDED|RESTRICTED → ACTIVE (reinstate), any-non-terminal → OFFBOARDED. OFFBOARDED is terminal; transitions out of it are rejected 409.
  • Maker-checker reuse (ADR-0070): suspend and offboard route through MakerCheckerService — the maker records the intended transition with safeguards + rationale; a different checker (assert_distinct_approver, backend/app/services/maker_checker.py:124) authorizes before the transition row is written and the status flips. restrict and reinstate are single-officer but fully audited. Permissions per §2.5: RELATIONSHIP_SUSPEND and RELATIONSHIP_OFFBOARD are mlro-level members of the ADR-0074 Permission registry.
  • Temporariness timer: every SUSPENDED/RESTRICTED transition requires review_due_at. A sweep_suspension_timers activity added to ContinuousMonitoringWorkflow (§2.6 — same schedule, new activity) raises a review_due alert when it elapses; a suspension can therefore never rot silently. Because sla_service.is_overdue (sla_service.py:108-127) stops the clock on terminal CaseStatus, the overdue machinery is extended with a relationship-aware predicate: a case whose relationship_status ∈ {SUSPENDED, RESTRICTED} keeps its relationship clock running regardless of the (terminal) CaseStatus. The onboarding SLA semantics for non-approved cases are unchanged.
  • Offboarding exclusion: the monitoring population query (activities.py:3504-3510) gains relationship_status IS DISTINCT FROM 'OFFBOARDED' — offboarded relationships leave the monitoring population; their history remains fully retained (AMLR 5-year retention, ADR-0069 case pack still exports everything).
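
The four rules above can be sketched as pure functions. The status names and the transition table come from this ADR; the function names, the TransitionRejected exception and its code strings are hypothetical. The table encodes only the transitions listed above, so how a relationship enters UNDER_REVIEW is left out.

```python
from datetime import datetime, timezone
from typing import Optional

# Legal moves exactly as listed above; anything not present is rejected.
LEGAL_TRANSITIONS = {
    "ACTIVE": {"SUSPENDED", "RESTRICTED", "OFFBOARDED"},
    "UNDER_REVIEW": {"SUSPENDED", "RESTRICTED", "OFFBOARDED"},
    "SUSPENDED": {"ACTIVE", "OFFBOARDED"},
    "RESTRICTED": {"ACTIVE", "OFFBOARDED"},
    "OFFBOARDED": set(),  # terminal: every move out is rejected (409 at the API)
}
NEEDS_CHECKER = {"SUSPENDED", "OFFBOARDED"}       # maker-checker targets
NEEDS_REVIEW_DATE = {"SUSPENDED", "RESTRICTED"}   # temporariness


class TransitionRejected(Exception):
    """Hypothetical error type carrying a short machine-readable code."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def check_transition(
    from_status: Optional[str],
    to_status: str,
    maker_id: str,
    checker_id: Optional[str] = None,
    review_due_at: Optional[datetime] = None,
) -> None:
    """Raise TransitionRejected unless the requested move is allowed."""
    current = from_status or "ACTIVE"  # NULL reads as ACTIVE
    if to_status not in LEGAL_TRANSITIONS.get(current, set()):
        raise TransitionRejected("illegal_transition")
    if to_status in NEEDS_CHECKER and (checker_id is None or checker_id == maker_id):
        raise TransitionRejected("distinct_checker_required")
    if to_status in NEEDS_REVIEW_DATE and review_due_at is None:
        raise TransitionRejected("review_due_at_required")


def review_is_due(relationship_status: Optional[str],
                  review_due_at: Optional[datetime],
                  now: datetime) -> bool:
    """Relationship clock: depends only on relationship_status, so a
    terminal CaseStatus does not stop it."""
    return (
        relationship_status in NEEDS_REVIEW_DATE
        and review_due_at is not None
        and review_due_at <= now
    )


def in_monitoring_population(relationship_status: Optional[str]) -> bool:
    """Mirrors IS DISTINCT FROM 'OFFBOARDED': NULL stays in the population."""
    return relationship_status != "OFFBOARDED"
```

The last helper shows why the query uses IS DISTINCT FROM rather than <>: with <>, a NULL relationship_status would compare as unknown and the pre-backfill case would drop out of the population.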

4. cdd_refresh_requests table + cdd_refresh_service.py — the outreach ladder. Per §2.3/§2.4: id, tenant_id (explicit), case_id, reason (MonitoringTriggerType), requested_items JSONB, portal_token, deadline_at, reminders_sent int, last_reminder_at, status (open|fulfilled|expired|cancelled), created_at. The service:

  • creates a request and re-issues a fresh portal token with TTL using the case_crud.py:61/:1373 mint pattern (the expired-token 410 at portal.py:267-302 stays authoritative for stale tokens);
  • runs a sweep_cdd_reminders activity on ContinuousMonitoringWorkflow, finally giving settings.reminder_interval_days (config.py:319) its first consumer;
  • on deadline_at without fulfilment, marks the request expired and emits a cdd_nonresponse MonitoringTriggerType alert (ADR-0083 taxonomy) carrying a suspension recommendation — routing to suspension is an officer/MLRO decision through the maker-checker gate above, never automatic (never suppress, but also never auto-punish without a human and a safeguard record);
  • portal fulfilment reuses the existing upload path and flips the request to fulfilled.
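
One sweep step of the outreach ladder might be sketched as follows. The field names and the status values come from the table definition above; the CddRefreshRequest dataclass, the sweep_step function, its return values and the choice to measure the first reminder from created_at are assumptions for illustration. The default interval of 7 days mirrors settings.reminder_interval_days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class CddRefreshRequest:
    """In-memory stand-in for a cdd_refresh_requests row (subset of columns)."""

    created_at: datetime
    deadline_at: datetime
    reminders_sent: int = 0
    last_reminder_at: Optional[datetime] = None
    status: str = "open"  # open | fulfilled | expired | cancelled


def sweep_step(req: CddRefreshRequest, now: datetime,
               reminder_interval_days: int = 7) -> str:
    """Return the action for this sweep: 'none', 'remind' or 'expire'.

    'expire' only marks the request expired. The caller would then emit a
    cdd_nonresponse alert carrying a suspension recommendation; the
    relationship status itself is never changed here.
    """
    if req.status != "open":
        return "none"
    if now >= req.deadline_at:
        req.status = "expired"
        return "expire"
    # Assumption: the first reminder is measured from request creation.
    last_contact = req.last_reminder_at or req.created_at
    if now - last_contact >= timedelta(days=reminder_interval_days):
        req.reminders_sent += 1
        req.last_reminder_at = now
        return "remind"
    return "none"
```

Keeping the step a pure function of the row and the current time makes it straightforward to exercise under Temporal's time-skipping test environment, since the activity only has to pass in its own notion of now.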

5. Tipping-off boundary (AMLD Art. 39; architecture doc §1.6; ADR-0071). All outreach content and every customer-visible suspension surface pass the customer_contact_gate predicate (backend/app/services/customer_contact_gate.py). Outreach states what is required (document list, deadline), never why; a suspended relationship renders to the customer exclusively as "service temporarily limited pending information". Reason codes, safeguard checklists, alert linkage, and rationale live only on officer/regulator surfaces. Fail-closed: if the gate cannot evaluate (e.g., SAR predicate state unavailable), outreach generation is blocked, not sent bare.
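
The fail-closed rule can be illustrated with a small sketch. The real customer_contact_gate interface is not described in this ADR, so the gate is modelled here as a hypothetical zero-argument callable that returns True when contact is permitted and raises when it cannot evaluate; build_outreach and customer_status_text are likewise illustrative names. The message wording other than the quoted suspension text is invented for the example.

```python
from datetime import date
from typing import Callable, Optional, Sequence

# The only customer-visible rendering of a suspended relationship (from this ADR).
SUSPENDED_CUSTOMER_TEXT = "service temporarily limited pending information"


def build_outreach(requested_items: Sequence[str], deadline: date,
                   contact_allowed: Callable[[], bool]) -> Optional[str]:
    """Return the outreach text, or None when outreach must be blocked.

    Fail-closed: if the gate raises (for example because SAR predicate
    state is unavailable), nothing is generated.
    """
    try:
        allowed = contact_allowed()
    except Exception:
        return None
    if not allowed:
        return None
    # The message states what is required and by when, never why.
    lines = ["Please provide the following by " + deadline.isoformat() + ":"]
    lines.extend("- " + item for item in requested_items)
    return "\n".join(lines)


def customer_status_text(relationship_status: str) -> Optional[str]:
    """Customer-facing label; reason codes and rationale are never included."""
    if relationship_status == "SUSPENDED":
        return SUSPENDED_CUSTOMER_TEXT
    return None
```

Note that the function signature has no parameter for reason codes, safeguards or alert linkage, so officer-only data cannot leak into the message by accident.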

6. Typed restrictions catalogue — KBC gap #4 completion. The existing MerchantRestrictions payload (case.py:79-92) becomes the mandatory typed body of a RESTRICTED transition (catalogue: blocked_mcc[], max_ticket_eur, max_monthly_volume_eur, requires_secondary_review) and gains a review date via the transition's review_due_at — restrictions are now reviewable and expirable instead of write-once decision metadata. restriction_reason and evidence_refs remain mandatory.
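
As an illustration, the body of a RESTRICTED transition could be checked along these lines. The field names follow the MerchantRestrictions catalogue listed above; the field types, the class name MerchantRestrictionsSketch, the function restricted_body_errors and the rule that review_due_at must lie in the future are assumptions of this sketch, not the existing model in case.py.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class MerchantRestrictionsSketch:
    """Illustrative stand-in for MerchantRestrictions; types are assumed."""

    restriction_reason: str
    evidence_refs: List[str]
    blocked_mcc: List[str] = field(default_factory=list)
    max_ticket_eur: Optional[float] = None
    max_monthly_volume_eur: Optional[float] = None
    requires_secondary_review: bool = False


def restricted_body_errors(restrictions: MerchantRestrictionsSketch,
                           review_due_at: Optional[datetime],
                           now: datetime) -> List[str]:
    """Collect validation errors for a RESTRICTED transition body."""
    errors: List[str] = []
    if not restrictions.restriction_reason.strip():
        errors.append("restriction_reason is mandatory")
    if not restrictions.evidence_refs:
        errors.append("evidence_refs is mandatory")
    if review_due_at is None:
        errors.append("review_due_at is required for RESTRICTED")
    elif review_due_at <= now:
        # Assumption: a review date in the past is not a meaningful review date.
        errors.append("review_due_at must be in the future")
    return errors
```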

7. API surface (§2.5). POST /api/cases/{id}/relationship/suspend · /restrict · /reinstate · /offboard (body = safeguards + rationale [+ restrictions payload for restrict]; maker-checker routing for suspend/offboard) and POST /api/cases/{id}/cdd-refresh plus the portal fulfilment path. All officer endpoints RBAC-gated (ADR-0074, Phase 2 active).

8. Verification (§4 of the architecture doc). Testcontainers throughout (no mocks without an approval comment); Temporal sweeps tested via WorkflowEnvironment.start_time_skipping(); test_no_false_reassurance.py (or a sibling oracle file) gains the invariants: a suspended relationship keeps its clock running; a suspend without the three safeguard keys is rejected; cdd_nonresponse never auto-offboards; no customer-facing string generated by cdd_refresh_service contains suspicion-related reason text; OFFBOARDED leaves the monitoring population while its audit trail remains exportable.

Consequences

Positive

  • Art. 21's suspend/restrict-before-terminate step exists with the exact safeguard record AMLA's draft guidelines describe (risk level, mitigation effectiveness, file sufficiency), under four-eyes, with temporariness enforced by a timer that survives the terminal CaseStatus — the "frozen forever" relationship (gap doc §3.1) is closed.
  • Post-approval CDD refresh gets a real channel: token re-issue, reminders (reviving the dead reminder_interval_days knob), deadline, and a typed non-response trigger — the onboarding-only 60-day cliff (compliance_case.py:1316-1324) is no longer the system's entire theory of customer unresponsiveness.
  • Two append-only records (relationship_transitions + audit_events) make the whole post-approval history regulator-replayable and case-pack-exportable (ADR-0069).
  • Restrictions become a governed lifecycle state with review dates (KBC gap #4 complete), not write-once decision metadata.

Negative

  • Two parallel state machines on one row. CaseStatus (onboarding) and RelationshipStatus (relationship) coexist on cases; every consumer that today infers "live relationship" from status IN (APPROVED, APPROVED_WITH_RESTRICTIONS) must learn the second axis, and a missed consumer silently ignores suspensions. Mitigated by the NULL-reads-as-ACTIVE rule and the oracle tests, but the conceptual overloading is real and permanent.
  • Suspension enforcement is representational only. Trust Relay records SUSPENDED / RESTRICTED; it does not sit in the payment path, so nothing here actually pauses a merchant's processing. Until a tenant's downstream system consumes the state (outbound webhooks deliberately deferred, architecture doc §5.6), the control is a documented instruction, not a technical block — this must be stated honestly in the Monitoring Framework Record (W6), not implied away.
  • More MLRO workload: suspend/offboard now cost two people and a safeguard checklist, and the reminder/timer sweeps add recurring activity load to every tenant's monitoring schedule.

Neutral

  • No new Temporal schedules — all sweeps ride ContinuousMonitoringWorkflow (§2.6).
  • CaseStatus, its terminal_statuses() contract, and the decision endpoint's REVIEW_PENDING 409 gate (case_decisions.py:75-91) are byte-for-byte unchanged; the onboarding dashboard state machine needs a relationship badge, not a rework.
  • W3 sits behind W0–W2 in the wave order (needs the alert spine for cdd_nonresponse and review_due routing) and is independently shippable as one PR/plan/migration.

Alternatives Considered

Alternative 1: extend CaseStatus with post-approval states

Add SUSPENDED/RESTRICTED/OFFBOARDED to the existing enum. Rejected: APPROVED is terminal by contract (case.py:31-38) and at least three load-bearing consumers depend on that — the SLA breach predicate (sla_service.py:122), the decision endpoint's 409 gate (case_decisions.py:84), and the completed Temporal workflow itself (the workflow returns at terminal; reopening would require re-signalling a finished execution or workflow.patched()-style contortions the repo forbids). Un-terminalizing approval would ripple through every "is this case done?" read and conflate the process (onboarding, finite) with the relationship (indefinite). §5.1 of the architecture doc settles this separation.

Alternative 2: a per-relationship long-running Temporal workflow

Keep one never-completing workflow per approved relationship holding the lifecycle state. Rejected: it duplicates state that must be queryable at rest (dashboards, monitoring population SQL, case-pack export read the DB, not workflow queries), creates thousands of idle executions whose histories grow with every timer tick for years (AMLR retention horizon), and makes every lifecycle rule change a workflow-versioning event. The architecture doc already routes periodic work through the single per-tenant ContinuousMonitoringWorkflow sweep (§2.6) — timers as swept rows, not as live workflow clocks, is the established pattern here.

Alternative 3: offboard-only, no intermediary states

Implement only ACTIVE → OFFBOARDED. Rejected: it fails the regulation this ADR exists for — the AMLA draft guidelines' elaboration of Art. 21 is specifically the graduated suspend/restrict step with a safeguard assessment and temporariness before termination; binary offboarding forces MLROs into either premature termination or documented inaction. It also abandons the RESTRICTED state that completes KBC gap #4 (a live commercial ask) and leaves MerchantRestrictions as unreviewable write-once metadata.

Alternative 4: do nothing

Viable only until NCAs declare comply-or-explain (Q1 2027; AMLR applies 2027-07-10). Rejected because the gap doc grades this domain absent, not partial — an obligation that is not demonstrable at all — and because two independent analyses (gap doc §4 Tier 1.4; Atlas delta themes) converged on it as a blocker.