ADR-0053: UBO ownership computation engine (multi-path summation)

Status: Accepted
Date: 2026-06-15
Supersedes: none
Superseded by: none
Deciders: Adrian (Soft4U), Claude Opus 4.8

Decision context:

  • Latency: one extra Neo4j read step (two _run_query calls in fetch_ownership_graph) plus, on determination paths, one Postgres insert. DFS cost is proportional to the number of simple paths in a sparse, shallow KYB graph (≤10 hops). Latency has not been measured precisely; it is expected to stay under 100 ms because real ownership graphs are sparse and the engine caps depth and path count. get_amlr_coverage calls the engine with persist=False, so the coverage read path adds the Neo4j read but no DB write.
  • Dependency surface: no new third-party packages. The pure engine depends only on trustrelay-models + stdlib + structlog. One new Postgres table (ubo_computations) + one ORM model + one Alembic migration.
  • Debuggability: the regulated arithmetic lives in one pure, import-isolated module (ubo_engine.py) that is exhaustively unit-testable without Neo4j or a DB. Every determination carries a PathTrace (the contributing paths + per-edge percentages + product), so a result is self-explaining; failures surface as plain Python stack traces, not Cypher.
  • Reversibility: revert is a branch revert plus alembic downgrade -1 (drops ubo_computations). The rewired get_amlr_coverage §2(c) block is a localized change; get_ownership_tree (network UI) is untouched.
  • Blast radius: additive. New files (engine, models, service, migration, tests) plus one rewired §2(c) block. No existing computation path is changed except the UBO flag derivation.
  • Alternative considered: computing the summation in Cypher (rejected — the threshold logic, reason codes, and PathTrace assembly would live in opaque, unversionable, hard-to-test query strings).

Context

UBO (Ultimate Beneficial Owner) determination performed no computation: it counted pre-declared register edges. In graph_service.py, get_amlr_coverage set ubo_identified = ubo_count > 0, where ubo_count was the number of HAS_UBO edges ingested from registries, and the companion flag ownership_chain_computed echoed the same boolean. The only path arithmetic lived in get_ownership_tree, which multiplied fractions along a single path and then discarded alternate paths via a global node-dedup (seen_parents/seen_subs); its output drove the network-graph UI, not UBO flagging. The graph also had no person→intermediate-entity ownership edges: natural persons attached only to the subject via HAS_UBO/HAS_DIRECTOR, so indirect ownership through intermediate companies was unrepresentable.

This violates Regulation (EU) 2024/1624 (AMLR) Art. 51–53, which require beneficial ownership to be determined by aggregating indirect holdings across the ownership structure. The canonical failure: a person owning the subject through two disjoint chains of 15% each (summed 30% ≥ 25%) is a UBO, but no single path reaches the threshold — the count-based flag and single-path arithmetic both miss it.
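
As a worked illustration of the canonical case, the sketch below multiplies fractions along each chain and then compares the best single path with the cross-path sum. The intermediate holdings (30% × 50% and 60% × 25%), the HoldCo names, and the helper path_product are assumptions chosen so that each chain contributes 15%; they are not taken from the codebase.

```python
import math

THRESHOLD = 0.25  # AMLR baseline threshold


def path_product(fractions):
    """Effective ownership along one chain: the product of its edge fractions."""
    return math.prod(fractions)


# Two disjoint chains from the same person to the subject (hypothetical figures).
chain_a = [0.30, 0.50]  # person owns 30% of HoldCo A, which owns 50% of the subject
chain_b = [0.60, 0.25]  # person owns 60% of HoldCo B, which owns 25% of the subject

products = [path_product(chain_a), path_product(chain_b)]
best_single_path = max(products)  # roughly 0.15: what single-path arithmetic sees
summed = sum(products)            # roughly 0.30: what cross-path summation sees

single_path_flags_ubo = best_single_path >= THRESHOLD
summed_flags_ubo = summed >= THRESHOLD
```

Under these figures only the summed value crosses the 25% threshold, which is the gap the Decision section addresses.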

A further constraint surfaced during design: the 25% threshold is the AMLR/EU baseline only. Some jurisdictions and entity types apply a lower figure (e.g. 10% for higher-risk sectors), and this is a multi-country platform (ADR-0034, ADR-0031). A hardcoded 25% would silently produce wrong determinations and false audit labels for non-25% jurisdictions.
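
One way to keep an audit label tied to the rule actually applied is to derive it from the threshold value instead of hardcoding it. The sketch below is an assumption-laden illustration: the helper name reason_code_for is hypothetical, and it presumes whole-percent thresholds and the ownership_<percent> label shape that the Decision section names (ownership_25, ownership_10).

```python
def reason_code_for(threshold: float) -> str:
    """Build an audit label such as 'ownership_25' from a fractional threshold.

    Hypothetical helper: assumes thresholds are whole percentages.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError(f"threshold must be in (0, 1], got {threshold!r}")
    # round() absorbs float artifacts from the multiplication
    # (a product can land just under or over the intended whole number).
    return f"ownership_{round(threshold * 100)}"


baseline_label = reason_code_for(0.25)
lower_label = reason_code_for(0.10)
```

With a label derived this way, a determination made under a 10% rule can never be recorded as an ownership_25 finding.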

Decision

Introduce a UBO computation engine structured as a pure computation core + a thin Neo4j-read graph adapter + an append-only Postgres audit record:

  1. Domain models (trustrelay-models/ubo.py): OwnershipEdge, OwnershipGraph, PathEdge, PathTrace, BeneficialOwnerResult (carrying aggregated_pct, qualified, qualified_via, reason_code, threshold_pct, path_traces, truncated).
  2. Pure engine (backend/app/services/ubo_engine.py): UBOComputationEngine(ubo_threshold=0.25). For each natural person it DFS-enumerates every simple path to the subject using a per-path visited set, which is cycle-safe and preserves alternate paths (the fix for the global-dedup bug). It multiplies edge fractions along each path, sums the products across distinct paths, and flags a UBO when the summed fraction ≥ the configured threshold − 1e-9 (a float-comparison epsilon). Reaching the depth or path-count cap sets a truncated flag, so truncation is never silent. The module has no Neo4j/DB/config imports.
  3. Threshold is injected, not constant. ubo_threshold defaults to the AMLR 0.25 baseline but is passed per call; the applied value is recorded on every BeneficialOwnerResult (threshold_pct) and in the persisted row, and reason_code is derived dynamically (ownership_25, ownership_10, …) so the audit label is always truthful. Per-jurisdiction threshold resolution plugs into the regulatory-segment machinery (ADR-0031); the full per-country table is a tracked follow-up.
  4. Graph adapter (GraphService.fetch_ownership_graph): a Neo4j-read-only method that builds the in-memory OwnershipGraph from IS_SUBSIDIARY_OF (entity→entity) and HAS_UBO (person→company) edges. GraphService stays DB-free (it reads Neo4j only and never touches Postgres).
  5. Orchestration + persistence (UBOComputationService): fetch → compute → persist, mirroring the _session_scope pattern in decision_memorandum_service.py. A persist flag lets coverage reads compute without writing an audit row.
  6. Append-only audit table (ubo_computations, Alembic 063): one row per deliberate computation run, FORCE ROW LEVEL SECURITY (ADR-0023), storing the full results + PathTrace + applied threshold — the regulator-facing, queryable record of how and under which rule each determination was reached (EU AI Act Art. 12), and the natural seam for issue #41 (DB-enforced audit immutability).
  7. Flag rewiring: get_amlr_coverage §2(c) consumes the service (with persist=False), so ubo_identified reflects qualified UBOs and ownership_chain_computed is true only when the computation actually traced a natural person through the structure (honest coverage — no over-stating on an empty graph).
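
The engine step in item 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of ubo_engine.py: the function name compute_owner, the simplified OwnerResult dataclass (a stand-in for BeneficialOwnerResult), the edge-tuple input format, and the default cap values max_depth=10 and max_paths=1000 are all hypothetical.

```python
from dataclasses import dataclass, field

EPSILON = 1e-9  # float-comparison tolerance


@dataclass
class OwnerResult:  # simplified, hypothetical stand-in for BeneficialOwnerResult
    aggregated_pct: float
    qualified: bool
    threshold_pct: float
    path_traces: list = field(default_factory=list)  # (node path, product) pairs
    truncated: bool = False


def compute_owner(edges, person, subject, ubo_threshold=0.25,
                  max_depth=10, max_paths=1000):
    """Sum ownership products over every simple path from person to subject.

    edges: iterable of (owner, owned, fraction) tuples.
    """
    out = {}
    for owner, owned, fraction in edges:
        out.setdefault(owner, []).append((owned, fraction))

    traces = []
    truncated = False
    stop = False

    def dfs(node, product, path):
        nonlocal truncated, stop
        if stop:
            return
        if node == subject:
            if len(traces) >= max_paths:
                # One more path exists than the cap allows: flag it and stop.
                truncated = stop = True
                return
            traces.append((tuple(path), product))
            return
        if len(path) - 1 >= max_depth:
            # Depth cap reached before the subject; flag if we would have gone on.
            if out.get(node):
                truncated = True
            return
        for owned, fraction in out.get(node, ()):
            if owned in path:  # per-path visited check keeps the walk cycle-safe
                continue
            path.append(owned)
            dfs(owned, product * fraction, path)
            path.pop()  # backtrack so sibling branches may reuse this node

    dfs(person, 1.0, [person])
    total = sum(product for _, product in traces)
    qualified = total >= ubo_threshold - EPSILON
    return OwnerResult(total, qualified, ubo_threshold, traces, truncated)


# Usage: the two-chain case, with hypothetical intermediate holdings.
diamond = [("p", "a", 0.30), ("a", "s", 0.50), ("p", "b", 0.60), ("b", "s", 0.25)]
result = compute_owner(diamond, "p", "s")
```

Because the visited check is scoped to the current path and undone on backtrack, node s is reached once through a and once through b, so both 15% contributions are summed. A global visited set would have dropped the second one.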

Consequences

Positive

  • UBO determination becomes a real AMLR Art. 51–53 computation; the 15%+15%=30% case is correctly flagged.
  • The regulated decision logic is isolated in one pure, exhaustively testable module — high assurance.
  • Every determination carries a full path-level audit trail (PathTrace) plus the applied threshold, satisfying EU AI Act Art. 12 traceability and the project's "fully traceable, retrievable, auditable" principle.
  • Multi-country correctness: thresholds are jurisdiction-configurable and audit labels stay truthful.
  • Establishes the engine that follow-up ADRs and issues #31 (control dimension) and #32 (SMO fallback) will extend.

Negative

  • Adds a Postgres table → an Alembic migration + RLS policy + ORM model (bounded but real).
  • Simple-path enumeration is exponential in dense graphs. Depth and path-count caps mitigate this: sparse KYB graphs are expected to stay within the caps, and when a graph does hit a cap the result is marked with the truncated flag instead of being cut short silently.
  • The engine is correct but data-starved until registry ETL populates person→intermediate ownership edges (a deliberate out-of-scope follow-up); until then most cases yield only direct-ownership qualification.
  • get_amlr_coverage gains a Neo4j read (the ownership-graph fetch) on top of its existing queries.

Neutral

  • The existing single-path get_ownership_tree is untouched — the network UI depends on it; the engine is additive, not a replacement.
  • The ubo_threshold injection seam exists, but per-jurisdiction resolution is deferred to the regulatory-segment work; production currently applies the AMLR default.

Alternatives Considered

Alternative 1: Fix get_ownership_tree in place (remove the global dedup, sum paths there)

  • Remove the seen_parents/seen_subs node-dedup and sum across paths within the existing method.
  • Why rejected: that method is shaped for hierarchical UI rendering (parents/subsidiaries/depth) and runs inside Neo4j-bound code that cannot be unit-tested without a driver. Folding the regulated decision logic into it would make the AMLR arithmetic untestable in isolation and entangle two concerns (presentation and determination).

Alternative 2: Compute the summation in a Cypher query

  • Use Neo4j variable-length path matching + aggregation to compute summed ownership in the database.
  • Why rejected: the cross-path summation, threshold logic, reason codes, and PathTrace assembly would live in opaque query strings — hard to unit-test, impossible to version as "engine logic" for EU AI Act Art. 12, and difficult to make jurisdiction-configurable. Keeping the algorithm in typed Python is the auditable choice; Neo4j remains the data source, not the decision-maker.

Alternative 3: Persist computed results as Neo4j nodes instead of a Postgres table

  • Write a :BeneficialOwnerResult node + QUALIFIES_AS_UBO edge with PathTrace JSON.
  • Why rejected: not easily audit-exportable, has no append-only guarantee, and does not integrate with the planned DB-enforced audit-immutability work (#41). A tenant-scoped, append-only Postgres table is the regulator-facing record of choice.