ADR-0050: Enforce RLS by running the app as a non-superuser role

Status: Accepted
Date: 2026-06-12
Supersedes: none
Superseded by: none

Context

ADR-0023 introduced PostgreSQL Row-Level Security with FORCE ROW LEVEL SECURITY on 22+ tenant-scoped tables, and the policies were written and migrated. However, the 2026-06-11 code review found that RLS was inert in production: the application connected to Postgres as the superuser temporal, and superusers (like roles with BYPASSRLS) always bypass RLS. FORCE ROW LEVEL SECURITY only extends the policies to the table owner; it does not constrain a superuser. Tenant isolation existed on paper but was not enforced at runtime: any query could read or write any tenant's rows. This is a critical finding for a multi-tenant KYB/KYC platform with EU AI Act / GDPR / AMLR obligations.
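
One way to keep this failure from recurring silently is a startup guard that refuses to run when the connected role can bypass RLS. The following is a minimal sketch, not the project's code: assert_rls_enforceable and RlsBypassError are hypothetical names, and cursor is assumed to be any DB-API style cursor. The pg_roles columns rolsuper and rolbypassrls are standard PostgreSQL catalog attributes.

```python
# Hypothetical startup guard: fail fast if the current role bypasses RLS.
ROLE_CHECK_SQL = (
    "SELECT rolsuper, rolbypassrls FROM pg_roles WHERE rolname = current_user"
)


class RlsBypassError(RuntimeError):
    """Raised when the connected role would not be subject to RLS policies."""


def assert_rls_enforceable(cursor) -> None:
    """Raise RlsBypassError if the current role is a superuser or has BYPASSRLS."""
    cursor.execute(ROLE_CHECK_SQL)
    row = cursor.fetchone()
    if row is None:
        raise RlsBypassError("current role not found in pg_roles")
    rolsuper, rolbypassrls = row
    if rolsuper or rolbypassrls:
        raise RlsBypassError(
            "connected role bypasses row-level security "
            f"(rolsuper={rolsuper}, rolbypassrls={rolbypassrls})"
        )
```

A check like this would only cover the superuser and BYPASSRLS cases; it does not verify that the policies themselves are correct.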

Flipping the connection role is not a one-line change because:

  1. Alembic needs DDL privileges the app role must not have. A single database_url cannot serve both runtime (DML-only, RLS-enforced) and migrations (DDL, owner).
  2. The tenant GUC (app.current_tenant) is set session-scoped, so it must be reset explicitly when the connection returns to the pool. Otherwise a value whose transaction committed leaks to the next borrower: pool_reset_on_return="rollback" only issues a ROLLBACK, which does not undo an already-committed set_config(..., false).
  3. Several latent code paths only "worked" because the superuser bypassed RLS (see ADR-0051).
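
The pool leak in point 2 can be illustrated with a toy model. This is not Postgres and not the project's code; ToySession is a hypothetical class that only mimics the documented rule that a session-level setting is undone if its transaction rolls back and persists for the rest of the session once that transaction commits.

```python
class ToySession:
    """Toy model of one pooled connection's session-level settings."""

    def __init__(self):
        self.committed = {}  # settings that persist until the session ends
        self.pending = {}    # settings made inside the open transaction

    def set_config(self, name, value):
        # Models set_config(name, value, false): session-level, but still
        # provisional until the surrounding transaction ends.
        self.pending[name] = value

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        # A rollback discards only what the open transaction changed.
        self.pending.clear()

    def reset_and_commit(self, name):
        # Models an explicit RESET of the setting followed by COMMIT.
        self.pending.pop(name, None)
        self.committed.pop(name, None)

    def current_setting(self, name):
        return self.pending.get(name, self.committed.get(name))


def demo_leak():
    conn = ToySession()
    conn.set_config("app.current_tenant", "tenant-a")
    conn.commit()    # the request for tenant A commits its work
    conn.rollback()  # the pool's rollback-on-return runs
    leaked = conn.current_setting("app.current_tenant")
    conn.reset_and_commit("app.current_tenant")  # explicit reset
    after_reset = conn.current_setting("app.current_tenant")
    return leaked, after_reset
```

In this model demo_leak() returns ("tenant-a", None): the rollback on return leaves the committed value in place, and only the explicit reset removes it.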

Decision

Run the application as a dedicated non-superuser role trustrelay_app (NOSUPERUSER, DML grants only) so FORCE ROW LEVEL SECURITY actually applies, and split the database URLs by privilege:

  • database_url → trustrelay_app: application runtime; RLS enforced.
  • migration_database_url → superuser: Alembic only (DDL).
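
The split could be validated at configuration load time, roughly as sketched below. This is an illustration under assumptions: the DATABASE_URL environment variable name, the DatabaseUrls class, and load_database_urls are hypothetical; only MIGRATION_DATABASE_URL and the trustrelay_app role name come from this ADR.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DatabaseUrls:
    runtime: str    # DML-only role, RLS enforced
    migration: str  # privileged role, used by Alembic only


def load_database_urls(env=os.environ, app_role="trustrelay_app"):
    """Read both URLs and reject configurations that mix up the roles."""
    runtime = env["DATABASE_URL"]  # assumed variable name
    migration = env["MIGRATION_DATABASE_URL"]
    if urlsplit(runtime).username != app_role:
        raise ValueError(f"runtime URL must connect as {app_role}")
    if urlsplit(migration).username == app_role:
        raise ValueError("migration URL must not use the runtime role")
    return DatabaseUrls(runtime=runtime, migration=migration)
```

The application would then build its engine from the runtime URL, while the migration tooling reads only the migration URL, so neither path can silently fall back to the other's privileges.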

Supporting decisions:

  • Role provisioning lives in scripts/create_app_role.sql, mounted in docker-compose as 04-create-app-role.sql.
  • Corrective migration 060 fixes six policies that referenced the wrong GUC name (app.tenant_id → app.current_tenant) and passes missing_ok = true to current_setting(...), so an unset GUC yields NULL instead of raising an error.
  • The tenant GUC is set session-scoped (set_config(name, val, false)) and explicitly reset on connection close; app.rls_bypass is an admin-only escape hatch consumed by get_admin_session().
  • Session factories: get_session() (demo tenant), get_tenant_session(tid) (scoped to tid), get_admin_session() (sets app.rls_bypass, for legitimately cross-tenant/system access).
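
The set-then-reset discipline behind these session factories might look like the sketch below. It is not the project's implementation: tenant_scope is a hypothetical helper, cursor is assumed to be a DB-API style cursor, and the %s placeholder assumes a psycopg-style driver.

```python
from contextlib import contextmanager

# Session-scoped set (third argument false) and the matching explicit reset.
SET_TENANT_SQL = "SELECT set_config('app.current_tenant', %s, false)"
RESET_TENANT_SQL = "RESET app.current_tenant"


@contextmanager
def tenant_scope(cursor, tenant_id: str):
    """Bind the connection to tenant_id and always reset it afterwards."""
    cursor.execute(SET_TENANT_SQL, (tenant_id,))
    try:
        yield cursor
    finally:
        # Assumes the connection is still usable here; after a failed
        # statement, real code would need to roll back before resetting.
        cursor.execute(RESET_TENANT_SQL)
```

Usage would be along the lines of `with tenant_scope(cur, tid): ...`, with every tenant query issued inside the block. It is worth verifying how the policies treat the value a reset custom setting reads back as, since that may be an empty string rather than NULL.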

Consequences

  • Tenant isolation is enforced at the database layer, not just by application predicates — defense in depth that survives application bugs.
  • Enforcing the constraint surfaced latent code that had silently relied on the superuser bypass: cross-tenant write paths and a feature whose SQL had never actually executed (see ADR-0051). Every RLS rejection the flip produced was a real isolation gap being closed.
  • Deploy requirement: the trustrelay_app role must exist and Alembic must use MIGRATION_DATABASE_URL. A docker-compose up smoke-test of the non-superuser path is a standing verification item.
  • A demo-only test suite cannot catch wrong-tenant behaviour; multi-tenant RLS regression tests (test_rls_enforcement.py, test_rls_guc_persistence.py, test_rls_policy_correctness.py) seed a non-demo tenant and assert isolation.
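
The core assertion of such a regression test can be sketched as follows. The helper names and the tenant_id column are assumptions for illustration; the real suites listed above may be structured differently. The rows would come from a query run in a tenant-scoped session after seeding at least two tenants.

```python
from typing import Iterable, List, Mapping


def foreign_rows(rows: Iterable[Mapping], tenant_id: str,
                 column: str = "tenant_id") -> List[Mapping]:
    """Return the rows that do not belong to tenant_id."""
    return [row for row in rows if row[column] != tenant_id]


def assert_isolated(rows: Iterable[Mapping], tenant_id: str) -> None:
    """Check that a tenant sees its own rows and nobody else's."""
    rows = list(rows)
    # Guard against a policy that hides everything, which would also "pass".
    assert rows, f"tenant {tenant_id} sees none of its own rows"
    leaked = foreign_rows(rows, tenant_id)
    assert not leaked, (
        f"{len(leaked)} row(s) from other tenants visible to {tenant_id}"
    )
```

Checking both directions matters: a suite that seeds only the demo tenant can never produce a foreign row, so it cannot fail on wrong-tenant behaviour.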