Security Architecture
This page documents the current security posture of Trust Relay. The system has been through a dedicated security remediation phase (Phase 4) and a subsequent multi-tenancy implementation (Pillar 0) that established Keycloak 26 as the sole authentication mechanism with Row Level Security across all tenant-scoped tables.
Implemented Security Measures
Officer Authentication (Keycloak 26 + JWT/JWKS)
Officer-facing API endpoints are protected by JWT-based authentication via the get_current_user dependency in app/api/deps/auth.py. Authentication is provided exclusively by Keycloak 26.
How it works:
- Every request requires a valid Authorization: Bearer <token> header
- The JWT is validated against the Keycloak JWKS endpoint configured via auth_jwks_url
- Issuer (iss) and audience (aud) claims are verified
- JIT (Just-In-Time) user provisioning ensures database user records are created on first login via _ensure_db_user() using the UserRepository
# JWKS key rotation: keys are cached for 5 minutes
_JWKS_TTL_SECONDS = 300
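The five-minute key cache can be sketched as follows. This is a minimal illustration rather than the code in app/api/deps/auth.py: the JwksCache class, the injected fetch callable, and the injectable clock are hypothetical names introduced here; only _JWKS_TTL_SECONDS comes from the snippet above.

```python
import time
from typing import Any, Callable, Dict, Optional

_JWKS_TTL_SECONDS = 300  # value taken from the snippet above


class JwksCache:
    """Hypothetical TTL cache around a JWKS fetch function."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, Any]],
        ttl_seconds: float = _JWKS_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Optional[Dict[str, Any]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, Any]:
        now = self._clock()
        # Refetch on first use, or once the cached copy has reached the TTL.
        if self._keys is None or now - self._fetched_at >= self._ttl:
            self._keys = self._fetch()
            self._fetched_at = now
        return self._keys
```

A short TTL bounds how long a key that Keycloak has rotated out can still be used to validate tokens, at the cost of one JWKS request per TTL period.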
Role-Based Access Control (RBAC): Four roles are defined in Keycloak and enforced by the backend:
| Role | Permissions |
|---|---|
| super_admin | Full system access, tenant management, user management |
| compliance_manager | Manage officers, override automation tiers, review escalated cases |
| officer | Create and investigate cases, make decisions, interact with copilot |
| auditor | Read-only access to cases, audit trails, and investigation results |
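A backend role check over these four roles might look like the sketch below. The Role enum values match the table; the require_any_role helper and the Forbidden exception are hypothetical names, and which roles an endpoint accepts is an illustrative choice, not taken from the codebase.

```python
from enum import Enum
from typing import Iterable, Set


class Role(str, Enum):
    SUPER_ADMIN = "super_admin"
    COMPLIANCE_MANAGER = "compliance_manager"
    OFFICER = "officer"
    AUDITOR = "auditor"


class Forbidden(Exception):
    """Raised when the caller holds none of the required roles."""


def require_any_role(user_roles: Iterable[str], allowed: Set[Role]) -> None:
    # Role strings that are not one of the four known roles are ignored.
    known = {r.value for r in Role}
    held = {Role(r) for r in user_roles if r in known}
    if not held & allowed:
        raise Forbidden(f"requires one of: {sorted(r.value for r in allowed)}")
```

For example, a case-creation endpoint could call require_any_role(token_roles, {Role.OFFICER, Role.COMPLIANCE_MANAGER}) and translate Forbidden into an HTTP 403 response.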
Multi-Tenancy (Row Level Security)
Tenant-scoped database tables carry PostgreSQL Row Level Security (RLS) policies (ADR-0023) created across Alembic migrations with FORCE ROW LEVEL SECURITY enabled. Each table has a tenant_isolation_* policy keyed on the app.current_tenant GUC and an admin_bypass_* policy keyed on app.rls_bypass. The session layer (app/db/database.py) sets the tenant context on every connection via set_config('app.current_tenant', ..., true):
- get_session() sets the demo tenant (00000000-0000-0000-0000-000000000001) as the default RLS context
- get_tenant_session(tid) sets an explicit tenant context (transaction-local GUC)
- get_admin_session() sets app.rls_bypass='true' for cross-tenant administrative operations
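The tenant-context call can be sketched as a parameterized statement. This is an assumption-laden illustration: the tenant_context_statement helper is hypothetical, the :tenant_id bind-parameter style assumes a driver layer such as SQLAlchemy, and validating the tenant ID as a UUID is an assumption based on the demo tenant's format.

```python
import uuid
from typing import Dict, Tuple

DEMO_TENANT_ID = "00000000-0000-0000-0000-000000000001"  # default used by get_session()


def tenant_context_statement(tenant_id: str) -> Tuple[str, Dict[str, str]]:
    """Build the parameterized set_config call for a tenant-scoped session."""
    # Parsing as a UUID rejects malformed input before it reaches the database.
    canonical = str(uuid.UUID(tenant_id))
    # Third argument true means is_local: the setting reverts at transaction end.
    sql = "SELECT set_config('app.current_tenant', :tenant_id, true)"
    return sql, {"tenant_id": canonical}
```

Passing the tenant ID as a bind parameter, rather than interpolating it into the SQL string, keeps the GUC value out of the statement text. The setting only restricts rows when the connection role is subject to RLS.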
:::danger RLS is currently inert in the shipped configuration
The RLS design above is not enforced at runtime as shipped. The application connects to PostgreSQL as the bootstrap superuser temporal — this is the database_url default in app/config.py (postgresql+asyncpg://temporal:temporal@...), and the runtime .env does not override DATABASE_URL. PostgreSQL superusers bypass even FORCE ROW LEVEL SECURITY, so the tenant_isolation_* policies never take effect for the application connection.
A non-superuser application role (trustrelay_app, NOSUPERUSER) is defined in scripts/create_app_role.sql, but that script is not run as part of the default setup and DATABASE_URL is not pointed at it. Until the app connects as trustrelay_app, tenant isolation is enforced only by the application's choice of get_tenant_session(tenant_id) at the query layer — not by the database. This is a known gap to close before any multi-tenant production deployment.
:::
When the app runs as the non-superuser trustrelay_app role, these policies provide tenant isolation at the database level, not just the application level.
Portal Token Authentication with Expiry
Customer access to the upload portal is gated by a 192-bit cryptographically random token generated by secrets.token_urlsafe(24). Tokens use a pt_ prefix followed by 32 URL-safe characters.
Portal URL: https://portal.example.com/portal/pt_<32-url-safe-chars>
Security controls:
- Tokens are unique per case and stored in PostgreSQL
- Portal endpoints validate the token against the database
- Tokens cannot be guessed or enumerated (192-bit entropy, significantly stronger than the prior 64-bit UUID hex truncation)
- 30-day TTL: Tokens have an expires_at column, configurable via portal_token_ttl_days
- Expired tokens return HTTP 410 Gone
- Rate limiting applies to token validation attempts
- The _generate_portal_token() helper is used consistently for both single-case and bulk case creation
:::info Token Entropy Upgrade (C7)
Prior to the codebase hardening sweep, portal tokens were generated as uuid.uuid4().hex[:16], a 64-bit truncated UUID of which only 60 bits are random because the first 16 hex digits include the fixed version nibble. This has been replaced with secrets.token_urlsafe(24) (192-bit entropy), meeting NIST SP 800-63B guidance for session token length. See backend/app/api/case_crud.py.
:::
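Token generation and expiry can be sketched as below. This is not the body of _generate_portal_token() in backend/app/api/case_crud.py: the function names, the returned tuple, and the choice to treat a token as expired at exactly expires_at are assumptions made for illustration.

```python
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

PORTAL_TOKEN_TTL_DAYS = 30  # default of portal_token_ttl_days


def generate_portal_token(now: Optional[datetime] = None) -> Tuple[str, datetime]:
    """Return (token, expires_at) for a new case."""
    now = now or datetime.now(timezone.utc)
    # 24 random bytes = 192 bits, encoded as 32 URL-safe base64 characters.
    token = "pt_" + secrets.token_urlsafe(24)
    return token, now + timedelta(days=PORTAL_TOKEN_TTL_DAYS)


def is_expired(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the TTL has elapsed; the portal then answers HTTP 410 Gone."""
    return (now or datetime.now(timezone.utc)) >= expires_at
```

Because 24 is a multiple of 3, the base64 encoding needs no padding, which is why the random part is always exactly 32 characters.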
Portal Row Level Security (Two-Phase Session)
Portal endpoints require a cross-tenant token lookup before RLS enforcement. A single-session approach would fail because the token-to-tenant mapping is not known before the first query. The solution is a two-phase session pattern:
- Phase 1 (token lookup): _get_case_by_token() uses the default session (not tenant-scoped) to find the case and determine its tenant_id.
- Phase 2 (tenant-scoped queries): All subsequent database operations use get_tenant_session(tenant_id), which sets the PostgreSQL RLS context and enforces row-level isolation.
This pattern is applied to all four customer-facing portal endpoints: get_portal_state, upload_document, submit_documents, and validate_document — each calls _get_case_by_token() (default session) and then switches to get_tenant_session(tenant_id). Even if a token collision occurred, subsequent queries would be scoped to the correct tenant.
Note: the database-level guarantee of this pattern depends on RLS actually being enforced. As documented under Multi-Tenancy, RLS is inert while the app runs as the PostgreSQL superuser, so cross-tenant isolation here currently rests on the application correctly selecting the tenant session, not on the database rejecting out-of-tenant rows.
See backend/app/api/portal.py.
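The control flow of the two-phase pattern can be shown with a toy, in-memory model. Everything here is hypothetical: the CASES and DOCUMENTS structures stand in for PostgreSQL tables, and tenant_session simulates the row filtering that RLS would perform; the real endpoints use _get_case_by_token() and get_tenant_session(tenant_id).

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List, Optional

# Toy data standing in for database tables (hypothetical).
CASES: Dict[str, Dict[str, str]] = {
    "pt_aaa": {"case_id": "c1", "tenant_id": "t1"},
    "pt_bbb": {"case_id": "c2", "tenant_id": "t2"},
}
DOCUMENTS: List[Dict[str, str]] = [
    {"case_id": "c1", "tenant_id": "t1", "name": "id.pdf"},
    {"case_id": "c2", "tenant_id": "t2", "name": "ubo.pdf"},
]


def get_case_by_token(token: str) -> Optional[Dict[str, str]]:
    """Phase 1: cross-tenant lookup, because the tenant is not known yet."""
    return CASES.get(token)


@contextmanager
def tenant_session(tenant_id: str) -> Iterator[List[Dict[str, str]]]:
    """Phase 2: yields only the rows visible to one tenant, as RLS would."""
    yield [d for d in DOCUMENTS if d["tenant_id"] == tenant_id]


def get_portal_documents(token: str) -> List[str]:
    case = get_case_by_token(token)
    if case is None:
        raise LookupError("unknown portal token")
    # Every query after the lookup runs inside the tenant-scoped session.
    with tenant_session(case["tenant_id"]) as visible:
        return [d["name"] for d in visible if d["case_id"] == case["case_id"]]
```

The point of the split is that only the single token lookup runs without a tenant scope; everything that reads or writes case data happens after the tenant is known.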
PEPPOL API Key Protection
The PEPPOL verification API key (PEPPOL_API_KEY) is a server-side secret and must never reach the browser.
Before the hardening sweep: the key was exposed as NEXT_PUBLIC_PEPPOL_API_KEY, which Next.js inlines into the client bundle at build time.
After the hardening sweep (C4):
- NEXT_PUBLIC_PEPPOL_API_KEY is removed from the frontend entirely
- A Next.js Route Handler at frontend/src/app/api/peppol/verify/route.ts proxies PEPPOL verification requests server-side
- The route handler reads PEPPOL_API_KEY from the server environment (never exposed to the browser)
- The frontend calls /api/peppol/verify (its own Next.js backend) instead of the external PEPPOL API directly
This eliminates the API key from all browser bundles, network inspector views, and _next/static build artifacts.
PEPPOL API Key Authentication (Officer-Facing)
The PEPPOL verification service for officer-initiated lookups uses API key authentication:
- API keys stored in the peppol_api_keys PostgreSQL table
- Keys validated on every request via header: X-API-Key
- Each key has a rate_limit_per_minute setting (default: 100)
- Keys can be deactivated via the active boolean flag
IP-Based Rate Limiting
A sliding-window rate limiter (app/api/deps/rate_limiter.py) protects all endpoints:
| Context | Rate Limit | Key |
|---|---|---|
| Authenticated requests | 100 requests/minute | IP address |
| Unauthenticated requests | 20 requests/minute | IP address |
| PEPPOL API (per key) | Configurable per API key | API key |
The implementation uses an in-memory sliding window, so counters are kept per process and are lost on restart. A production deployment would use Redis or a managed API gateway for distributed rate limiting.
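An in-memory sliding window of this kind can be sketched as follows. This is a generic illustration of the technique, not the contents of app/api/deps/rate_limiter.py; the SlidingWindowLimiter class and its injectable clock are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """In-memory sliding-window limiter keyed by a string such as an IP address."""

    def __init__(
        self,
        limit: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

With limit=20 this matches the unauthenticated row of the table and with limit=100 the authenticated row. Because the state lives in a per-process dictionary, several worker processes would each enforce the limit separately.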
Dynamic CORS Configuration
CORS origins are configurable via the CORS_ORIGINS environment variable:
# Development (default)
CORS_ORIGINS=http://localhost:3001
# Production (multiple origins)
CORS_ORIGINS=https://app.trustrelay.com,https://portal.trustrelay.com
This replaces the previous hardcoded localhost:3001 configuration.
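Turning the comma-separated value into a list of origins might look like this. The parse_cors_origins helper is hypothetical, and trimming whitespace and trailing slashes is an assumption about how lenient the parser should be.

```python
from typing import List

DEFAULT_CORS_ORIGINS = "http://localhost:3001"  # development default


def parse_cors_origins(raw: str = DEFAULT_CORS_ORIGINS) -> List[str]:
    """Split a comma-separated CORS_ORIGINS value into a clean list."""
    # Browsers send Origin without a trailing slash, so normalize it away.
    origins = [o.strip().rstrip("/") for o in raw.split(",")]
    return [o for o in origins if o]
```

The resulting list is the shape that CORS middleware typically expects for its allowed-origins setting.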
File Upload Validation
Document uploads through the portal include:
- Content-type validation against allowed MIME types
- File size limits enforced by the web server
- Files stored in MinIO with case-scoped prefixes, preventing cross-case access
- Documents are processed by Docling in a sandboxed conversion pipeline
Evidence Chain Integrity
The Belgian evidence service implements a SHA-256 hashing chain for tamper detection:
- Per-source hashing -- Each data source response (KBO, Gazette, NBB, Inhoudingsplicht) is hashed individually
- Bundle hashing -- Individual source hashes are combined into a single bundle hash
- Dual persistence -- Raw data stored in both PostgreSQL (belgian_evidence table) and MinIO (JSON archive)
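import json
from hashlib import sha256
# Per-source hash (sort_keys=True gives a deterministic serialization)
source_hash = sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()
# Stored as: "sha256:{hexdigest}"
# Bundle hash over the collected per-source hashes
bundle_hash = sha256(json.dumps(source_hashes, sort_keys=True).encode()).hexdigest()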
The PEPPOL verification service uses the same pattern with its own EvidenceService.
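A self-contained version of the chain, with a verification step, could look like the sketch below. The function names are hypothetical, and prefixing the bundle hash with "sha256:" is an assumption (the source states that prefix only for per-source hashes).

```python
import hashlib
import json
from typing import Any, Dict


def source_hash(data: Any) -> str:
    """Hash one source response; sort_keys makes the JSON deterministic."""
    digest = hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()
    return f"sha256:{digest}"


def bundle_hash(source_hashes: Dict[str, str]) -> str:
    """Combine the per-source hashes into a single bundle hash."""
    digest = hashlib.sha256(
        json.dumps(source_hashes, sort_keys=True).encode()
    ).hexdigest()
    return f"sha256:{digest}"


def verify_bundle(sources: Dict[str, Any], expected_bundle: str) -> bool:
    """Recompute the chain from raw data and compare with the stored hash."""
    recomputed = bundle_hash({name: source_hash(d) for name, d in sources.items()})
    return recomputed == expected_bundle
```

Any change to a single source response changes its per-source hash and therefore the bundle hash, so verification needs only the raw archive and the stored bundle hash.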
PII Protection in Logs
KYC pipeline services follow a strict rule: log only case identifiers and non-PII metadata. Personally identifiable information — including person_name, date of birth, and national registration numbers — must not appear in log output.
This is enforced in backend/app/services/kyc_screening.py and backend/app/services/identity_verification.py. For example, the KYC screening log line records only case_id and nationality (an attribute that does not identify a person on its own), never the person's name.
Regulatory basis: GDPR Article 25 (data protection by design and by default).
PII Classification & Encryption at Rest
Trust Relay classifies every PII field in the database with one of six categories (ADR-0036). Fields in the DIRECT_IDENTIFIER, FINANCIAL, and CONTACT categories are encrypted at rest using AES-256-GCM via the EncryptedText SQLAlchemy TypeDecorator. The encryption is transparent to application code -- reads and writes happen with plaintext strings.
Encrypted columns:
- users.email (CONTACT)
- investigation_accounts.iban (FINANCIAL)
- investigation_accounts.account_number (FINANCIAL)
JSONB PII encryption:
- investigation_persons.identification -- document numbers encrypted within JSON
- investigation_persons.phones -- phone numbers encrypted within JSON
- investigation_persons.emails -- email addresses encrypted within JSON
Search hashes: HMAC-SHA256 hashes enable equality lookups (WHERE email_hash = ?) without decrypting every row. The pepper is a separate secret from the encryption key.
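A search hash of this kind can be computed with the standard library alone. The search_hash function is a hypothetical name, and normalizing case and whitespace before hashing is an assumption; the source specifies only HMAC-SHA256 with a separate pepper.

```python
import hashlib
import hmac


def search_hash(value: str, pepper: bytes) -> str:
    """Keyed hash used for equality lookups on an encrypted column."""
    # Normalization (assumed) makes lookups match regardless of case or padding.
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(pepper, normalized, hashlib.sha256).hexdigest()
```

At query time the application would hash the search term with the same pepper and compare it to the stored column, as in WHERE email_hash = ?. Keying the hash with a pepper means an attacker who obtains the table cannot confirm guessed values without that secret.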
GDPR Data Subject Requests: Three API endpoints at /api/data-subject/ handle access (Art. 15), erasure (Art. 17), and rectification (Art. 16) with AML 5-year retention rules. See PII Classification & Encryption for full details.
Mock Mode Flags — Production Safety
Every external integration has a corresponding *_mock_mode boolean flag in app/config.py. In production, all flags default to False. If a flag is False and the real service is unavailable, the service raises NotImplementedError — preventing silent mock-in-production behavior.
| Flag | Service | PoC default |
|---|---|---|
| osint_mock_mode | OSINT investigation pipeline | false |
| kyc_mock_mode | Identity verification + KYC screening | false |
| peppol_mock_mode | PEPPOL/inhoudingsplicht API | false |
| kbo_mock_mode | Belgian KBO registry | false |
| nbb_mock_mode | National Bank of Belgium | false |
| gazette_mock_mode | Belgian Official Gazette | false |
| crawl4ai_mock_mode | Web scraping | false |
| graph_mock_mode | Neo4j knowledge graph | false |
| letta_mock_mode | Letta compliance memory | false |
| brightdata_mock_mode | BrightData scraping | false |
| northdata_mock_mode | NorthData enrichment | false |
| mcc_mock_mode | MCC classification | false |
Set a flag to true in .env or CI to enable offline testing. Never rely on the absence of a live service to activate mock mode silently.
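The fail-loud rule can be sketched for a single integration. The lookup_company function, its parameters, and the shape of the mock payload are hypothetical; only the kbo_mock_mode flag name and the use of NotImplementedError come from the text above.

```python
from typing import Any, Callable, Dict, Optional


def lookup_company(
    enterprise_number: str,
    kbo_mock_mode: bool,
    real_client: Optional[Callable[[str], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Hypothetical service entry point guarded by an explicit mock flag."""
    if kbo_mock_mode:
        # Mock data is returned only when the flag is explicitly enabled.
        return {"enterprise_number": enterprise_number, "source": "mock"}
    if real_client is None:
        # No live service and no mock flag: fail loudly instead of faking data.
        raise NotImplementedError("KBO client unavailable and kbo_mock_mode is false")
    return real_client(enterprise_number)
```

The key property is that there is no code path from "service unavailable" to "mock data" unless the flag was set deliberately.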
SSRF Protection
Website URL input is validated before scraping:
- URL scheme validation (must be HTTP/HTTPS)
- Domain parsing via urllib.parse
- Scraping runs through crawl4ai which has its own URL validation
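The scheme and domain checks can be sketched with urllib.parse. The validate_website_url function is a hypothetical name. The sketch covers only the two checks listed above; it does not block private or loopback addresses, which the source does not describe.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def validate_website_url(url: str) -> str:
    """Return the hostname if the URL passes the basic checks, else raise."""
    parsed = urlparse(url.strip())
    if parsed.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported URL scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no hostname")
    return parsed.hostname
```

Rejecting everything except HTTP and HTTPS rules out schemes such as file: and ftp: before the URL is handed to the scraper.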
Production Security Enhancements
The following enhancements are planned for production hardening, except where a status line marks an item as already implemented:
PII Classification
Status: Implemented (ADR-0036). The system now classifies 20 PII fields across 5 tables with a 6-category taxonomy. DIRECT_IDENTIFIER, FINANCIAL, and CONTACT fields are encrypted at rest using AES-256-GCM. Three GDPR DSR endpoints handle access, erasure, and rectification with AML retention rules. See PII Classification & Encryption for the full architecture.
Enhanced Audit Trail
Authentication is implemented (JWT with JWKS). The next phase will extend audit logging to include:
- Per-request audit logging with caller identity
- Document download events
- Configuration change tracking
Secret Management
API keys are stored in .env files for local development. Production deployment will use a managed secret store (AWS Secrets Manager, HashiCorp Vault, or Kubernetes secrets).
Production Security Roadmap
| Phase | Items | Status |
|---|---|---|
| Phase 1 (Pre-production) | Officer authentication (OIDC), token expiry, CORS configuration | Done (Phase 4 remediation) |
| Phase 2 (Launch) | Rate limiting, audit logging with identity | Partially done (rate limiting complete, audit logging pending full auth integration) |
| Phase 3 (Post-launch) | PII classification, data encryption at rest, GDPR compliance, secret management | PII + encryption + GDPR DSR done (ADR-0036). Secret management planned |
| Phase 4 (Maturity) | SOC 2 Type II preparation, penetration testing, security monitoring | Planned |
Security Configuration
All security-related configuration is managed via environment variables through pydantic-settings:
| Variable | Purpose | Default |
|---|---|---|
| AUTH_JWKS_URL | JWKS endpoint for Keycloak JWT validation | (Keycloak default) |
| AUTH_ISSUER | Expected JWT issuer claim | (Keycloak realm URL) |
| AUTH_AUDIENCE | Expected JWT audience claim | (Keycloak client ID) |
| PORTAL_TOKEN_TTL_DAYS | Portal token expiry period | 30 |
| CORS_ORIGINS | Comma-separated allowed CORS origins | http://localhost:3001 |
| PEPPOL_API_KEY | Server-side PEPPOL API key (never NEXT_PUBLIC_) | (empty) |
| PEPPOL_MOCK_MODE | Disables real PEPPOL API calls | false |
| KYC_MOCK_MODE | Disables real identity verification and KYC screening | false |
| BRIGHTDATA_API_TOKEN | BrightData MCP authentication | (empty) |
| TAVILY_API_KEY | Tavily search API authentication | (empty) |
| OPENAI_API_KEY | LLM API authentication | (empty) |
| MINIO_ACCESS_KEY / MINIO_SECRET_KEY | Object storage credentials | minioadmin |
| DATABASE_URL | PostgreSQL connection string | (local default) |
Development-only endpoints (/api/test/mock-modes) allow mock modes to be toggled at runtime. They are available only in the development environment and are disabled in production deployments.