Compliance Utility

Version: 1.0
Last Updated: May 2026
Reference Implementation: Conflict Management Platform (legal-tech compliance and conflict-of-interest detection)


Component Overview

The Compliance utility is a domain-specific application built on the DataFab core framework that combines six capabilities into a single operational surface for legal and regulatory compliance teams: graph-based conflict-of-interest detection, schema management for OSINT-driven entity research, BPM-defined compliance processes, rules-driven risk-level assessment, adverse media checks, and screening against built-in and tenant-specific watchlists. Every capability is tenant-scoped, every decision is backed by a chain of audit records, and every external service consumed by the utility (the screening service, the OSINT enrichment service, the schema service) is reachable through declared, governed integration points.

The utility is what compliance teams use to onboard a new client, screen a new case, detect conflicts of interest, run risk assessment, trigger the right downstream process, and maintain a defensible audit trail.

Capabilities:

Capability | Description
Graph Conflict Detection | A per-tenant graph database identifies paths between entities up to a configurable depth; conflicts are persisted with full path data, parent context, and reviewer status
OSINT Schema Management | Tenant-defined schemas govern the shape of client and entity data ingested via OSINT and external sources, with field-path flattening for downstream rule binding
BPM-Defined Compliance Processes | Tenant-specific BPMN XML processes stored centrally and bound to risk thresholds; the right process is triggered automatically by the assessed risk level
Risk Rules | Attribute-path-based conditional rules (string / number / logical operators) produce per-rule weights, summed and capped at 100 to yield a final risk score
Risk Thresholds | Per-tenant LOW / MEDIUM / HIGH / VERY_HIGH bands map the score to a risk level and to the BPM process to trigger
Adverse Media Checks | Case-party entities screened against adverse-media indices via the Screening Service with confidence-scored hits
Watchlist Screening | Built-in watchlist catalogue plus tenant-selected subsets per check; case-party hits persisted with reviewer-status workflow
Tenant-Specific Screening Lists | Tenants select which watchlists to apply per case or per onboarding event; scoping prevents cross-tenant leakage
Onboarding Enrichment | Combined enrichment of people and case parties at onboarding, with optional screening and conflict detection
Audit & Review | Reviewer status (relevant / irrelevant), notes, reviewer ID and timestamp on every conflict and screening hit

Connection to the DataFab Platform Value Proposition

The Compliance utility is the integration point at which several DataFab core primitives meet a single operational goal: deciding whether the platform can act for a client, on a case, on a matter — without conflict and within compliance.

DataFab Primitive | Role in the Compliance Utility
Knowledge Fabric › Knowledge Graph & Entity Resolution (03) | The graph that conflict detection traverses; entity-resolved Persons, Companies, Cases
Knowledge Fabric › OSINT Integration (03) | OSINT search service used to enrich client and counterparty entities
Schema Management (09) | Tenant schemas that define the shape of every artefact (clients, rules, BPM, compliance checks)
Studio › DDA Pipelines & Chain of Agents (04) | Async enrichment, screening, and compliance-check pipelines
Studio › Dialog Playbooks & Self-Awareness (04) | Compliance-officer dialog over the catalog, rules, decisions, and traces
Graph Operations › Rule Engine (08) | Risk rules evaluating client attributes against conditions to assign risk weights
Graph Operations › Graph Workflows / BPM (08) | BPMN processes bound to risk thresholds and triggered by compliance-check outcomes
Graph RAG › Text-to-Cypher Rules (13) | Optional graph-querying inside conflict and linked-party rules
AI/LLM Layer (05) | Provider-agnostic LLM access for OSINT entity extraction

Architecture Overview

┌──────────────────────────────────────────────────────────────────────┐
│                COMPLIANCE UTILITY                                    │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    EXTERNAL SOURCES                            │  │
│  │  Customer / Onboarding Systems │ OSINT │ Watchlists            │  │
│  │  Adverse Media Indices │ Bank / Org Reference Data             │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    PLATFORM INTEGRATIONS                       │  │
│  │  Essential Knowledge Fabric API │ Organization Management API  │  │
│  │  Screening Service │ LLM Router │ OSINT Search Service         │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    SCHEMA REGISTRY                             │  │
│  │  Tenant-defined OSINT schemas; flattened field paths;          │  │
│  │  bound to clients, rules, BPM, compliance checks               │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    KNOWLEDGE GRAPH                             │  │
│  │  Person │ Company │ Case nodes │ Dynamic relationships         │  │
│  │  Per-tenant graph; configurable conflict depth (default 3)     │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    ASYNC PIPELINES                             │  │
│  │  search_entity_data │ company_enrichment │ case_enrichment     │  │
│  │  onboarding_enrichment │ compliance_check                      │  │
│  │  weekly_conflict_report                                        │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    RULES + RISK SCORING                        │  │
│  │  Attribute-path conditions │ Per-rule weights (0–100)          │  │
│  │  Sum capped at 100 → Final risk score                          │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    RISK THRESHOLDS                             │  │
│  │  LOW / MEDIUM / HIGH / VERY_HIGH bands per tenant              │  │
│  │  Each band binds to a Business Process (BPMN XML)              │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    BUSINESS PROCESS TRIGGER (BPM)              │  │
│  │  Tenant BPMN XML executes the right operational workflow       │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    SCREENING & CONFLICT REVIEW                 │  │
│  │  Case-party adverse-media + watchlist hits                     │  │
│  │  Conflict paths from the graph; reviewer status workflow       │  │
│  └────────────────────────────────────────────────────────────────┘  │
│                              │                                       │
│  ┌────────────────────────────────────────────────────────────────┐  │
│  │                    COMPLIANCE OFFICER DIALOG (Playbook)        │  │
│  │  list_clients │ explain_decision │ rule_deep_dive │ hits       │  │
│  │  conflicts │ thresholds │ schema introspection                 │  │
│  └────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────┘

Domain and Schemas

The utility’s domain is centred on Clients (people or companies the tenant transacts with), Cases (matters with parties), and the graph relationships that connect them.

Entity | Description
Client | A person or company associated with the tenant; carries a schema-bound JSON data record
Case | A matter with parties; triggers screening and conflict detection
Party | Person or company on a case (case_screening targets)
Person | Graph node: individual entity
Company | Graph node: organisation entity
Conflict | Detected path between two entities, persisted with path data and reviewer status
Compliance Check | Persisted assessment record with risk score, threshold band, triggered BPM, status
Business Process | Tenant BPMN XML defining a downstream operational workflow
Rule | Risk-scoring rule operating on a schema attribute path
Risk Threshold | Tenant-defined band mapping score → risk level → BPM
Screening Result | Per-party hit record with similarity, match type, source watchlist, reviewer status

Relationship | Description
EMPLOYED_BY | Person → Company
OWNS | Person → Company
INVESTED_IN | Person / Company → Company
REPRESENTS | Person → Person / Company
PARTY_TO | Person / Company → Case
RELATED_TO | Generic catch-all

Schemas are managed centrally through the Essential Knowledge Fabric API. Every compliance artefact (client, rule, business process, compliance check) carries a schema_uuid and schema_name so that rules and processes always operate on known data shapes.


Six Core Capabilities

This section describes the six capabilities the utility delivers, framed against the corresponding DataFab core primitives.

1. Graph-Based Conflict Detection

Purpose. Identify potential conflicts of interest by finding paths between two entities (e.g., a prospective client and an existing case party, two opposing parties on a case, a director and a counterparty) within a configurable graph depth.

Aspect | Detail
Engine | Per-tenant graph, named by organisation
Default Depth | 3 hops (configurable)
Trigger | Post-enrichment of any new client, case, or party
Output | Persisted Conflict record with source / target node names, parent node names + types, path data (JSONB), path length, node-name chain
Reviewer Workflow | Status transitions: detected → relevant / irrelevant with reviewer, timestamp, optional notes and summary
Tenant Scoping | Each organisation has its own graph; cross-tenant traversal is impossible by construction
Pool / Limits | Connection pool size 16; query timeouts (60s normal, 30 minutes for full-conflict-detection sweeps)

Conflict record fields: source_node_name, source_parent_node_name, source_parent_node_type, target_node_name, target_parent_node_name, target_parent_node_type, path_data (JSONB), path_length, node_name_chain, status, summary, notes, reviewed_by, reviewed_at.

Conflict detection runs as part of every enrichment pipeline (company, case, onboarding) and is also runnable on demand. A weekly aggregated conflict report is produced for each tenant.
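The depth-bounded path search described above can be illustrated with a minimal in-memory sketch. It is not the platform's graph-database query: the edge-triple format, the undirected traversal, the function names `find_conflict_paths` and `to_conflict_record`, and the " -> " chain format are all assumptions made for illustration. Only the field names in the resulting record come from the conflict record fields listed above.

```python
from collections import deque


def find_conflict_paths(edges, source, target, max_depth=3):
    """Return every simple path from source to target with at most max_depth hops.

    edges is an iterable of (node_a, relationship, node_b) triples. Relationships
    are traversed in both directions, which is an assumption of this sketch.
    """
    adjacency = {}
    for a, _rel, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target and len(path) > 1:
            paths.append(path)
            continue
        if len(path) - 1 >= max_depth:  # hop budget exhausted
            continue
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in path:  # keep paths simple (no revisits)
                queue.append(path + [neighbour])
    return paths


def to_conflict_record(path):
    """Build a partial conflict record (hypothetical subset of the documented fields)."""
    return {
        "source_node_name": path[0],
        "target_node_name": path[-1],
        "path_length": len(path) - 1,
        "node_name_chain": " -> ".join(path),
        "status": "detected",
    }


# Illustrative data only; names are invented.
edges = [
    ("Alice", "EMPLOYED_BY", "Acme Ltd"),
    ("Bob", "OWNS", "Acme Ltd"),
    ("Bob", "PARTY_TO", "Case 42"),
    ("Carol", "PARTY_TO", "Case 42"),
]
records = [to_conflict_record(p) for p in find_conflict_paths(edges, "Alice", "Case 42")]
```

With the default depth of 3, Alice reaches Case 42 through Acme Ltd and Bob, but Carol (four hops away) is out of range until `max_depth` is raised.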

2. Schema Management for OSINT

Purpose. Let each tenant declare the entity shapes the platform should ingest through OSINT and external sources, so that downstream rules, BPM, and compliance checks all operate on consistent, governed data.

Aspect | Detail
Authority | Essential Knowledge Fabric (EKF) API (see 09-Schema-Management)
Tenant Binding | Schemas are per-organisation; EKF calls scoped by _type: "organization"
Field Paths | EKF returns flattened paths (e.g., address.country) for use in rule attribute selection
Bound Artefacts | Every client, rule, business_process, and compliance_check row carries schema_uuid + schema_name
Authentication | Bearer token (Authorization: Bearer ...)
Resilience | Local mock fallback for development (e.g., mock_client.json, mock_company.json)

Why this matters. A rule cannot reference an attribute the active schema does not declare. Adding a new property to a tenant’s compliance surface is a schema operation, not a code change.
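As an illustration of field-path flattening and of the check that a rule may only reference a declared attribute, the following sketch flattens a nested schema into dot-notation paths. The nested-dict schema representation and the helper names `flatten_field_paths` and `validate_rule_attribute` are assumptions; EKF's actual schema format and API are not shown in this document.

```python
def flatten_field_paths(schema, prefix=""):
    """Flatten a nested field mapping into dot-notation paths (e.g. address.country)."""
    paths = []
    for name, value in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            paths.extend(flatten_field_paths(value, path))
        else:
            paths.append(path)
    return paths


def validate_rule_attribute(attribute, schema):
    """Reject a rule attribute that the active schema does not declare."""
    if attribute not in flatten_field_paths(schema):
        raise ValueError(f"attribute {attribute!r} is not declared by the schema")


# Hypothetical tenant schema used only for illustration.
client_schema = {
    "name": "string",
    "address": {"country": "string", "city": "string"},
    "pep_status": "boolean",
}

validate_rule_attribute("address.country", client_schema)  # passes silently
```

Under this sketch, making `address.postcode` available to rules means adding it to the schema; the validation code itself does not change.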

3. BPM-Defined Compliance Processes

Purpose. Externalise the operational workflow that should run after a compliance decision (file SAR, request EDD, escalate to Senior Compliance, decline, refer) so that tenants can adjust process logic without code changes.

Aspect | Detail
Format | BPMN XML stored in object storage, referenced by signed URL
Storage Field | business_processes.xml_url
Tenant Scoping | Unique (name, organization_uuid) constraint
Lifecycle | enabled flag (default true); created_by user audit
Schema Binding | schema_uuid + schema_name (so the BPM operates on the same data shape as the rules that triggered it)
Trigger | A Risk Threshold row binds a risk level to a business_process_uuid; the compliance pipeline triggers the bound BPM

Why this matters. The platform separates what to assess (rules + thresholds) from what to do about it (BPM). Tenants in different jurisdictions or business lines run different BPMs from the same rule set.

4. Risk Rules and Risk-Level Assessment

Purpose. Convert a client’s attributes (e.g., address.country, business.industry, pep_status) into a numeric risk score by evaluating governed conditions.

Aspect | Detail
Attribute | A dot-notation path into the client’s schema-bound data (e.g., address.country)
Conditions | A list of typed ConditionItems: STRING, NUMBER, or LOGICAL, each carrying an operator, value, weight (0–100), and inter-condition logical operator (AND / OR)
Else Weight | Fallback weight applied when no condition matches
Aggregation | Per-rule weights are summed across enabled rules; the sum is capped at 100 to produce the final risk score
Tenant Scoping | Unique (name, organization_uuid); rules also scoped by schema_uuid
Audit | created_by_id and updated_by_id foreign keys to users
Engine | RiskWeightCalculatorService evaluates rules in memory

The Compliance utility’s rule model fits cleanly into the Graph Operations › Rule Engine — Pattern Scoring + Threshold rule types. Risk rules can be promoted into a DAG chain when a tenant’s compliance flow needs the chain-trace audit pattern (see Graph Operations › DAG Rules).
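The aggregation described above (per-rule weight, else weight, sum capped at 100) can be sketched as follows. This is not the RiskWeightCalculatorService: the class names, the operator names in `OPERATORS`, and the "first matching condition wins" semantics are assumptions, and the inter-condition AND / OR operator is deliberately left out to keep the sketch short.

```python
import operator
from dataclasses import dataclass

# Hypothetical operator names; the platform's actual operator set is not listed here.
OPERATORS = {
    "EQUALS": operator.eq,
    "NOT_EQUALS": operator.ne,
    "GREATER_THAN": operator.gt,
    "LESS_THAN": operator.lt,
}


@dataclass
class Condition:
    op: str
    value: object
    weight: int  # 0-100


@dataclass
class Rule:
    attribute: str  # dot-notation path, e.g. "address.country"
    conditions: list
    else_weight: int = 0
    enabled: bool = True


def resolve(data, path):
    """Follow a dot-notation path into nested dicts; return None if any key is missing."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def rule_weight(rule, client):
    """Weight of the first matching condition, otherwise the rule's else weight."""
    actual = resolve(client, rule.attribute)
    if actual is not None:
        for cond in rule.conditions:
            if OPERATORS[cond.op](actual, cond.value):
                return cond.weight
    return rule.else_weight


def risk_score(rules, client):
    """Sum the weights of enabled rules and cap the total at 100."""
    return min(sum(rule_weight(r, client) for r in rules if r.enabled), 100)


# Illustrative rules; "XX" stands in for a country code a tenant treats as high risk.
rules = [
    Rule("address.country", [Condition("EQUALS", "XX", 60)], else_weight=5),
    Rule("pep_status", [Condition("EQUALS", True, 50)]),
    Rule("business.industry", [Condition("EQUALS", "gambling", 30)], enabled=False),
]
```

A client with country "XX" and `pep_status` true scores 60 + 50 = 110, which the cap reduces to 100. A client matching neither condition scores only the else weight of 5.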

5. Risk Thresholds and BPM Binding

Purpose. Map the 0–100 risk score onto a discrete risk level, and from that risk level onto the BPMN process the platform should run.

Risk Level | Default Band | Typical BPM
LOW | 0–25 | Auto-approve / simplified workflow
MEDIUM | 26–50 | Standard CDD / review
HIGH | 51–75 | Enhanced due diligence (EDD)
VERY_HIGH | 76–100 | Senior compliance review / escalate / decline

Threshold rules:

Constraint | Detail
Range | 0 ≤ min_weight ≤ 100, 0 ≤ max_weight ≤ 100, min_weight < max_weight
Uniqueness | Unique (organization_uuid, risk_level)
BPM Link | Optional business_process_uuid per band

The compliance pipeline reads the calculated score, finds the threshold band whose range contains the score, fetches the bound business_process_uuid, and triggers the BPM. The decision (band + BPM ID) is persisted on the compliance check.
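The band lookup and the range constraints can be sketched in a few lines. The `RiskThreshold` dataclass, the `resolve_band` helper, and the `bpm-...` identifiers are hypothetical. The sketch also assumes integer scores and bands that are inclusive at both ends, which matches the default bands above but is not stated explicitly for custom bands.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RiskThreshold:
    risk_level: str
    min_weight: int
    max_weight: int
    business_process_uuid: Optional[str] = None

    def __post_init__(self):
        # Mirrors the documented constraints: both bounds in 0..100 and min < max.
        if not (0 <= self.min_weight < self.max_weight <= 100):
            raise ValueError(f"invalid band for {self.risk_level}")


# Default bands from the table above; the BPM identifiers are placeholders.
DEFAULT_BANDS = [
    RiskThreshold("LOW", 0, 25, "bpm-auto-approve"),
    RiskThreshold("MEDIUM", 26, 50, "bpm-standard-cdd"),
    RiskThreshold("HIGH", 51, 75, "bpm-edd"),
    RiskThreshold("VERY_HIGH", 76, 100, "bpm-senior-review"),
]


def resolve_band(score, thresholds):
    """Return the threshold whose inclusive range contains the score, or None."""
    for threshold in thresholds:
        if threshold.min_weight <= score <= threshold.max_weight:
            return threshold
    return None


band = resolve_band(62, DEFAULT_BANDS)
decision = {"risk_level": band.risk_level, "business_process_uuid": band.business_process_uuid}
```

Here a score of 62 resolves to the HIGH band and its bound process. A band with no `business_process_uuid` would yield a decision with no BPM to trigger, since the link is optional.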

6. Adverse Media and Watchlist Screening

Purpose. Screen case parties against adverse-media indices and sanctions / PEP / watchlist sources, with reviewer triage of hits.

Screening flow:

Step | Action
1 | Compliance team enables screening on a case (or onboarding event) and selects the watchlists to apply
2 | The Screening Service is called per party (entity_name, entity_type, watchlists[])
3 | Hits are returned indexed by matcher with match type, confidence level, and score
4 | Each hit is persisted as a case_screening_results row with similarity, match type, index name, organisation scope
5 | Reviewers triage hits, marking them relevant or irrelevant, with optional notes

Screening result fields: entity_name, case_uuid, organization_uuid, similarity (0–100), match_type, index_name (watchlist source), status (relevant / irrelevant), notes, reviewed_by, reviewed_at.

Watchlist catalogue. The Screening Service exposes the available watchlists at WATCHLIST_BASE_URL/v1/api/list/filter_list/{page}/{limit}. The catalogue lists each watchlist version and its record count, paginated for tenant selection.

Tenant-Specific Screening Lists

Watchlists are catalogued globally by the Screening Service so that the platform can pool feeds (sanctions, PEP, adverse-media) once and offer them to all tenants. Selection is tenant-specific: every screening request explicitly lists the watchlists the tenant has chosen for that check. The platform never broadcasts a request across the whole catalogue. Three layered controls deliver tenant scoping:

Layer | Behavior
Catalogue | Global list of available watchlists with version + record count, retrieved on demand
Tenant Configuration | The tenant nominates the subset of watchlists eligible for compliance use; these are the only watchlists offered to compliance officers in playbook intents
Per-Check Selection | The screening request carries the explicit watchlists[] to apply for this specific case or onboarding event

This pattern lets one tenant prioritise OFAC + UK HMT + EU sanctions while another prioritises a sector-specific adverse-media feed, without operating distinct screening services.
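A minimal sketch of the per-check selection layer follows. The payload keys (entity_name, entity_type, watchlists) come from the screening flow above. The function name `build_screening_requests`, the watchlist identifiers, and the decision to reject the request outright rather than silently drop unapproved lists are assumptions.

```python
def build_screening_requests(parties, selected_watchlists, tenant_watchlists):
    """Build one screening payload per party, restricted to tenant-approved watchlists.

    parties is a list of (entity_name, entity_type) pairs.
    """
    if not selected_watchlists:
        # Never fall back to the whole catalogue when nothing is selected.
        raise ValueError("a screening request must list at least one watchlist")
    not_allowed = sorted(set(selected_watchlists) - set(tenant_watchlists))
    if not_allowed:
        raise ValueError(f"watchlists not enabled for this tenant: {not_allowed}")
    return [
        {
            "entity_name": name,
            "entity_type": entity_type,
            "watchlists": list(selected_watchlists),
        }
        for name, entity_type in parties
    ]


# Hypothetical identifiers for illustration only.
tenant_watchlists = ["OFAC_SDN", "UK_HMT", "EU_SANCTIONS"]
requests = build_screening_requests(
    [("Acme Ltd", "company"), ("Jane Doe", "person")],
    ["OFAC_SDN", "UK_HMT"],
    tenant_watchlists,
)
```

Failing on an empty or unapproved selection keeps the "never broadcast across the whole catalogue" property explicit at the point where the request is built.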


Async Pipelines (Compliance Check Lifecycle)

The Compliance utility is implemented as a set of async pipelines orchestrated by the platform’s async-task runner; each pipeline produces persistent records and progress events.

Pipeline | Trigger | Description
search_entity_data | New client created | Fetches client data from the Knowledge Fabric for the bound schema, stages to object storage, hands off to compliance_check
compliance_check | Client data ready | Loads client data; evaluates enabled rules for the bound schema; aggregates weights into a 0–100 score; resolves risk threshold; creates a compliance check record; triggers the bound BPM
company_enrichment | New company entity | OSINT search + LLM extraction → graph node creation / merge → conflict detection
case_enrichment | New case | Party-by-party enrichment; optional case-party screening with explicit watchlists; conflict detection
onboarding_enrichment | Onboarding event | Combined people + parties enrichment; optional screening; conflict detection
weekly_conflict_report | Scheduled (weekly) | Aggregates conflicts across the tenant for the past 7 days; produces summary; sends notifications
daily_logs_cleanup | Scheduled (daily) | Prunes old log entries per retention

Per-pipeline progress is broadcast over the platform’s event bus and surfaced over WebSocket to the analyst UI.
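The hand-off from search_entity_data to compliance_check, with progress events emitted along the way, might look like this in outline. It uses plain asyncio with an in-process queue standing in for the event bus. The event shape, the stubbed client data, and the placeholder score are assumptions, and the platform's actual task runner is not shown.

```python
import asyncio


async def publish(events, pipeline, stage):
    """Emit a progress event; an asyncio.Queue stands in for the event bus."""
    await events.put({"pipeline": pipeline, "stage": stage})


async def compliance_check(client_data, events):
    await publish(events, "compliance_check", "started")
    # Rule evaluation, threshold resolution and the BPM trigger would happen here.
    check = {"client_uuid": client_data["client_uuid"], "score": 0, "status": "completed"}
    await publish(events, "compliance_check", "completed")
    return check


async def search_entity_data(client_uuid, events):
    await publish(events, "search_entity_data", "started")
    # Stand-in for fetching client data and staging it to object storage.
    client_data = {"client_uuid": client_uuid, "address": {"country": "GB"}}
    await publish(events, "search_entity_data", "completed")
    return await compliance_check(client_data, events)  # hand-off


async def run(client_uuid):
    events = asyncio.Queue()
    check = await search_entity_data(client_uuid, events)
    progress = []
    while not events.empty():
        progress.append(events.get_nowait())
    return check, progress


check, progress = asyncio.run(run("client-1"))
```

In this outline a subscriber such as a WebSocket bridge would drain the queue as events arrive, rather than collecting them at the end as `run` does here.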


Decision Outputs

Output | Description
Compliance Check Record | Final score, risk level, triggered BPM ID, status, schema reference
Conflict Records | Per-conflict rows with path data, parent context, reviewer status
Screening Results | Per-hit rows with similarity, match type, index name, reviewer status
Triggered BPM Run | Reference to the BPMN process triggered by the assessed risk level
Audit Events | Per-step audit records on every persisted decision

Every output is schema-bound, tenant-scoped, and traceable to its triggering pipeline.


Dialog Playbook

The Compliance playbook (see Dialog › Playbook-Based Communication) governs the analyst conversation surface for compliance officers and reviewers.

Intents

Intent | Slots | Tools | Risk Class
list_clients | schema, status, risk_level, limit | get_clients | Low
client_status | client_uuid | get_client | Low
list_cases | status, schema, limit | get_cases | Low
list_hits | case_uuid, status | get_screening_results | Low
list_conflicts | client_uuid or case_uuid, status | get_conflicts | Low
explain_decision | compliance_check_uuid | compliance check + rule trace | Medium
rule_deep_dive | rule_uuid | get_rule | Low
threshold_overview | (none) | get_thresholds | Low
available_watchlists | (none) | get_sanctions_watchlist | Low
tenant_watchlists | (none) | tenant configuration lookup | Low
bpm_for_band | risk_level | get_business_process | Low
schema_introspection | schema_uuid | describe_schema, flattened_fields | Low
conflict_review | conflict_uuid, status, notes | update_conflict | Medium
hit_review | result_uuid, status, notes | update_screening_result | Medium
weekly_report | (none) | latest weekly conflict aggregate | Low

Persona and Guardrails

Setting | Default
Persona | Concise, evidence-first; tables and case cards preferred; never fabricates rule outcomes, hits, or conflicts
Refuse-List | No legal advice; no override of risk-threshold bands; no bypass of HITL gates on conflict / hit review
Escalation | Any user request to mutate rules, thresholds, BPM, or watchlist configuration is redirected to compliance governance
Memory | Turn + session memory retained; cross-session memory off by default

System Self-Awareness

Within the Compliance utility, Self-Awareness (see Dialog › System Self-Awareness) gives officers these specific capabilities:

Question | Tool / Source
“Which schemas does my tenant use?” | describe_schema over EKF
“Which rules apply to schema X?” | list_rules filtered by schema_uuid
“What does a score of 62 mean here?” | get_thresholds
“Which BPM runs for HIGH risk?” | get_business_process for the tenant’s HIGH band
“Which watchlists is my tenant configured to use?” | tenant watchlist configuration lookup
“What’s available globally?” | get_sanctions_watchlist (full catalogue)
“Why did this compliance check produce VERY_HIGH?” | compliance check record + rule trace
“What conflicts exist between Person X and Case Y?” | get_conflicts filtered by entity
“What’s the freshness of the screening service?” | service health introspection

Self-Awareness is read-only; rule, threshold, BPM, and watchlist changes go through compliance governance.


User Roles

Role | Responsibilities | Access Level
User (Compliance Officer) | Day-to-day client / case processing, hit triage, conflict review | Read / write on assigned cases and clients
Admin | Tenant configuration, rule management, BPM management, watchlist selection | Full tenant access
Super Admin | Cross-tenant administration, schema operations | Full system access

Authentication uses JWT bearer tokens with role claims (USER / ADMIN / SUPER_ADMIN); cross-tenant access is denied unless the role is SUPER_ADMIN.
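The tenant check implied by these role claims can be sketched as a pure function over already-verified claims. Token signature verification is assumed to have happened upstream, and the claim names `role` and `organization_uuid` are assumptions about the token layout rather than documented fields.

```python
KNOWN_ROLES = {"USER", "ADMIN", "SUPER_ADMIN"}


def can_access(claims, resource_organization_uuid):
    """Allow access within the caller's own tenant; only SUPER_ADMIN may cross tenants."""
    role = claims.get("role")
    if role not in KNOWN_ROLES:
        return False  # unknown or missing role: deny by default
    if role == "SUPER_ADMIN":
        return True
    return claims.get("organization_uuid") == resource_organization_uuid


# Illustrative claims with placeholder tenant identifiers.
officer = {"role": "USER", "organization_uuid": "org-a"}
platform_admin = {"role": "SUPER_ADMIN", "organization_uuid": "org-a"}
```

Note that a tenant ADMIN is treated the same as a USER for cross-tenant purposes in this sketch: full tenant access does not extend to other tenants.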


Operational Modes

Mode | Behavior
1 (AI-Assisted Manual) | Default during pilot; HITL on every conflict review and every screening hit
2 (Routine Automation) | Auto-clear of LOW-band compliance checks; HITL on MEDIUM / HIGH / VERY_HIGH and on every screening hit
3 (Autonomous with Escalation) | Auto-progress LOW + MEDIUM cleanly; HIGH / VERY_HIGH escalate; HITL on every screening hit (sanctions sensitivity)

Modes 0 and 4 are not used for this utility. Sanctions strong matches and HIGH / VERY_HIGH risk bands retain a HITL gate in every mode.
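One way to read the mode table is as a lookup of which risk bands may proceed without a human reviewer. The sketch below encodes that reading. The `requires_hitl` function and the item-type strings are hypothetical. It also assumes that Mode 1 gates every compliance check and that conflict review stays gated in all modes (as the HITL Gates entry under Security Controls suggests).

```python
# Bands that may proceed automatically in each mode (assumed reading of the mode table).
AUTO_BANDS = {
    1: set(),
    2: {"LOW"},
    3: {"LOW", "MEDIUM"},
}


def requires_hitl(mode, item_type, risk_level=None):
    """Return True when a human reviewer must act on the item."""
    if mode not in AUTO_BANDS:
        raise ValueError("only modes 1, 2 and 3 are used for this utility")
    if item_type in {"conflict", "screening_hit"}:
        return True  # gated in every mode
    if item_type == "compliance_check":
        return risk_level not in AUTO_BANDS[mode]
    raise ValueError(f"unknown item type: {item_type!r}")
```

For example, `requires_hitl(3, "compliance_check", "MEDIUM")` is False while `requires_hitl(3, "compliance_check", "HIGH")` and `requires_hitl(3, "screening_hit")` are both True.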


Security Controls

Control | Implementation
Tenant Isolation | Every artefact (client, case, rule, BPM, conflict, screening result, compliance check) carries organization_uuid; cross-tenant access is denied at every layer
Per-Tenant Graph | Each organisation has its own graph; conflict detection cannot traverse across tenants
Schema Bounding | All rule attributes, BPM bindings, and compliance checks reference a schema_uuid from the tenant’s schema catalogue
Watchlist Catalogue Isolation | Every screening request explicitly lists the watchlists to apply; the platform never broadcasts to the full catalogue
Rule Governance | Rule changes are authored, owner-attributed, and uniquely named per tenant
BPM Versioning | BPMN XML stored in object storage with version-controlled signed URLs; tenants cannot reference others’ BPMs
Threshold Constraints | Risk threshold rows enforce min_weight < max_weight and 0 ≤ min_weight, max_weight ≤ 100 at the database
Service-to-Service Auth | EKF (Bearer), Org Mgmt (X-System-API-Key), Screening (system credentials); credentials in encrypted vault
LLM Provider Isolation | All LLM calls flow through the LLM Router with provenance (see AI & LLM)
HITL Gates | Conflict review, screening-hit review, and HIGH / VERY_HIGH compliance checks gated by HITL
Tool Authorization | All dialog tools subject to 04-Studio Tool Authorization
Data Persistence | Sensitive client data persisted in object storage with versioned, signed URLs; relational store holds compliance metadata; graph database holds the knowledge graph

Audit Logging

Event | Retention
Client Created | 5 years
Client Data Refreshed | 5 years
Compliance Check Issued | 5 years
Rule Created / Modified | 5 years
BPM Created / Modified / Triggered | 5 years
Risk Threshold Modified | 5 years
Conflict Detected | 5 years
Conflict Reviewed | 5 years
Screening Run | 5 years
Screening Hit Reviewed | 5 years
Watchlist Catalogue Refreshed | 1 year
Authentication / Authorisation | 1 year
Configuration Change (Tenant Watchlists, Schemas) | 5 years
Pipeline Started / Completed / Failed | 1 year

Logs use immutable tamper-evident storage with PII redaction. Compliance audit logs are exportable for regulatory review.


Reference Implementation: Conflict Management Platform

Surface area. Multi-tenant application with a per-tenant graph database, async background workers, Pub/Sub for progress events, bearer-token authentication with USER / ADMIN / SUPER_ADMIN roles, and object-storage-backed BPMN definitions and client data.

Key external integrations:

Integration | Endpoint Pattern | Auth | Role
Essential Knowledge Fabric API | BASE_URL/api/schemas/... | Bearer | Schema CRUD, field-path flattening, tenant client data
Organization Management API | (organization-management service) | X-System-API-Key | Tenant + user metadata
Screening API | BASE_URL/v1/api/screening/search, WATCHLIST_BASE_URL/v1/api/list/filter_list/{page}/{limit} | System credentials | Adverse-media + watchlist catalogue and search
OSINT Search | REST | API key | OSINT search for entity enrichment
LLM (LLM Gateway) | LLM provider | Bearer / API key | Entity extraction from OSINT content

Background workers:

Task | Purpose
search_entity_data_task | Fetch client data from the Knowledge Fabric; stage to object storage; trigger compliance check
compliance_check_task | Evaluate rules → score → threshold → compliance check record → BPM trigger
company_enrichment_task | OSINT-driven company enrichment + conflict detection
case_enrichment_task | Case-party enrichment + optional screening + conflict detection
onboarding_enrichment_task | Combined onboarding enrichment + optional screening + conflict detection
organization_entity_enrichment_task | Configurable-depth org / company enrichment
weekly_conflict_report | Tenant-scoped weekly conflict aggregation and notification
daily_logs_cleanup | Retention enforcement on log records

Storage:

Store | Purpose
Relational Database | Compliance metadata: clients, cases, rules, BPM rows, compliance checks, conflicts, screening results, risk thresholds, change logs
Graph Database (per tenant) | Knowledge graph for conflict detection
Cache + Event Bus | Cache and Pub/Sub for pipeline progress, rate limiting
Object Store | BPMN XML, client data records (versioned, signed)

Regulatory Mapping

Regulator / Regulation | Supported Capability
FCA (UK) | Sanctions and PEP screening; adverse-media monitoring; ongoing CDD; conflict-of-interest detection
OFAC (US) | Sanctions screening when OFAC SDN is in the tenant’s selected watchlists
EU Sanctions / Restrictive Measures | Sanctions screening when EU lists are in the tenant’s selected watchlists
UN Sanctions | Sanctions screening when UN lists are in the tenant’s selected watchlists
MLR 2017 / 2022 (UK) | Customer due diligence, risk-based approach, ongoing monitoring
6AMLD (EU) | AML / CFT compliance; risk assessment and tiered customer due diligence
FATF Recommendations | Risk-based approach, ongoing monitoring
SRA Conflict-of-Interest Rules (UK Legal) | Conflict-of-interest detection across cases, parties, and clients
GDPR / UK GDPR | Data subject rights, data minimisation, schema-bounded extraction
EU AI Act | Transparency, human oversight on high-impact decisions, technical documentation
SOC 2 | Audit trail, change control, access control

Cross-References