02

Deployment models


Three paths, same components. Pick the one that fits your operating model — we keep the runtime identical so evidence and exports look the same to a reviewer regardless of who hosts.

01 · Hosted pilot · DataSitr-hosted pilot
We provision and manage a Saudi-hosted environment with shared-state services where required. Fastest path to a working pilot.
1–2 days

02 · Customer cloud · Your Saudi-hosted cloud
Your team deploys on your Saudi-hosted infrastructure. Two paths supported: ACK guided Kubernetes, or single-VPS Docker Compose.
1–3 days

03 · On-premises · Your datacenter
Same components on your hardware. No cloud dependency. Operator owns network egress, backup target, and identity provider.
1–3 days
03

Architecture


Every request follows the same path: TLS edge, then the API runtime that detects personal data, applies your policy, and chooses the allowed route. State, audit, and signed evidence live alongside the API on Saudi infrastructure.

Figure: DataSitr request flow architecture. Vertical flow from client through the TLS edge, the API runtime with detector, policy, and router blocks, the state stores beneath, and the lane-based provider routing at the bottom.

Client app / SDK: REST · API key or OIDC session
TLS edge: nginx + Let's Encrypt · HSTS · request size cap · health probes
DataSitr API: FastAPI · 2× replicas · Python 3.12 · Uvicorn · Saudi region
  • Detector: Arabic NER · heuristics · finds PII and sensitive context
  • Policy engine: tenant policy · scope · lawful basis · purpose
  • Router: lane decision · audit log · provider selection
PostgreSQL: sessions · attempts · audit · hash-chained records
Redis: shared state · rate limits · cross-replica coordination
Object storage: encrypted backups · signed evidence outbox
Lane decision → AI provider
  • GREEN: tokenized, external AI
  • AMBER: pseudonymized, in-Kingdom AI
  • RED: raw, in-Kingdom only / blocked

Request flow · Saudi-hosted runtime · 2026 baseline

The runtime is identical across hosted-pilot, customer-cloud, and on-premises models. When high-availability mode is on, the Helm chart enforces 2× replicas with shared Postgres + Redis; a single-VPS Docker Compose path is supported for early pilots that don't need it.
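
The request path described in this section (detect personal data, apply tenant policy, choose a lane) can be sketched roughly as below. This is an illustrative outline, not DataSitr's implementation: the `Finding` and `TenantPolicy` types, the regex-based `detect` stand-in, the category labels, and the `prefer_amber` flag are all hypothetical. Only the lane names and the rule that sensitive data is blocked when no in-Kingdom path exists come from this document.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    GREEN = "tokenized, external AI"
    AMBER = "pseudonymized, in-Kingdom AI"
    RED = "raw, in-Kingdom only"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class Finding:
    category: str  # hypothetical label such as "email" or "health"
    text: str


@dataclass(frozen=True)
class TenantPolicy:
    # Hypothetical fields; a real policy would also carry scope,
    # lawful basis, and purpose.
    sensitive_categories: frozenset = frozenset(
        {"health", "national_id", "financial_account"}
    )
    prefer_amber: bool = False     # tenant lets AMBER override GREEN
    in_kingdom_path: bool = False  # operator has opted in to an in-Kingdom route


# Toy stand-in for the detector; the real one is described as
# Arabic NER plus heuristics.
_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "health": re.compile(r"\b(diagnosis|diabetes)\b", re.IGNORECASE),
}


def detect(prompt: str) -> list[Finding]:
    """Return one Finding per pattern match in the prompt."""
    return [
        Finding(category, match.group(0))
        for category, pattern in _PATTERNS.items()
        for match in pattern.finditer(prompt)
    ]


def choose_lane(findings: list[Finding], policy: TenantPolicy) -> Lane:
    """Pick a lane from the findings and the tenant policy."""
    if any(f.category in policy.sensitive_categories for f in findings):
        # Sensitive data stays raw and in-Kingdom, or the request is blocked.
        return Lane.RED if policy.in_kingdom_path else Lane.BLOCKED
    # Assumption: AMBER is only usable when an in-Kingdom path exists.
    if policy.prefer_amber and policy.in_kingdom_path:
        return Lane.AMBER
    return Lane.GREEN
```

For example, `choose_lane(detect("patient diagnosis attached"), TenantPolicy())` yields `Lane.BLOCKED` in this sketch because no in-Kingdom path is configured, which mirrors the default sovereignty posture described for the RED lane.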

04

Three-lane routing in detail


The router's lane decision determines what the downstream AI provider sees. Each lane has explicit rules, an explicit destination, and an explicit audit trail.

GREEN · Tokenized · external AI
PII is replaced with typed placeholders before the request leaves the gateway. The external model never sees real identifiers, dates, or location names, only shape-preserving tokens.
Rehydration on response · placeholder map kept in-Kingdom

AMBER · Pseudonymized · in-Kingdom AI
Linkable pseudonyms substitute identifiers but the prompt stays inside Saudi infrastructure. Use this when the workload needs internal context the green lane would strip.
Tenant policy controls when this overrides green

RED · Raw · in-Kingdom only or blocked
Sensitive categories (health, national ID combinations, financial accounts) stay raw and route only to operator-configured in-Kingdom paths. If no in-Kingdom path is configured, the request is blocked.
Default sovereignty posture · operator opts in to paths
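
As a rough illustration of the green lane's tokenize-then-rehydrate idea, the sketch below swaps email addresses for typed placeholders and restores them in the response. It is a minimal example under stated assumptions: the `<EMAIL_n>` placeholder format, the single email regex, and the function names are hypothetical and are not taken from DataSitr. A real detector would cover many more identifier types.

```python
import re

# Hypothetical single-type detector; used only to keep the example short.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def tokenize(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a typed placeholder.

    Returns the tokenized text and the placeholder map. In the green lane
    the map would stay in-Kingdom and never travel with the request.
    """
    mapping: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}     # original value -> placeholder

    def _substitute(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            token = f"<EMAIL_{len(seen) + 1}>"
            seen[value] = token
            mapping[token] = value
        # Repeated values reuse the same placeholder so references stay linked.
        return seen[value]

    return EMAIL.sub(_substitute, prompt), mapping


def rehydrate(response: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the provider's response."""
    for token, value in mapping.items():
        response = response.replace(token, value)
    return response
```

Calling `tokenize("Reply to sara@example.sa")` returns the text `"Reply to <EMAIL_1>"` plus the map needed to rehydrate the provider's answer; the external model only ever receives the first of the two.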
05

What gets deployed


The same six components run in every deployment model. Sizes scale with traffic; the topology stays the same.

datasitr-api: FastAPI · Python 3.12 · 2× replicas with shared state. Hosts the detector, policy engine, router, and audit writer.
dashboard SPA: React + Vite · pre-built static bundle served by the API container. Operator UI, regulator portal, academy.
postgres: Operational data store · sessions, attempts, audit, hash-chained compliance records. Daily backup with restore drill.
redis: Coordination layer · shared cache for rate limits, OIDC sessions, and cross-replica state. Stateless after restart.
object storage: Encrypted backups + signed evidence outbox. Saudi-hosted bucket; off-host copy for regulator-readable export packages.
nginx (TLS): Edge layer · TLS termination, HSTS, request size cap, health probes. Let's Encrypt or operator-supplied certificate.
06

Runtime options


Runtime: Alibaba ACK (Kubernetes · production)
When to choose: Production deployments needing horizontal scaling, rolling updates, and a managed control plane.
Notes: Guided deploy script · Helm chart · 2× replicas with shared Postgres + Redis · health probes and recovery gates.

Runtime: Docker + Compose (single VPS · pilot)
When to choose: Early pilots and on-premises installs that don't need horizontal scaling yet.
Notes: Single command, dashboard build, health checks. SQLite-backed sessions when Postgres is not yet provisioned.

Runtime: systemd (bare host · constrained)
When to choose: Environments where Docker is not available (regulated estates, air-gapped reviewers).
Notes: Requires Python 3.12+ and Node.js 20+ on the host. Operator owns process supervision and log rotation.

07

Authentication


Method: API key (default · machine actors)
Detail: Bearer token in the Authorization header. Keys carry the sv_ prefix and are role-scoped (tenant / tenant_admin / super_admin / regulator).

Method: OIDC SSO (optional · human actors)
Detail: Authorization Code + PKCE flow against the operator's corporate IdP. Person-bound identity for every action, which is required for individual training records and per-user audit attribution.
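
The two methods can be illustrated with a short client-side sketch. The bearer header and the `sv_` prefix come from the description above; the helper names are hypothetical, and the use of the S256 challenge method is an assumption about how the operator's IdP is configured. The PKCE arithmetic itself follows RFC 7636.

```python
import base64
import hashlib
import secrets


def api_key_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header for a machine actor."""
    if not api_key.startswith("sv_"):
        raise ValueError("expected a key with the sv_ prefix")
    return {"Authorization": f"Bearer {api_key}"}


def new_code_verifier() -> str:
    """Create a random PKCE code verifier.

    32 random bytes encode to 43 URL-safe characters, the minimum
    length RFC 7636 allows (the range is 43 to 128).
    """
    return secrets.token_urlsafe(32)


def pkce_challenge(code_verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) with padding removed."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

In an Authorization Code + PKCE flow the client sends the challenge with the authorization request and the original verifier with the token request, so an intercepted authorization code cannot be redeemed on its own.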

08

Infrastructure requirements


Minimum (pilot)
  • 2 vCPU
  • 4 GB RAM
  • 20 GB SSD
  • Ubuntu 22.04+
  • Saudi region
  • Single-VPS Docker Compose path

Recommended (production)
  • 4+ vCPU per node · 2 nodes
  • 8+ GB RAM per node
  • 50+ GB SSD per node
  • Shared Postgres + Redis
  • Off-host backup + scheduled restore drill
  • Active alert delivery to operator on-call
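
A simple preflight check against these figures might look like the sketch below. It is a hypothetical helper, not a DataSitr tool: the `HostSpec` type and function names are invented for illustration, and only the numeric thresholds come from the requirements above. The local probe assumes a Linux host.

```python
import os
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    vcpus: int
    ram_gb: float
    disk_gb: float


# Thresholds taken from the requirements listed above.
PILOT_MINIMUM = HostSpec(vcpus=2, ram_gb=4, disk_gb=20)
PRODUCTION_NODE = HostSpec(vcpus=4, ram_gb=8, disk_gb=50)


def shortfalls(host: HostSpec, required: HostSpec) -> list[str]:
    """List every resource where the host falls below the requirement."""
    problems = []
    for label, have, need in (
        ("vCPU", host.vcpus, required.vcpus),
        ("RAM (GB)", host.ram_gb, required.ram_gb),
        ("disk (GB)", host.disk_gb, required.disk_gb),
    ):
        if have < need:
            problems.append(f"{label}: have {have}, need {need}")
    return problems


def probe_local_host(path: str = "/") -> HostSpec:
    """Read CPU count, physical RAM, and total disk size (Linux only)."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).total
    return HostSpec(os.cpu_count() or 1, ram_bytes / 2**30, disk_bytes / 2**30)
```

A VM sold as 4 GB often reports slightly less usable memory than that, so treat the output of `shortfalls(probe_local_host(), PILOT_MINIMUM)` as a guide rather than a hard gate.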

09

What gets verified


The runtime emits dated, hash-chained, regulator-readable evidence. These are operator-refreshed controls, not timeless freshness guarantees.

  • Signed export packages: Time-bounded, scope-bounded JSON manifests with key fingerprint and hash-chain verification. This is what a regulator opens first.
  • Audit trail: Every lane decision, every rights-request action, and every key invocation is recorded once with a route reason, time, and actor identity.
  • Training records: The dashboard's built-in academy emits dated, exportable training records by named actor (when OIDC is enabled) or by shared key, distinguished and attributable.
  • RoPA + transfer register: Article 31 / Article 29 anchor surfaces with dated entries that link each transfer to its lawful basis and TRA reference.
  • Subject-rights queue: Verification, fulfilment, and dated closure evidence linked to a single workflow, auditable end-to-end without leaving the platform.
  • Breach register: PDPL Implementing Regulation Article 24 fields (detection time, awareness time, affected categories, supervisory-authority notification timestamp) with a 72-hour clock from awareness, per PDPL Article 20.
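
To show what hash-chain verification means in practice, here is a minimal sketch of an append-only record chain in which each entry commits to the hash of the one before it. The record layout, field names, and the all-zero genesis value are assumptions made for illustration; the document does not specify DataSitr's actual record format or signing scheme.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the first record


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding so key order cannot change the result."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_record(chain: list[dict], event: dict) -> dict:
    """Append an event, linking it to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"event": event, "prev_hash": prev_hash}
    record = {**body, "hash": _digest(body)}
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS
    for record in chain:
        body = {"event": record["event"], "prev_hash": record["prev_hash"]}
        if record["prev_hash"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True
```

Editing any earlier event changes its recomputed hash, so `verify_chain` fails from that record onward. A chain like this detects tampering but does not prevent it, which is why it is typically paired with signing and off-host copies.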
10

What this is — and isn't


The current pilot runtime uses a Saudi-hosted shared-state layout. Dated proof covers scaling beyond a single-process setup, and dated alert-delivery and backup-plus-restore evidence is operator-refreshed. Treat those as controls the operator maintains, not as timeless freshness guarantees.

Optional immutable-evidence retention can strengthen audit evidence when configured. That is a software-level control — it should not be read as hardware-backed immutability or as a high-availability claim.

DataSitr is registered with NDGP as a data services / products provider (LR-25-000018, status Complete) — not licensed. SDAIA AI Service Provider Accreditation (AE-26-000237) remains in progress at the time of writing. References to PDPL articles describe operational alignment; they do not imply regulator approval.

