Demonstration prototype · anonymous · no PHI/PII processed · not for clinical use

Architecture

Eight layers. Each portable. No vendor lock.

Demo runs on Cloudflare Workers production tenancy. Each surface has a direct peer in Azure and Azure GovCloud (FedRAMP High / IL4-IL5). The lift on Phase I award is a deploy-config change, not a re-architecture.

Request flow · top-to-bottom

1 · Edge / TLS termination Today: Cloudflare custom hostname + ACM cert · lm4vsp-demo.drjonesy.com Phase I → Azure: Front Door + Azure-managed TLS 2 · Serverless API · request router Today: Workers (V8 isolates, modules format) · single-file index.js Phase I → Azure: Functions Premium plan (GovCloud-resident) 3 · Inline UI templates · /, /check-in, /peer-ring, … Today: Worker-rendered HTML · Tailwind CDN · vanilla JS Phase I → Azure: Static Web Apps · same templates 4 · LLM inference · /api/checkin Today: Workers AI · @cf/meta/llama-3.3-70b-instruct-fp8-fast · JSON Mode strict schema Phase I → Azure: Azure OpenAI Service (GPT-4.1 / GPT-5) · or on-prem Llama 3.3 in GovCloud 5 · Crisis classifier · RACE-framework prompt Conservative bias · C-SSRS / Joiner ITS framework citations · structured JSON output Phase I: classifier output never directly drives escalation — see layer 7 6 · Peer-ring state · per-ring isolation Today: Sample data · localStorage · UI surface complete Phase I → Azure: Cosmos DB single-partition logical actor + Functions Durable Entities 7 · 5-min escalation timer · deterministic · server-side Today: Client-side countdown for UX demo Phase I → Azure: Functions Durable Timer · cancellable ONLY by authenticated peer ack · no model override 8 · Veterans Crisis Line surface · second-line fallback Today + Phase I: 988 + press 1 · text 838255 · veteranscrisisline.net/chat Audit trace: vcl_shown · reason: countdown-expired · metadata-only · no message content Audit trace · structured telemetry · best-effort · never blocks UX Today: D1 events table · Logpush optional · session_id + event_name only Phase I → Azure: Cosmos DB events container · Application Insights · Log Analytics workspace Hard architectural rule No LLM output can bypass the deterministic peer-first → 5-min window → VCL escalation. Escalation rule is data, not code-path inferred from model output. Human-in-the-loop guaranteed at the architecture level.

Surface-by-surface portability map

Each surface has a direct peer in Azure. The lift is a deploy-config change.

Surface Demo today (Cloudflare) Phase I (Azure GovCloud)
Serverless API Workers (V8 isolates, modules) Functions Premium plan (GovCloud)
LLM inference Workers AI · Llama 3.3 70B fp8-fast Azure OpenAI Service (GPT-4.1 / GPT-5)
Session state D1 (SQLite at edge) Cosmos DB (SQL API)
Object storage R2 (S3-compatible) Blob Storage (hot tier)
Identity / auth Cloudflare Access (OIDC) Entra External ID · CAC-PIV federation
Custom domain · TLS CF custom hostname + ACM cert Front Door + managed cert
Telemetry · audit Logpush · Tail Workers · D1 events Application Insights · Log Analytics · Cosmos events
Secrets Wrangler env / KV secrets Key Vault (HSM-backed in GovCloud)
Per-ring state Durable Objects (single-writer) Cosmos DB single-partition + Durable Entities
Push notifications Web Push + APNs/FCM Azure Notification Hubs (APNs/FCM/WNS)

DoD Responsible AI principles · mapped to architecture

Defense Innovation Board principles operationalized as architectural decisions, not as policy statements.

Responsible

Crisis flag triggers a deterministic, hard-coded escalation rule. No LLM output bypasses or overrides it.

Equitable

RACE prompt scaffold parameterized on era of service, MOS, separation status. Phase I evaluation set stratified on under-represented populations.

Traceable

Every classification ships with a framework citation, the indicators that triggered it, and the model's reasoning. Independent clinician auditor can reconstruct rationale from the trace alone.

Reliable

Conservative-bias prompt. JSON output strictly schema-validated. Parser failure surfaces to "elevated" by default — system never fails silent on a possible crisis.

Governable

Operator override at every UI surface. Escalation rule is data, not code-path inferred from model output. Audit log writes are best-effort and never block UX, but every state transition is captured for governance review.

Full architecture appendix at /api/architecture (machine-readable JSON · evaluator-friendly).