Skip to content

Latest commit

 

History

History
1124 lines (867 loc) · 41.1 KB

File metadata and controls

1124 lines (867 loc) · 41.1 KB

Protocol Guide Database Schema

Last Updated: 2026-02-07
Database System: PostgreSQL 17 (Supabase)
ORM: Drizzle ORM
Source of Truth: /drizzle/schema.ts


Table of Contents

  1. Overview
  2. EMS Jurisdiction Data Model
  3. Tables
  4. Enums
  5. Relationships
  6. Indexes
  7. Row Level Security (RLS)
  8. Database Functions
  9. Triggers
  10. Vector/Embedding Storage

Overview

Protocol Guide uses a PostgreSQL database hosted on Supabase to store the national EMS protocol library — 58,000+ protocol chunks from 2,738 agencies across 53 states and territories, with vector embeddings for jurisdiction-scoped semantic search.

  • Primary Database: Supabase PostgreSQL 17
  • Schema Management: Drizzle ORM migrations
  • Vector Search: pgvector extension with 1536-dimension embeddings (Gemini Embedding 2 Preview gemini-embedding-2-preview; Voyage AI removed 2026-03-24)
  • Security: Row Level Security (RLS) policies for data isolation and PHI exposure reduction
  • Full-Text Search: PostgreSQL tsvector with GIN indexes

EMS Jurisdiction Data Model

The database models the U.S. EMS jurisdiction hierarchy. Understanding this is essential — every search query is scoped through this chain:

User selects County → county_agency_mapping → agency_id → manus_protocol_chunks

Jurisdiction Flow

┌──────────────────────┐       ┌──────────────────────────┐
│   manus_agencies     │       │  manus_protocol_chunks   │
│ (LEMSAs, state EMS   │       │ (58,000+ vector-embedded │
│  offices, regional   │       │  protocol segments)      │
│  authorities)        │       ├──────────────────────────┤
├──────────────────────┤       │ agency_id (FK) ──────────┤──► scopes search
│ id (PK)              │◄──────│ agency_name              │
│ name                 │       │ state_code               │
│ state_code           │       │ protocol_number          │
│ state_name           │       │ protocol_title           │
│ protocol_count       │       │ section                  │
│ integration_partner  │       │ content                  │
└──────────────────────┘       │ embedding (vector 1536)  │
        ▲                      │ protocol_year            │
        │                      │ protocol_effective_date   │
┌───────┴──────────────┐       └──────────────────────────┘
│ county_agency_mapping│
│ (bridges user county │
│  selection to the    │
│  correct LEMSA)      │
├──────────────────────┤
│ county_id (FK)       │
│ agency_id (FK)       │  ◄── One LEMSA may cover multiple counties
│ agency_name          │      (e.g., Central CA EMS → Fresno, Kings,
│ state_code           │       Madera, Tulare counties)
└──────────────────────┘
        │
        ▼
┌──────────────────────┐
│     counties         │
│ (2,713 U.S. counties)│
├──────────────────────┤
│ id (PK)              │
│ name                 │
│ state                │
│ uses_state_protocols │  ◄── TRUE = county uses statewide protocols
│ protocol_version     │       (not regional LEMSA protocols)
└──────────────────────┘

Why This Structure Exists

EMS protocols are not national. Each LEMSA (Local Emergency Medical Services Agency) independently writes and publishes clinical protocols for its jurisdiction. A paramedic in Los Angeles County follows different medication dosages, standing orders, and treatment algorithms than one in San Diego County.

The county_agency_mapping table exists because:

  1. Users think in terms of "I work in Fresno County" (county)
  2. Protocols are organized by LEMSA/agency (not county)
  3. One LEMSA often covers multiple counties
  4. The mapping resolves this mismatch: county_id → agency_id → scoped protocol search

Key Relationships

From To Relationship Purpose
county_agency_mapping.county_id counties.id Many-to-1 User's county selection
county_agency_mapping.agency_id manus_agencies.id Many-to-1 Maps county to protocol authority
manus_protocol_chunks.agency_id manus_agencies.id Many-to-1 Scopes protocols to their authority
manus_users.selected_agency_id manus_agencies.id Many-to-1 User's persistent jurisdiction preference

Database Architecture

┌───────────────────────────────────────────────────────────────────┐
│                      Supabase PostgreSQL                          │
├───────────────────────────────────────────────────────────────────┤
│ Jurisdiction Tables    │ User Tables        │ Vector Search       │
│ ──────────────────     │ ────────────       │ ─────────────       │
│ manus_agencies (2,738) │ manus_users        │ manus_protocol_     │
│ counties (2,713)       │ agencies           │   chunks (58,000+)  │
│ county_agency_mapping  │ agency_members     │ pgvector extension  │
│                        │ subscriptions      │ search_manus_       │
│ Analytics Tables       │ queries            │   protocols (RPC)   │
│ ──────────────────     │ bookmarks          │                     │
│ analytics_events       │ feedback           │                     │
│ search_analytics       │                    │                     │
│ content_gaps           │                    │                     │
└───────────────────────────────────────────────────────────────────┘

Tables

Core Tables

manus_users

EMS provider accounts. Each user has a selected_agency_id linking them to their LEMSA/agency jurisdiction. Maps to users export in schema.

Column Type Nullable Default Description
id serial NO auto Primary key
auth_id uuid YES - Supabase auth.users.id
manus_open_id text YES - Legacy Manus OAuth ID
name text YES - Display name
email text YES - Email address
login_method text YES - OAuth provider used
role text YES 'user' User role (user/admin)
created_at timestamptz YES now() Account creation
updated_at timestamptz YES now() Last update
last_signed_in timestamptz YES - Last login timestamp
tier text YES 'free' Subscription tier
query_count_today integer YES 0 Daily query count
last_query_date date YES - Date of last query
selected_agency_id integer YES - Currently selected agency
stripe_customer_id text YES - Stripe customer ID
subscription_id text YES - Stripe subscription ID
subscription_status text YES - active/canceled/etc
subscription_end_date timestamptz YES - Subscription expiry

agencies

EMS agencies and organizations (B2B accounts for agency-level features). Distinct from manus_agencies which is the national registry of protocol-issuing authorities used for search scoping.

Column Type Nullable Default Description
id serial NO auto Primary key
name varchar(255) NO - Agency name
slug varchar(100) NO - URL-safe identifier
state_code varchar(2) NO - Two-letter state code
state varchar(2) YES - State code (duplicate)
county varchar(100) YES - County name
agency_type agency_type YES - Type of agency
logo_url varchar(500) YES - Logo URL
contact_email varchar(320) YES - Contact email
contact_phone varchar(20) YES - Contact phone
address text YES - Physical address
supabase_agency_id integer YES - Legacy mapping ID
stripe_customer_id varchar(255) YES - Stripe customer ID
subscription_tier subscription_tier YES 'starter' Subscription level
subscription_status varchar(50) YES - Subscription status
settings json YES - Agency settings JSON
created_at timestamp NO now() Creation timestamp
updated_at timestamp NO now() Last update

Indexes:

  • idx_agencies_slug on (slug)
  • idx_agencies_state on (state)
  • idx_agencies_state_code on (state_code)

agency_members

Agency membership and roles.

Column Type Nullable Default Description
id serial NO auto Primary key
agency_id integer NO - FK to agencies.id
user_id integer NO - FK to manus_users.id
role member_role NO 'member' Member role
invited_by integer YES - FK to inviting user
invited_at timestamp YES - Invitation timestamp
accepted_at timestamp YES - Acceptance timestamp
status member_status YES 'pending' Membership status
created_at timestamp NO now() Creation timestamp

Indexes:

  • idx_agency_members_agency on (agency_id)
  • idx_agency_members_user on (user_id)

protocol_chunks

Legacy protocol storage (county-based).

Column Type Nullable Default Description
id serial NO auto Primary key
county_id integer NO - FK to counties.id
protocol_number varchar(50) NO - Protocol identifier
protocol_title varchar(255) NO - Protocol title
section varchar(255) YES - Category/section
content text NO - Protocol text content
source_pdf_url varchar(500) YES - Source PDF URL
created_at timestamp NO now() Creation timestamp
protocol_effective_date varchar(20) YES - Effective date
last_verified_at timestamp YES - Last verification
protocol_year integer YES - Protocol year

Indexes:

  • idx_protocols_county on (county_id)
  • idx_protocols_section on (section)
  • idx_protocols_number on (protocol_number)
  • idx_protocols_year on (protocol_year)

counties

Geographic county reference data.

Column Type Nullable Default Description
id serial NO auto Primary key
name varchar(255) NO - County name
state varchar(64) NO - State name
uses_state_protocols boolean NO - Uses state protocols
protocol_version varchar(50) YES - Protocol version
created_at timestamp NO now() Creation timestamp

Indexes:

  • idx_counties_state on (state)

queries

User query history (AI-powered searches).

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
county_id integer NO - FK to counties.id
query_text text NO - User's query
response_text text YES - AI response
protocol_refs json YES - Referenced protocols
created_at timestamp NO now() Query timestamp

bookmarks

User-saved protocol bookmarks.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
protocol_number varchar(50) NO - Protocol identifier
protocol_title varchar(255) NO - Protocol title
section varchar(255) YES - Section bookmarked
content text NO - Bookmarked content
agency_id integer YES - FK to agencies.id
agency_name varchar(255) YES - Agency name
created_at timestamp NO now() Bookmark timestamp

feedback

User feedback and error reports.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
category feedback_category NO - Feedback type
protocol_ref varchar(255) YES - Related protocol
county_id integer YES - FK to counties.id
subject varchar(255) NO - Feedback subject
message text NO - Feedback message
status feedback_status NO 'pending' Review status
admin_notes text YES - Admin notes
created_at timestamp NO now() Submission time
updated_at timestamp NO now() Last update

contact_submissions

Contact form submissions.

Column Type Nullable Default Description
id serial NO auto Primary key
name varchar(255) NO - Submitter name
email varchar(320) NO - Contact email
message text NO - Message content
status contact_status NO 'pending' Review status
created_at timestamp NO now() Submission time

audit_logs

Audit trail designed for PHI-minimizing, HIPAA-aligned operations.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer YES - FK to manus_users.id
action varchar(50) NO - Action performed
entity_type varchar(50) YES - Entity type affected
entity_id varchar(100) YES - Entity ID affected
metadata json YES - Additional data
ip_address varchar(45) YES - Client IP
user_agent text YES - Client user agent
created_at timestamp NO now() Action timestamp

Indexes:

  • idx_audit_logs_user on (user_id)
  • idx_audit_logs_action on (action)
  • idx_audit_logs_created on (created_at)

integration_logs

Partner integration analytics with patient-age/impression fields omitted and a HIPAA-aligned safeguard posture.

Column Type Nullable Default Description
id serial NO auto Primary key
partner integration_partner NO - Partner name
agency_id varchar(100) YES - Partner agency ID
agency_name varchar(255) YES - Partner agency name
search_term varchar(500) YES - Search query
response_time_ms integer YES - Response latency
result_count integer YES - Results returned
ip_address varchar(45) YES - Client IP
user_agent varchar(500) YES - Client user agent
created_at timestamp NO now() Log timestamp

Note: PHI fields (userAge, impression) were intentionally removed to reduce PHI exposure.

Indexes:

  • idx_integration_logs_partner on (partner)
  • idx_integration_logs_created_at on (created_at)
  • idx_integration_logs_agency_id on (agency_id)

user_auth_providers

OAuth provider connections.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
provider varchar(50) NO - OAuth provider name
provider_user_id varchar(255) NO - Provider's user ID
access_token text YES - OAuth access token
refresh_token text YES - OAuth refresh token
expires_at timestamp YES - Token expiry
created_at timestamp NO now() Link creation
updated_at timestamp NO now() Last update

Indexes:

  • idx_auth_providers_user on (user_id)
  • idx_auth_providers_provider on (provider, provider_user_id)

user_counties

User-county associations.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
county_id integer NO - FK to counties.id
is_primary boolean YES false Primary county flag
created_at timestamp NO now() Association created

Indexes:

  • idx_user_counties_user on (user_id)
  • idx_user_counties_county on (county_id)

user_states

User state subscriptions.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
state_code varchar(2) NO - Two-letter state code
access_level access_level YES 'view' Access permissions
subscribed_at timestamp YES now() Subscription start
expires_at timestamp YES - Subscription expiry

Indexes:

  • idx_user_states_user on (user_id)
  • idx_user_states_state on (state_code)

user_agencies

User-agency subscriptions.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
agency_id integer NO - FK to agencies.id
access_level access_level YES 'view' Access permissions
is_primary boolean YES false Primary agency flag
role varchar(100) YES - Role in agency
verified_at timestamp YES - Verification date
subscribed_at timestamp YES now() Subscription start
expires_at timestamp YES - Subscription expiry

Indexes:

  • idx_user_agencies_user on (user_id)
  • idx_user_agencies_agency on (agency_id)

search_history

User search history for cloud sync.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
county_id integer YES - FK to counties.id
search_query text NO - Search query text
results_count integer YES - Number of results
created_at timestamp NO now() Search timestamp

Indexes:

  • idx_search_history_user on (user_id)
  • idx_search_history_created on (created_at)

protocol_versions

Protocol version control for agencies.

Column Type Nullable Default Description
id serial NO auto Primary key
agency_id integer NO - FK to agencies.id
protocol_number varchar(50) NO - Protocol identifier
title varchar(255) NO - Protocol title
version varchar(20) NO - Version string
status protocol_status NO 'draft' Publication status
source_file_url varchar(500) YES - Source file URL
effective_date timestamp YES - Effective date
expires_date timestamp YES - Expiration date
approved_by integer YES - FK to approving user
approved_at timestamp YES - Approval timestamp
published_at timestamp YES - Publication time
published_by integer YES - FK to publishing user
chunks_generated integer YES 0 Number of chunks
metadata json YES - Additional metadata
change_log text YES - Version changes
created_at timestamp NO now() Creation time
created_by integer NO - FK to creating user
updated_at timestamp NO now() Last update

Indexes:

  • idx_protocol_versions_agency on (agency_id)
  • idx_protocol_versions_status on (status)

protocol_uploads

Protocol file upload tracking.

Column Type Nullable Default Description
id serial NO auto Primary key
agency_id integer NO - FK to agencies.id
user_id integer NO - FK to uploading user
file_name varchar(255) NO - Original filename
file_url varchar(500) NO - Storage URL
file_size integer YES - File size in bytes
mime_type varchar(100) YES - MIME type
status upload_status YES 'pending' Processing status
progress integer YES 0 Processing progress %
chunks_created integer YES 0 Chunks generated
error_message text YES - Error details
processing_started_at timestamp YES - Processing start
completed_at timestamp YES - Completion time
created_at timestamp NO now() Upload time

Indexes:

  • idx_protocol_uploads_agency on (agency_id)
  • idx_protocol_uploads_user on (user_id)

stripe_webhook_events

Stripe payment webhook processing.

Column Type Nullable Default Description
id serial NO auto Primary key
event_id varchar(255) NO - Stripe event ID
event_type varchar(100) NO - Event type
payload json YES - Event payload
processed boolean YES false Processing status
processed_at timestamp YES - Processing time
error text YES - Error message
created_at timestamp NO now() Receipt time

Indexes:

  • idx_stripe_events_id on (event_id)
  • idx_stripe_events_type on (event_type)
  • idx_stripe_events_processed on (processed)

push_tokens

Push notification tokens.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
token text NO - Push token
platform varchar(20) YES - ios/android/web
created_at timestamp NO now() Registration time
last_used_at timestamp NO now() Last used

Indexes:

  • push_tokens_user_idx on (user_id)

drip_emails_sent

Email campaign tracking.

Column Type Nullable Default Description
id serial NO auto Primary key
user_id integer NO - FK to manus_users.id
email_type varchar(50) NO - Email template type
sent_at timestamp NO now() Send timestamp

Indexes:

  • drip_emails_user_idx on (user_id)
  • drip_emails_type_idx on (email_type)

waitlist_signups

Pre-launch email capture.

Column Type Nullable Default Description
id serial NO auto Primary key
email varchar(320) NO - Email address
source varchar(100) YES 'landing_page' Signup source
created_at timestamp NO now() Signup time

Indexes:

  • waitlist_signups_email_idx on (email)
  • waitlist_signups_created_idx on (created_at)

Analytics Tables

See /drizzle/analytics-schema.ts for full definitions.

analytics_events

Generic event tracking.

Column Type Description
id serial Primary key
user_id integer Optional user reference
session_id varchar(64) Session identifier
event_type varchar(50) search/protocol/user/conversion
event_name varchar(100) Specific event name
properties json Event properties
device_type varchar(20) ios/android/web/pwa
timestamp timestamp Event time

search_analytics

Detailed search behavior tracking.

Column Type Description
id serial Primary key
user_id integer Optional user reference
session_id varchar(64) Session identifier
query_text varchar(500) Search query
state_filter varchar(2) State filter applied
results_count integer Number of results
selected_result_rank integer Which result clicked
search_method varchar(20) text/voice/example_click
timestamp timestamp Search time

protocol_access_logs

Protocol viewing analytics.

Column Type Description
id serial Primary key
user_id integer Optional user reference
protocol_chunk_id integer FK to protocol chunk
access_source varchar(50) search/history/bookmark/deep_link
time_spent_seconds integer View duration
scroll_depth real 0-1 scroll percentage
timestamp timestamp Access time

session_analytics

Session-level usage tracking.

Column Type Description
id serial Primary key
user_id integer Optional user reference
session_id varchar(64) Unique session ID
device_type varchar(20) Device type
start_time timestamp Session start
end_time timestamp Session end
search_count integer Searches in session
protocols_viewed integer Protocols viewed

daily_metrics

Pre-aggregated daily metrics for dashboards.


retention_cohorts

User retention cohort analysis.


content_gaps

Zero-result searches for content improvement.


conversion_events

Subscription conversion funnel tracking.


feature_usage_stats

Aggregated feature usage statistics.


Supabase-Only Tables

These tables exist only in Supabase and are not in the Drizzle schema.

manus_protocol_chunks

The search corpus — 58,000+ vector-embedded protocol segments from LEMSAs and EMS agencies nationwide. Each row is a semantically bounded chunk (400-1800 chars) of a clinical protocol, tagged with its jurisdiction (agency_id + state_code) and embedded via Gemini Embedding 2 Preview for cosine similarity search (Voyage AI removed 2026-03-24).

This table is the core of the RAG pipeline. Every search query generates an embedding, then searches this table filtered by agency_id (from the user's county→agency mapping) to return jurisdiction-specific protocol content.

Column Type Description
id integer Primary key
agency_id integer FK to manus_agencies — scopes chunk to its LEMSA/agency
agency_name text Denormalized agency name (e.g., "LA County EMS Agency")
state_code char(2) Denormalized state code (e.g., "CA") — enables state-level filtering
protocol_number text LEMSA-assigned protocol identifier (e.g., "R-001", "TP-4.2")
protocol_title text Clinical protocol name (e.g., "Cardiac Arrest - Adult")
section text Protocol category: Cardiac, Respiratory, Trauma, Medical, Pediatric
content text Protocol text chunk (400-1800 chars, 150 char overlap with adjacent chunks)
embedding vector(1536) Gemini Embedding 2 Preview gemini-embedding-2-preview embedding for cosine similarity search (Voyage AI removed 2026-03-24)
source_pdf_url text Original PDF source URL from LEMSA website
protocol_year text Protocol version year (e.g., "2024")
protocol_effective_date date When protocol became clinically active
has_images boolean Whether source protocol contains clinical images
image_urls jsonb Array of extracted image URLs
content_type text Chunk classification: medication, procedure, assessment, general
search_vector tsvector Full-text search vector (BM25 fallback)
embedding_version text Embedding model version for migration tracking
created_at timestamptz Ingestion timestamp
last_verified_at timestamptz Last verification against source PDF

Indexes (jurisdiction-optimized):

  • idx_manus_chunks_embedding_hnsw — HNSW vector index for cosine similarity (primary search path)
  • idx_manus_chunks_agency_id — Agency filtering (used in every jurisdiction-scoped search)
  • idx_manus_chunks_state_code — State-level filtering (used when searching all agencies in a state)
  • idx_manus_chunks_state_agency — Composite state + agency (most selective filter)
  • idx_manus_chunks_protocol_number — Direct protocol number lookup
  • idx_manus_chunks_search_vector — GIN full-text search (BM25 fallback when Gemini embeddings unavailable)

manus_agencies

National registry of protocol-issuing authorities (LEMSAs, state EMS offices, regional councils). This is the primary jurisdiction table — 2,738 agencies across 53 states/territories. Every protocol chunk references an agency_id from this table, and every search query is scoped by it.

In California, each row represents a LEMSA (Local Emergency Medical Services Agency). In other states, rows may represent state EMS offices, regional medical direction councils, or fire department-based agencies.

Column Type Description
id integer Primary key — the agency_id referenced throughout the system
name text Agency/LEMSA name (e.g., "Los Angeles County EMS Agency")
state_code char(2) Two-letter state code (e.g., "CA", "TX")
state_name text Full state name
protocol_count integer Number of protocol chunks ingested for this agency
parent_protocol_source_id integer FK for protocol inheritance (agencies sharing base protocols)
agency_type agency_type_enum Classification: fire_dept, ems_agency, hospital, state_office, regional_council
call_volume_tier call_volume_tier_enum high/mid/low — used for ingestion prioritization
is_verified boolean Whether protocol data has been verified against source
integration_partner integration_partner_enum ePCR vendor: imagetrend, esos, zoll, emscloud

Enums

-- Contact/Feedback Status
CREATE TYPE contact_status AS ENUM ('pending', 'reviewed', 'resolved');
CREATE TYPE feedback_category AS ENUM ('error', 'suggestion', 'general');
CREATE TYPE feedback_status AS ENUM ('pending', 'reviewed', 'resolved', 'dismissed');

-- Integration Partners
CREATE TYPE integration_partner AS ENUM ('imagetrend', 'esos', 'zoll', 'emscloud', 'none');

-- User Roles & Tiers
CREATE TYPE user_role AS ENUM ('user', 'admin');
CREATE TYPE user_tier AS ENUM ('free', 'pro', 'enterprise');

-- Agency Types
CREATE TYPE agency_type AS ENUM ('fire_dept', 'ems_agency', 'hospital', 'state_office', 'regional_council');
CREATE TYPE subscription_tier AS ENUM ('starter', 'professional', 'enterprise');

-- Membership
CREATE TYPE member_role AS ENUM ('owner', 'admin', 'protocol_author', 'member');
CREATE TYPE member_status AS ENUM ('pending', 'active', 'suspended');

-- Protocol Status
CREATE TYPE protocol_status AS ENUM ('draft', 'review', 'approved', 'published', 'archived');
CREATE TYPE upload_status AS ENUM ('pending', 'processing', 'chunking', 'embedding', 'completed', 'failed');

-- Access Levels
CREATE TYPE access_level AS ENUM ('view', 'contribute', 'admin');

Relationships

Entity Relationship Diagram

--- JURISDICTION CHAIN (critical path for search) ---
counties (1) ───< (many) county_agency_mapping ────> (1) manus_agencies
manus_agencies (1) ───< (many) manus_protocol_chunks
manus_agencies (1) ───< (many) manus_agencies (self-ref: parent_protocol_source_id)

--- USER RELATIONSHIPS ---
manus_users (1) ────> (1) manus_agencies (selected_agency_id — jurisdiction preference)
manus_users (1) ───< (many) queries
manus_users (1) ───< (many) bookmarks
manus_users (1) ───< (many) feedback
manus_users (1) ───< (many) search_history
manus_users (1) ───< (many) user_counties ────> (1) counties
manus_users (1) ───< (many) user_states
manus_users (1) ───< (many) user_agencies ────> (1) agencies
manus_users (1) ───< (many) user_auth_providers
manus_users (1) ───< (many) agency_members ────> (1) agencies
manus_users (1) ───< (many) push_tokens
manus_users (1) ───< (many) analytics_events

--- B2B AGENCY MANAGEMENT ---
agencies (1) ───< (many) agency_members
agencies (1) ───< (many) protocol_versions
agencies (1) ───< (many) protocol_uploads
agencies (1) ───< (many) user_agencies

--- COUNTY RELATIONSHIPS ---
counties (1) ───< (many) county_agency_mapping
counties (1) ───< (many) protocol_chunks
counties (1) ───< (many) queries
counties (1) ───< (many) user_counties
counties (1) ───< (many) search_history

Indexes

Performance-Critical Indexes

Table Index Type Purpose
manus_protocol_chunks idx_manus_chunks_embedding_hnsw HNSW Vector similarity search
manus_protocol_chunks idx_manus_chunks_search_vector GIN Full-text search
manus_protocol_chunks idx_manus_chunks_state_agency B-tree State + agency filtering
manus_users idx_users_auth_id B-tree Auth lookup
agencies idx_agencies_slug B-tree URL routing
audit_logs idx_audit_logs_created B-tree Time-based queries

Index Count by Table

Table Index Count
manus_protocol_chunks 12+
manus_users 5+
agencies 3
agency_members 2
audit_logs 3
search_history 2

Row Level Security (RLS)

RLS Helper Functions

-- Get current user's internal ID from auth.uid()
CREATE FUNCTION get_current_user_id() RETURNS INTEGER;

-- Check if current user is admin
CREATE FUNCTION is_admin() RETURNS BOOLEAN;

-- Check if user is member of agency
CREATE FUNCTION is_agency_member(agency_id_param INTEGER) RETURNS BOOLEAN;

-- Check if user is agency admin/owner
CREATE FUNCTION is_agency_admin(agency_id_param INTEGER) RETURNS BOOLEAN;

RLS Policies Summary

Table Public Read User Self-Access Admin Access Service Role
manus_users No Yes Yes Full
agencies Yes - Yes Full
agency_members No Yes (own) Yes Full
queries No Yes Yes Full
bookmarks No Yes - Full
search_history No Yes - Full
feedback No Yes Yes Full
audit_logs No No Read-only Full
counties Yes - Yes Full
protocol_chunks Yes - - Full
contact_submissions No No Yes Full
integration_logs No No Read-only Full
stripe_webhook_events No No No Full
push_tokens No Yes - Full
drip_emails_sent No Read-only - Full

Key Security Principles

  1. User Isolation: Users can only access their own data
  2. Agency Scoping: Agency members access agency data per role
  3. Admin Elevation: Admins have elevated access where needed
  4. Service Role: Backend has full access for operations
  5. Public Safety: Medical protocols remain publicly accessible
  6. PHI Safeguards: No unauthorized PHI access

Database Functions

Search Functions

search_manus_protocols

Semantic protocol search with vector embeddings.

CREATE FUNCTION search_manus_protocols(
  query_embedding vector(1536),
  agency_filter integer DEFAULT NULL,
  state_filter text DEFAULT NULL,
  match_count integer DEFAULT 10,
  match_threshold float DEFAULT 0.3,
  agency_name_filter text DEFAULT NULL,
  state_code_filter char(2) DEFAULT NULL
) RETURNS TABLE (
  id integer,
  agency_id integer,
  protocol_number text,
  protocol_title text,
  section text,
  content text,
  image_urls text[],
  similarity float
);

search_manus_protocols_fts

Hybrid search combining vector similarity with full-text keyword relevance.

CREATE FUNCTION search_manus_protocols_fts(
    query_text TEXT,
    query_embedding vector(1536),
    agency_filter INTEGER DEFAULT NULL,
    state_code_filter TEXT DEFAULT NULL,
    match_count INTEGER DEFAULT 10,
    match_threshold FLOAT DEFAULT 0.3
) RETURNS TABLE (
    id INTEGER,
    agency_id INTEGER,
    protocol_number TEXT,
    protocol_title TEXT,
    section TEXT,
    content TEXT,
    image_urls TEXT[],
    similarity FLOAT,
    fts_rank FLOAT
);

Helper Functions

get_protocol_inheritance_chain

Returns protocol inheritance hierarchy: Agency → Regional → State.

CREATE FUNCTION get_protocol_inheritance_chain(agency_id_param INTEGER)
RETURNS TABLE (
    level INTEGER,
    id INTEGER,
    name TEXT,
    agency_type agency_type_enum,
    state_code CHAR(2)
);

update_updated_at_column

Auto-updates updated_at column on row changes.

CREATE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ language 'plpgsql';

Triggers

updated_at Auto-Update Triggers

Applied to tables with updated_at columns:

Table Trigger Name
manus_users update_users_updated_at
feedback update_feedback_updated_at
agencies update_agencies_updated_at
user_auth_providers update_user_auth_providers_updated_at
protocol_versions update_protocol_versions_updated_at

Full-Text Search Trigger

Auto-updates search_vector on manus_protocol_chunks:

CREATE TRIGGER manus_chunks_search_update
    BEFORE INSERT OR UPDATE OF protocol_title, section, content
    ON manus_protocol_chunks
    FOR EACH ROW
    EXECUTE FUNCTION manus_chunks_search_vector_trigger();

Protocol Count Sync Trigger

Keeps manus_agencies.protocol_count in sync:

CREATE TRIGGER trg_update_agency_protocol_count
    AFTER INSERT OR UPDATE OR DELETE ON manus_protocol_chunks
    FOR EACH ROW
    EXECUTE FUNCTION update_agency_protocol_count();

Vector/Embedding Storage

Embedding Model

  • Model: Gemini Embedding 2 Preview gemini-embedding-2-preview (Voyage AI removed 2026-03-24)
  • Dimensions: 1536
  • Similarity Metric: Cosine distance

Vector Index Configuration

-- HNSW index for fast approximate nearest neighbor search
CREATE INDEX idx_manus_chunks_embedding_hnsw 
ON manus_protocol_chunks 
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Index Parameters:

  • m = 16: Number of bi-directional links (default, balanced)
  • ef_construction = 64: Higher = better index quality, slower build
  • vector_cosine_ops: Optimized for cosine similarity

Vector Search Performance

  • Dataset Size: ~58K+ protocol chunks
  • Search Latency: <100ms for top-10 results
  • Index Size: ~200MB
  • Accuracy: 95%+ recall with HNSW

Hybrid Search Approach

The system uses a hybrid search strategy:

  1. Vector Similarity (70% weight): Semantic understanding of query intent
  2. Full-Text Search (30% weight): Keyword matching for precise terms
-- Combined score formula
ORDER BY (vector_similarity * 0.7) + (fts_rank * 0.3) DESC

Full-Text Search Configuration

-- Weighted tsvector: title (A) > section (B) > content (C)
search_vector = 
    setweight(to_tsvector('english', protocol_title), 'A') ||
    setweight(to_tsvector('english', section), 'B') ||
    setweight(to_tsvector('english', content), 'C')

Migration Files

Key migration files in /drizzle/migrations/:

Migration Description
0025_add_notification_tables.sql Push tokens, drip emails
0026_postgresql_updated_at_triggers.sql Auto-update triggers
0027_add_row_level_security_policies.sql Comprehensive RLS
0030_optimize_manus_protocol_chunks.sql Vector & FTS indexes
0032_add_waitlist_signups.sql Waitlist table

Related Documentation