Commit 16cf0c8

Sbussiso and claude committed
sentinel: slice 2 — agent-ready dispatch (notification hook, manual run, agent endpoints)
Wires up everything that can be wired without the actual Sentinel agent service existing. The agent (slice 3) will plug into these endpoints unchanged when it ships.

Backend:

1. SentinelRun gains state machine columns (added via sync_schema):
   - outcome string widens to: pending | running | incident | no_action | error
   - started_at, completed_at: nullable timestamps for the agent's state transitions
   - manual_prompt: nullable text for trigger_type=manual runs (the operator's prompt from the Run-Now modal)
   - is_terminal property: convenience for "has this run completed?"

2. New module app/core/sentinel_dispatch.py with the single dispatch gate that every code path funnels through:
   - _is_camera_in_scope: cameras absent from the scope dict default to in-scope (matches frontend isCameraInScope behaviour)
   - _schedule_allows_now: parses schedule_mode + schedule_start/end + active_days against the org's IANA timezone (Setting key "timezone", defaults to UTC); handles wrap-around windows like 22:00 → 06:00
   - cap enforcement: 300 runs / month, hard-coded for now (slice 5 surfaces it as a per-plan setting)
   - maybe_dispatch_for_notification: best-effort, never raises — a failed dispatch must NEVER block the underlying notification from being delivered
   - dispatch_manual_run: skips schedule + scope checks (the operator overrode them by clicking) but still cap-enforces

3. notifications.create_notification() now calls maybe_dispatch_for_notification() after the email side-channel. When motion / incident_created fires AND Sentinel is enabled with that trigger on AND the camera is in scope AND the schedule allows it AND we're under the cap, a pending sentinel_runs row gets inserted.

4. Four new endpoints in app/api/sentinel.py:
   - POST /api/sentinel/runs/manual: operator-initiated run from the "Run now" button. Pro Plus only; cap-enforced; audited.
   - POST /api/sentinel/runs/{id}/start: agent claims a pending run (transitions outcome=running). Idempotent. Auth: agent key.
   - POST /api/sentinel/runs/{id}/complete: agent reports terminal outcome + tool trace. Idempotent (re-call with same body is a no-op once terminal). Auth: agent key.
   - GET /api/sentinel/runs/pending: cross-org polling endpoint for the agent to discover work, FIFO, surfaces org_id so the agent knows which MCP key to use. Auth: agent key.

   /runs/pending is registered BEFORE /runs/{run_id} because FastAPI matches routes in registration order; otherwise the literal path would 404 with run_id="pending".

5. New auth scheme: SENTINEL_AGENT_KEY env var. Service-to-service secret in the X-Sentinel-Agent-Key header. Hard-rejects every request when not configured, which is the right behaviour in environments where the agent isn't deployed yet (current state).

6. /runs response stats expanded: pending count, runs_this_month, monthly_cap, remaining_this_month.

Frontend:

1. New api.js helper: dispatchSentinelManualRun(getToken, {prompt, cameraId}). POSTs to /runs/manual.

2. SentinelPage.jsx:
   - Run-Now button enabled when interactive (not loading, not plan-gated). Click opens the manual-run modal (re-added). Submit POSTs to /manual, prepends the new pending row to the timeline, optimistically bumps the stats counters, toasts.
   - 429 cap-reached response surfaced as an explicit toast.
   - Outcome chip helpers + outcomeDotClass extended to handle pending and running states.
   - Timeline + history-table runs render placeholder italic text ("Agent is investigating…", "Waiting for the agent to pick this up") for pending/running rows that don't have a real summary yet.
   - Drawer surfaces the operator's manual_prompt when present, plus dedicated outcome blurbs for pending/running.

3. CSS:
   - .sentinel-outcome-chip-pending / -running (blue family)
   - .sentinel-timeline-dot-pending / -running with the existing pulse keyframe in blue — visually communicates "the agent should pick this up imminently"
   - .sentinel-timeline-summary-muted: italic muted style for the placeholder lines

Verified locally:
- Full backend test suite (549 tests) green
- All 8 sentinel routes registered, /runs/pending matches before /runs/{run_id}
- sync_schema added the 3 new columns to existing sentinel_runs table on first boot
- Ruff lint clean (after auto-fix of import sort + the manual ones from the prior commit)

What's still deferred to slice 3:
- Actually building the agent service (separate Fly app)
- Choosing polling vs. webhook delivery (both contracts are now in place server-side; agent picks one)
- LLM inference + MCP tool calls
- Real reasoning summary + tool_trace populated by the agent

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
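
The commit message says _schedule_allows_now handles wrap-around windows such as 22:00 to 06:00, but app/core/sentinel_dispatch.py is not shown on this page. The following is only a minimal sketch of how such a check could work, assuming "HH:MM" strings. The names parse_hhmm and window_allows are hypothetical, and the active_days and schedule_mode handling is left out.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def parse_hhmm(value: str) -> time:
    """Parse an "HH:MM" string into a time object."""
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))


def window_allows(now_utc: datetime, start: str, end: str, tz: tzinfo = timezone.utc) -> bool:
    """Return True when now_utc, viewed in tz, falls inside [start, end).

    Hypothetical helper: the real module may differ in names and edge cases.
    """
    local = now_utc.astimezone(tz).time()
    start_t, end_t = parse_hhmm(start), parse_hhmm(end)
    if start_t <= end_t:
        # Same-day window, e.g. 09:00 to 17:00.
        return start_t <= local < end_t
    # Wrap-around window, e.g. 22:00 to 06:00: late evening OR early morning.
    return local >= start_t or local < end_t


if __name__ == "__main__":
    night = datetime(2024, 1, 1, 23, 30, tzinfo=timezone.utc)
    print(window_allows(night, "22:00", "06:00"))
    # The same instant seen from a fixed UTC-5 offset is 18:30 local time.
    print(window_allows(night, "22:00", "06:00", timezone(timedelta(hours=-5))))
```

For an IANA zone, a `zoneinfo.ZoneInfo("America/Chicago")` instance can be passed as `tz`; the fixed offsets above only keep the sketch free of a tz database dependency.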
1 parent 5b6e855 commit 16cf0c8

8 files changed

Lines changed: 789 additions & 38 deletions

backend/app/api/notifications.py

Lines changed: 17 additions & 0 deletions
@@ -594,6 +594,23 @@ def create_notification(
                 kind,
             )

+        # Sentinel agent dispatch — see app/core/sentinel_dispatch.py.
+        # The dispatcher gates on Sentinel config (enabled, trigger
+        # toggles, camera scope, schedule window, monthly cap) and
+        # creates a pending sentinel_runs row when the gate clears.
+        # Best-effort and self-contained: a dispatch failure never
+        # blocks the underlying notification from being delivered.
+        try:
+            from app.core.sentinel_dispatch import maybe_dispatch_for_notification
+            maybe_dispatch_for_notification(
+                session, org_id=org_id, kind=kind, camera_id=camera_id,
+            )
+        except Exception:
+            logger.exception(
+                "[Notifications] sentinel dispatch failed for kind=%s",
+                kind,
+            )
+
         return notif
     except Exception:
         logger.exception("[Notifications] Failed to create notification")
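
The comment in this hunk lists the conditions the dispatcher gates on: enabled, trigger toggles, camera scope, schedule window, monthly cap. The dispatcher module itself is not part of this view, so the sketch below is only one way the gate could be composed. GateConfig, gate_allows, and the treatment of a missing camera_id are assumptions; the 300-run cap and the "absent cameras default to in-scope" rule come from the commit message.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MONTHLY_RUN_CAP = 300  # hard-coded value stated in the commit message


@dataclass
class GateConfig:
    """Hypothetical stand-in for the Sentinel config row."""
    enabled: bool = False
    triggers: Dict[str, bool] = field(default_factory=dict)
    camera_scope: Dict[str, bool] = field(default_factory=dict)


def is_camera_in_scope(cfg: GateConfig, camera_id: Optional[str]) -> bool:
    """Cameras absent from the scope dict default to in-scope."""
    if camera_id is None:
        # Assumption: notifications without a camera are not scope-filtered.
        return True
    return cfg.camera_scope.get(camera_id, True)


def gate_allows(
    cfg: GateConfig,
    kind: str,
    camera_id: Optional[str],
    schedule_ok: bool,
    runs_this_month: int,
) -> bool:
    """Return True only when every gate clears, checked cheapest first."""
    if not cfg.enabled:
        return False
    if not cfg.triggers.get(kind, False):
        return False
    if not is_camera_in_scope(cfg, camera_id):
        return False
    if not schedule_ok:
        return False
    return runs_this_month < MONTHLY_RUN_CAP


if __name__ == "__main__":
    cfg = GateConfig(enabled=True, triggers={"motion": True}, camera_scope={"cam-2": False})
    print(gate_allows(cfg, "motion", "cam-1", schedule_ok=True, runs_this_month=10))
    print(gate_allows(cfg, "motion", "cam-2", schedule_ok=True, runs_this_month=10))
```

Keeping the gate a pure function of its inputs makes each condition testable without a database session; the real dispatcher would still need to load the config and count the month's runs before calling something like it.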

backend/app/api/sentinel.py

Lines changed: 223 additions & 3 deletions
@@ -31,14 +31,20 @@
 from datetime import UTC, datetime
 from typing import Optional

-from fastapi import APIRouter, Depends, HTTPException, Query, Request
+from fastapi import APIRouter, Depends, Header, HTTPException, Query, Request
 from pydantic import BaseModel, Field
 from sqlalchemy.orm import Session

 from app.core.audit import write_audit
 from app.core.auth import AuthUser, require_admin, require_view
+from app.core.config import settings
 from app.core.database import get_db
 from app.core.plans import effective_plan_for_caps, get_plan_display_name
+from app.core.sentinel_dispatch import (
+    MONTHLY_RUN_CAP,
+    dispatch_manual_run,
+    runs_used_this_month,
+)
 from app.models.models import SentinelConfig, SentinelRun

 logger = logging.getLogger(__name__)
@@ -102,6 +108,33 @@ def _validate_hhmm(value: str, field_name: str) -> None:
         raise HTTPException(400, f"{field_name} out of range")


+# ── Service-to-service auth (Sentinel agent → Command Center) ───────
+# Agent posts run completions back via this header. Defined BEFORE
+# any route uses it via Depends() so module-load order works out.
+async def require_sentinel_agent(
+    x_sentinel_agent_key: Optional[str] = Header(None, alias="X-Sentinel-Agent-Key"),
+) -> None:
+    """Verify the inbound request carries the shared SENTINEL_AGENT_KEY
+    secret. Used only for service-to-service callbacks from the
+    Sentinel agent into Command Center (run-completion + pending-run
+    polling).
+
+    Org scope is established by the run row's org_id, not by this
+    auth — the agent is org-agnostic at the auth layer. Each run
+    record is org-scoped server-side, so a leaked agent key can only
+    update runs that already exist (it can't fabricate a run for a
+    different org).
+
+    Hard-rejects every request when the key isn't configured (empty
+    string), which is the desired behaviour in environments where
+    the agent isn't deployed.
+    """
+    if not settings.SENTINEL_AGENT_KEY:
+        raise HTTPException(401, "agent auth not configured")
+    if not x_sentinel_agent_key or x_sentinel_agent_key != settings.SENTINEL_AGENT_KEY:
+        raise HTTPException(401, "invalid agent key")
+
+
 # ── GET /api/sentinel/config ────────────────────────────────────────
 @router.get("/config")
 async def get_config(
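
The dependency added in this hunk compares the presented header to the configured secret with `!=`. One hardening that a follow-up could consider, and which is not part of this commit, is a constant-time comparison through the standard library's `hmac.compare_digest`. The sketch below isolates only the comparison; agent_key_is_valid is a hypothetical helper name, and the 401 responses would stay in the FastAPI dependency.

```python
import hmac
from typing import Optional


def agent_key_is_valid(presented: Optional[str], configured: str) -> bool:
    """Return True when the presented key matches the configured secret.

    An empty configured secret rejects everything, mirroring the
    hard-reject behaviour described in the docstring above.
    """
    if not configured or not presented:
        return False
    # Encode to bytes so non-ASCII input cannot raise inside compare_digest.
    return hmac.compare_digest(presented.encode("utf-8"), configured.encode("utf-8"))


if __name__ == "__main__":
    print(agent_key_is_valid("s3cret", "s3cret"))
    print(agent_key_is_valid("guess", "s3cret"))
    print(agent_key_is_valid("anything", ""))
```

`compare_digest` takes time that does not depend on where the two values first differ, which removes a timing side channel that plain string comparison can expose.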
@@ -235,8 +268,7 @@ async def list_runs(
         .all()
     )

-    # Small inline stats — avoids a separate /usage endpoint while
-    # the cap framework isn't built yet (slice 5).
+    # Small inline stats.
     today_start = datetime.now(tz=UTC).replace(
         hour=0, minute=0, second=0, microsecond=0, tzinfo=None
     )
@@ -245,18 +277,63 @@ async def list_runs(
     )

     incident_count = base.filter(SentinelRun.outcome == "incident").count()
+    pending_count = base.filter(SentinelRun.outcome.in_(("pending", "running"))).count()
+    runs_month = runs_used_this_month(db, user.org_id)

     return {
         "runs": [r.to_dict(include_trace=False) for r in rows],
         "total": total,
         "stats": {
             "runs_today": runs_today,
             "runs_total": base.count(),
+            "runs_this_month": runs_month,
             "incidents_filed": incident_count,
+            "pending": pending_count,
+            "monthly_cap": MONTHLY_RUN_CAP,
+            "remaining_this_month": max(0, MONTHLY_RUN_CAP - runs_month),
         },
     }


+# ── GET /api/sentinel/runs/pending (agent → CC) ─────────────────────
+# REGISTERED BEFORE /runs/{run_id} so the literal "pending" path
+# wins over the parameterised one (FastAPI matches in registration
+# order; otherwise GET /runs/pending would 404 with run_id=pending).
+@router.get("/runs/pending", dependencies=[Depends(require_sentinel_agent)])
+async def list_pending_runs(
+    limit: int = Query(20, ge=1, le=100),
+    db: Session = Depends(get_db),
+):
+    """Polling endpoint for the Sentinel agent to discover work.
+
+    Returns up to `limit` pending runs across all orgs, oldest-first
+    (FIFO). The agent is responsible for calling /start on each one
+    it picks up so others don't race for the same row.
+
+    Slice 3 may swap this for a webhook delivery model — both flows
+    are agent-side concerns; the run record contract stays the same.
+    """
+    rows = (
+        db.query(SentinelRun)
+        .filter(SentinelRun.outcome == "pending")
+        .order_by(SentinelRun.triggered_at.asc())
+        .limit(limit)
+        .all()
+    )
+    return {
+        "runs": [
+            {
+                **r.to_dict(include_trace=False),
+                # Agent needs the org_id to know which MCP key to use
+                # — surfaced explicitly because to_dict() doesn't
+                # include it (UI doesn't need it).
+                "org_id": r.org_id,
+            }
+            for r in rows
+        ],
+    }
+
+
 # ── GET /api/sentinel/runs/{run_id} ─────────────────────────────────
 @router.get("/runs/{run_id}")
 async def get_run(
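
The comment above /runs/pending relies on routes being tried in registration order. The sketch below is a deliberately simplified model of first-match routing, written in plain Python so that it runs without FastAPI installed; it is not Starlette's actual router. compile_path and first_match are hypothetical names.

```python
import re
from typing import List, Optional, Tuple


def compile_path(template: str) -> "re.Pattern[str]":
    """Turn "/runs/{run_id}" into a regex with one named group per parameter."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


def first_match(routes: List[Tuple[str, str]], path: str) -> Optional[str]:
    """Return the handler name of the first route whose template matches."""
    for template, handler in routes:
        if compile_path(template).match(path):
            return handler
    return None


if __name__ == "__main__":
    literal_first = [("/runs/pending", "list_pending_runs"), ("/runs/{run_id}", "get_run")]
    param_first = list(reversed(literal_first))
    # With the literal route first, the polling endpoint wins.
    print(first_match(literal_first, "/runs/pending"))
    # With the parameterised route first, "pending" is captured as run_id.
    print(first_match(param_first, "/runs/pending"))
```

In the second ordering the request would reach get_run with run_id="pending", find no such row, and return "run not found", which is the failure the registration order in this diff avoids.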
@@ -273,3 +350,146 @@ async def get_run(
     if row is None:
         raise HTTPException(404, "run not found")
     return row.to_dict(include_trace=True)
+
+
+# ── POST /api/sentinel/runs/manual ──────────────────────────────────
+class ManualRunBody(BaseModel):
+    prompt: str = Field("", max_length=2000)
+    camera_id: Optional[str] = None
+
+
+@router.post("/runs/manual")
+async def post_manual_run(
+    body: ManualRunBody,
+    request: Request,
+    user: AuthUser = Depends(require_admin),
+    db: Session = Depends(get_db),
+):
+    """Operator-initiated agent run. Creates a pending sentinel_runs
+    row that the agent picks up when it ships (slice 3); meanwhile
+    rows queue and the UI shows pending state.
+
+    Pro Plus only. Cap-enforced. Schedule + scope are deliberately
+    NOT enforced — the operator clicked "Run now" to override them.
+    """
+    if effective_plan_for_caps(db, user.org_id) != "pro_plus":
+        raise HTTPException(
+            status_code=402,
+            detail={"error": "plan_required", "plan": "pro_plus"},
+        )
+
+    try:
+        run = dispatch_manual_run(
+            db,
+            org_id=user.org_id,
+            prompt=body.prompt,
+            camera_id=body.camera_id,
+        )
+    except ValueError as exc:
+        if str(exc) == "monthly_cap_reached":
+            raise HTTPException(
+                status_code=429,
+                detail={
+                    "error": "monthly_cap_reached",
+                    "cap": MONTHLY_RUN_CAP,
+                    "used": runs_used_this_month(db, user.org_id),
+                },
+            ) from exc
+        raise
+
+    write_audit(
+        db,
+        org_id=user.org_id,
+        event="sentinel_manual_run",
+        user_id=user.user_id,
+        username=user.email or user.username,
+        details=f"run_id={run.id} camera={body.camera_id or '-'} prompt_len={len(body.prompt or '')}",
+        request=request,
+    )
+    return run.to_dict(include_trace=False)
+
+
+# ── POST /api/sentinel/runs/{id}/complete (agent → CC) ──────────────
+class RunCompleteBody(BaseModel):
+    outcome: str  # incident | no_action | error
+    severity: Optional[str] = None  # low | medium | high (only when outcome=incident)
+    incident_id: Optional[int] = None
+    summary: str = Field("", max_length=8000)
+    tool_call_count: int = 0
+    tool_trace: Optional[list[dict]] = None
+
+
+_VALID_TERMINAL_OUTCOMES = {"incident", "no_action", "error"}
+
+
+@router.post("/runs/{run_id}/complete", dependencies=[Depends(require_sentinel_agent)])
+async def post_run_complete(
+    run_id: str,
+    body: RunCompleteBody,
+    db: Session = Depends(get_db),
+):
+    """Agent → Command Center callback to mark a pending/running run
+    as completed. Idempotent: safe to call twice with the same body
+    (second call is a no-op if the run is already terminal).
+    """
+    if body.outcome not in _VALID_TERMINAL_OUTCOMES:
+        raise HTTPException(400, f"invalid outcome: {body.outcome!r}")
+    if body.outcome == "incident" and body.severity not in ("low", "medium", "high"):
+        raise HTTPException(400, "severity required for outcome=incident")
+
+    row = db.query(SentinelRun).filter_by(id=run_id).first()
+    if row is None:
+        raise HTTPException(404, "run not found")
+
+    if row.is_terminal:
+        # Idempotent no-op — agent retried.
+        return row.to_dict(include_trace=True)
+
+    now = datetime.now(tz=UTC).replace(tzinfo=None)
+    row.outcome = body.outcome
+    row.severity = body.severity if body.outcome == "incident" else None
+    row.incident_id = body.incident_id if body.outcome == "incident" else None
+    row.summary = (body.summary or "")[:8000]
+    row.tool_call_count = max(0, int(body.tool_call_count or 0))
+    if body.tool_trace is not None:
+        row.set_tool_trace(body.tool_trace)
+    if row.started_at is None:
+        # Agent went straight to terminal without an explicit start
+        # signal — best-effort backfill.
+        row.started_at = now
+    row.completed_at = now
+
+    db.commit()
+    db.refresh(row)
+    logger.info(
+        "sentinel: run completed id=%s org=%s outcome=%s severity=%s",
+        row.id, row.org_id, row.outcome, row.severity,
+    )
+    return row.to_dict(include_trace=True)
+
+
+# ── POST /api/sentinel/runs/{id}/start (agent → CC) ─────────────────
+@router.post("/runs/{run_id}/start", dependencies=[Depends(require_sentinel_agent)])
+async def post_run_start(
+    run_id: str,
+    db: Session = Depends(get_db),
+):
+    """Agent claims a pending run and transitions it to running.
+    Optional — the agent may skip this and jump straight to /complete
+    if it doesn't need a separate "I'm working on it" signal.
+    """
+    row = db.query(SentinelRun).filter_by(id=run_id).first()
+    if row is None:
+        raise HTTPException(404, "run not found")
+    if row.outcome != "pending":
+        # Already past pending — accept idempotently.
+        return row.to_dict(include_trace=False)
+    row.outcome = "running"
+    row.started_at = datetime.now(tz=UTC).replace(tzinfo=None)
+    db.commit()
+    db.refresh(row)
+    return row.to_dict(include_trace=False)
+
+
+# /runs/pending lives above (registered BEFORE /runs/{run_id} due to
+# FastAPI's in-order route matching).
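
The /start handler reads the row, checks that it is pending, then writes it, and it returns the same success response to a caller that found the run already claimed. Whether more than one agent worker will ever poll is not stated in this commit. If that happens, a compare-and-set claim is one option. The sketch below shows the idea with sqlite3 and a cut-down, hypothetical sentinel_runs table; claim_run and make_demo_db are illustrative names, not code from this repository.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with one pending run (demo schema only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sentinel_runs (id TEXT PRIMARY KEY, outcome TEXT, started_at TEXT)"
    )
    conn.execute("INSERT INTO sentinel_runs VALUES ('run-1', 'pending', NULL)")
    conn.commit()
    return conn


def claim_run(conn: sqlite3.Connection, run_id: str, now_iso: str) -> bool:
    """Atomically move a run from pending to running.

    The WHERE clause makes the check and the write a single statement,
    so only one caller can see rowcount == 1 for a given run.
    """
    cur = conn.execute(
        "UPDATE sentinel_runs SET outcome = 'running', started_at = ? "
        "WHERE id = ? AND outcome = 'pending'",
        (now_iso, run_id),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = make_demo_db()
    print(claim_run(conn, "run-1", "2024-01-01T00:00:00"))  # first caller wins
    print(claim_run(conn, "run-1", "2024-01-01T00:00:01"))  # second caller loses
```

Returning whether the claim succeeded would let a second worker skip a run it did not win, at the cost of making the endpoint's response distinguish "claimed now" from "already claimed".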

backend/app/core/config.py

Lines changed: 12 additions & 0 deletions
@@ -138,5 +138,17 @@ def is_email_configured(cls) -> bool:
         """
         return bool(cls.RESEND_API_KEY and cls.EMAIL_FROM_ADDRESS)

+    # ── Sentinel agent service-to-service auth ────────────────────────
+    # The Sentinel agent (separate Fly app, ships in slice 3) calls
+    # back to /api/sentinel/runs/{id}/complete to update pending runs
+    # with their final outcome + tool trace. Authenticated via this
+    # shared secret in the X-Sentinel-Agent-Key header.
+    #
+    # Set via Fly secret in production; left blank in local dev so the
+    # endpoint hard-rejects everything until a key is configured.
+    # Org scope comes from the run row itself (the URL path), not the
+    # auth — the agent is org-agnostic at the auth layer.
+    SENTINEL_AGENT_KEY: str = os.getenv("SENTINEL_AGENT_KEY", "")
+

 settings = Config()
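
The agent service does not exist yet, so how it will call back is still open. As a rough illustration of the contract described in the comment above, the sketch below builds (but does not send) a completion request carrying the X-Sentinel-Agent-Key header, using only the standard library. build_complete_request and the example base URL are hypothetical; the path and the outcome and summary fields follow RunCompleteBody from this commit.

```python
import json
import urllib.request


def build_complete_request(
    base_url: str,
    run_id: str,
    agent_key: str,
    outcome: str,
    summary: str = "",
) -> urllib.request.Request:
    """Build a POST to /api/sentinel/runs/{id}/complete without sending it."""
    body = json.dumps({"outcome": outcome, "summary": summary}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/sentinel/runs/{run_id}/complete",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Shared secret read by require_sentinel_agent on the server.
            "X-Sentinel-Agent-Key": agent_key,
        },
    )


if __name__ == "__main__":
    # "https://cc.example.invalid" is a placeholder host, not a real deployment.
    req = build_complete_request(
        "https://cc.example.invalid", "run-1", "s3cret", "no_action", "Nothing unusual."
    )
    print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a real agent would presumably also read the key from its own environment and retry on failure, which the idempotent /complete handler tolerates.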
