feat(db): discovery_snapshots table — daily Discovery-Tracking storage (P2 sub-step)#55
Merged
Migration per the Discovery-Tracking-Baseline SPEC (PR #54, merged 2026-05-20), §3.5 + §4 + §6 P2.

Schema:
- BIGSERIAL primary key
- DATE-unique (one row per day; UPSERT-safe)
- JSONB payload (schema-flexible: V1 fields can grow without ALTER TABLE)
- source_run_status enum (ok/partial/failed) for daily-cron-health visibility

Idempotent (IF NOT EXISTS on table + index).

The baseline row for 2026-05-21 will be INSERTed manually after the migration has been applied (SPEC §3.6: mandatory deadline tonight, so that the delta is measurable tomorrow). The cron script (scripts/discovery_snapshot.py) follows in P3.
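The one-row-per-day guarantee can be illustrated with a minimal sketch. This is not the migration itself: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL, so BIGSERIAL, DATE, JSONB and the enum are approximated with INTEGER, TEXT and a CHECK constraint. The column names id and payload and the helper names are assumptions; snapshot_at and source_run_status come from the description.

```python
import json
import sqlite3

# Stand-in DDL (SQLite dialect). The real migration targets PostgreSQL;
# the types here are approximations, and column names other than
# snapshot_at / source_run_status are assumed for illustration.
DDL = """
CREATE TABLE IF NOT EXISTS discovery_snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    snapshot_at TEXT NOT NULL UNIQUE,
    payload TEXT NOT NULL,
    source_run_status TEXT NOT NULL
        CHECK (source_run_status IN ('ok', 'partial', 'failed'))
)
"""

# A second write for the same day updates the existing row instead of
# inserting a new one.
UPSERT = """
INSERT INTO discovery_snapshots (snapshot_at, payload, source_run_status)
VALUES (?, ?, ?)
ON CONFLICT (snapshot_at) DO UPDATE SET
    payload = excluded.payload,
    source_run_status = excluded.source_run_status
"""


def upsert_snapshot(conn, snapshot_at, payload, status):
    """Write (or refresh) the snapshot row for one day."""
    conn.execute(UPSERT, (snapshot_at, json.dumps(payload), status))
    conn.commit()


def demo():
    """Apply the DDL twice and upsert the same day twice."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.execute(DDL)  # IF NOT EXISTS makes the second apply a no-op
    upsert_snapshot(conn, "2099-12-31", {"errors": ["probe timeout"]}, "partial")
    upsert_snapshot(conn, "2099-12-31", {"errors": []}, "ok")
    return conn.execute(
        "SELECT snapshot_at, payload, source_run_status FROM discovery_snapshots"
    ).fetchall()
```

Calling demo() leaves a single row for the throwaway date, carrying the values of the second write. PostgreSQL accepts the same ON CONFLICT (snapshot_at) DO UPDATE clause.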
HaraldeRoessler pushed a commit to HaraldeRoessler/moltrust-api that referenced this pull request on May 22, 2026:
Discovery-Tracking P3.1 per SPEC docs/specs/2026-05-21_discovery-tracking-baseline-SPEC.md §3.5 + §5.2.

Self-contained daily cron script. Captures 5 sources into the discovery_snapshots table (migration in PR MoltyCel#55):

- self_probes: GET 4 Discovery surfaces (sitemap.xml URL-count, llms.txt MoltGuard-block, /guard/openapi.json path-count, /extendedAgentCard MoltGuard-extensions)
- bot_hits: parse /var/log/nginx/access.log* (last 7d), bot-UA × endpoint-class. moltstack is in the `adm` group, so cron reads the logs without sudo. Privacy §3.7: no IPs persisted, only UA-counts.
- github: GH_TOKEN-authenticated repo + traffic API, 6 MoltyCel repos. Graceful "pat-not-configured" if GH_TOKEN is absent.
- gsc: manual-pending (V0 per §9.1).
- errors: non-fatal failures collected; source_run_status ok/partial/failed computed accordingly.

Idempotency: UPSERT ON CONFLICT (snapshot_at) DO UPDATE. Repeated same-day runs refresh the row and never create a second one. The DB literal is dollar-quoted ($disco$), which is injection-safe without escaping as long as the payload never contains the $disco$ tag itself.

Alerts: Telegram on partial/failed status (TELEGRAM_BOT_TOKEN/CHAT_ID from ~/.moltrust_secrets).

Flags:
- --dry-run: assemble + print, no DB write
- --date YYYY-MM-DD: override snapshot_at (backfill / throwaway test)

Test run verified 2026-05-21 against throwaway date 2099-12-31: 4/4 probes, 16 bots / 1664 hits, 6/6 GitHub repos, upsert ok, throwaway row deleted, baseline 2026-05-21 untouched.

Crontab entry (server-side, NOT repo-managed per CLAUDE.md §Geltungsbereich; applied manually post-merge with audit note):

    30 0 * * * set -a && source /home/moltstack/.moltrust_secrets && set +a \
      && cd /home/moltstack/moltstack \
      && /home/moltstack/moltstack/venv/bin/python scripts/discovery_snapshot.py \
      >> logs/discovery_snapshot.log 2>&1
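The bot_hits source from the commit message can be sketched as follows: count hits per bot user agent and endpoint class, and never store the client IP. This is only an illustration under stated assumptions, not the code in scripts/discovery_snapshot.py. The nginx "combined" log format, the bot marker list, the endpoint-class labels and all function names are hypothetical.

```python
import re
from collections import Counter

# Assumes nginx "combined" format:
# ip - user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
# The IP is matched but deliberately not captured.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Hypothetical marker list; the script's real bot list is not in the source.
BOT_MARKERS = ("GPTBot", "ClaudeBot", "Googlebot")

# Hypothetical endpoint classes, reusing paths named for self_probes.
ENDPOINT_CLASSES = (
    ("/sitemap.xml", "sitemap"),
    ("/llms.txt", "llms"),
    ("/guard/openapi.json", "openapi"),
)


def classify_bot(user_agent):
    """Return the first matching bot marker, or None for non-bot traffic."""
    lowered = user_agent.lower()
    for marker in BOT_MARKERS:
        if marker.lower() in lowered:
            return marker
    return None


def classify_endpoint(path):
    """Map a request path to a coarse endpoint class."""
    for prefix, label in ENDPOINT_CLASSES:
        if path.startswith(prefix):
            return label
    return "other"


def count_bot_hits(lines):
    """Count hits keyed by 'bot|endpoint-class'; IPs are never retained."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        bot = classify_bot(match.group("ua"))
        if bot is None:
            continue  # ignore non-bot user agents
        counts[f"{bot}|{classify_endpoint(match.group('path'))}"] += 1
    return dict(counts)
```

Because only the aggregated counter is returned, the resulting dictionary can go into the JSONB payload without carrying any address data.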
Summary
P2 sub-step (a) per the Discovery-Tracking-Baseline SPEC (PR #54, merged): new table discovery_snapshots for daily Discovery-surface snapshots.
Small and focused: only CREATE TABLE + INDEX + COMMENTs. No code change. Idempotent (IF NOT EXISTS).
Schema
JSONB-payload per SPEC §3.5 (5 Sub-Sections: self_probes, gsc, bot_hits, github, errors).
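A minimal sketch of how the five payload sub-sections and the source_run_status value could fit together. The SPEC's actual status rule is not quoted in this PR, so the rule below is an assumption: no errors means ok, errors with at least one data section present means partial, and errors with no data at all means failed. The function names are hypothetical.

```python
import json

# The five sub-sections named in SPEC §3.5.
SECTIONS = ("self_probes", "gsc", "bot_hits", "github", "errors")


def build_payload(results, errors):
    """Assemble the payload dict; sections without a result are set to None."""
    payload = {name: results.get(name) for name in SECTIONS if name != "errors"}
    payload["errors"] = list(errors)
    return payload


def derive_status(payload):
    """Assumed rule: ok without errors, partial with some data, else failed."""
    if not payload["errors"]:
        return "ok"
    data_sections = [payload[name] for name in SECTIONS if name != "errors"]
    if any(section is not None for section in data_sections):
        return "partial"
    return "failed"


def to_json(payload):
    """Serialize the payload as it would be handed to the JSONB column."""
    return json.dumps(payload, sort_keys=True)
```

Keeping errors inside the payload means the status value stays a cheap summary, while the reasons for a partial or failed run remain queryable in the same row.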
Pre-Commit-Diff (§8)
Exactly 1 file, following the established migrations/YYYY-MM-DD_<desc>.sql naming scheme; nothing out of scope.
Branch-Hygiene (§11.4)
Branch cut from a fresh origin/main (cd1b0e5, 0 behind), worktree ~/moltrust-api-I.
§2.3 Cross-Review
Skipped: pure schema definition, no auth/credential/token path. The table is a container for our own aggregated metrics.
Test plan
- psql -d moltstack -f migrations/2026-05-21_create_discovery_snapshots.sql
- \d discovery_snapshots shows 5 columns + index + comments