Skip to content

feat: PostgreSQL push sync and read-only server mode#167

Merged
wesm merged 35 commits intowesm:mainfrom
tlmaloney:feat/pg-push
Mar 18, 2026
Merged

feat: PostgreSQL push sync and read-only server mode#167
wesm merged 35 commits intowesm:mainfrom
tlmaloney:feat/pg-push

Conversation

@tlmaloney
Copy link
Contributor

@tlmaloney tlmaloney commented Mar 13, 2026

Summary

Adds PostgreSQL support as an optional remote backend for multi-machine shared access to session data. Three new commands under pg:

  • pg push — one-way sync from local SQLite to PostgreSQL. Incremental by default (only sessions modified since last push), with --full to re-push everything. Batched transactions (50 sessions/tx) with per-session retry fallback on batch failure. Multi-row INSERT for messages and tool calls. Fingerprint-based change detection skips unchanged sessions on incremental pushes.
  • pg serve — read-only HTTP server backed by PostgreSQL instead of SQLite. Same API and frontend, no local SQLite dependency. Write operations (stars, pins, rename, etc.) return ErrReadOnly; the frontend adapts accordingly.
  • pg status — show sync state (last push time, session/message counts).

Implementation details

  • New internal/postgres/ package: connection management, SSL enforcement, schema DDL with version tracking, push sync, read-only store with sessions/messages/search/analytics
  • db.Store interface extracted from db.DB so the HTTP server works identically against SQLite or PostgreSQL
  • Config migrated from JSON to TOML (~/.agentsview/config.toml), with automatic migration of existing JSON configs
  • SSL enforced for non-loopback PG connections by default; allow_insecure = true under [pg] overrides with a warning
  • All PG connections force TimeZone=UTC for consistent date filtering
  • Coherence check on push: detects when PG schema was dropped externally (local watermark set but PG has 0 sessions) and auto-forces full re-push with schema recreation
  • UTF-8 sanitization before PG insert (null bytes, truncated multi-byte sequences stripped from content fields)
  • Rune-aware string truncation in parsers to avoid splitting multi-byte characters
  • --base-path flag for reverse-proxy subpath deployment with <base href> injection
  • public_origins config expands the host allowlist for trusted reverse-proxy origins
  • Docker-compose infrastructure for PG integration tests, with CI integration via GitHub Actions service container

Not in scope (follow-up issues)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (5c83d38)

Summary Verdict: This PR successfully introduces push-only
PostgreSQL sync, a PostgreSQL-backed read-only server mode, and a machine filter, but requires several fixes related to data synchronization integrity, query semantics, and API response formatting to ensure parity with the existing SQLite implementation.

Medium Severity

1. Weak Change-Detection Heuristic in Sync

  • Files:
    internal/pgsync/push.go (line 360)
  • Description: The pushMessages change-detection heuristic treats a session as unchanged when the message count, tool-call count, and length aggregates (sum, max, min of content length) match. Rewriting a message
    to different text of the same length, or changing tool metadata/input with unchanged lengths, will silently skip updates and leave stale or potentially sensitive redacted content in the remote database.
  • Recommendation: Replace the length-based heuristic with a stable content digest (e.g., a per-message/tool hash or session
    -level cryptographic hash), or force a full message re-push whenever the parsed file modification time/hash changes.

2. Incomplete Date Handling in PG Sync

  • Files: internal/pgsync/schema.go, internal/pgsync/push.go, internal/pg db/sessions.go, internal/pgdb/analytics.go
  • Description: The PostgreSQL path drops created_at entirely and uses started_at as the only fallback date. Existing SQLite queries consistently use COALESCE(started_at, created_at). After
    sync, sessions with no StartedAt will sort incorrectly, fail date filters, and disappear from analytics in PG read-only mode.
  • Recommendation: Store and sync created_at in PostgreSQL, and use the same COALESCE(started_at, created_at) semantics as SQLite
    . Add an integration test for sessions missing StartedAt.

3. Broken Phrase Search in PG Read-Only Mode

  • Files: internal/pgdb/messages.go
  • Description: Multi-word search is broken in PG read-only mode. The server quote-wra
    ps phrase searches for SQLite FTS, but the PG backend performs a literal ILIKE '%' || $1 || '%'. A query like hello world reaches PG as "hello world", which will not match normal message text unless the quotes are literally present in the content.
  • Recommendation: Bypass
    prepareFTSQuery for the PostgreSQL backend or strip the layer of surrounding quotes inside pgdb.Search. Add a test covering phrase search against PG.

4. Null Slice Serialization Crashing Frontend UI

  • Files: internal/pgdb/sessions.go (lines ~434, ~
    480), internal/pgdb/messages.go (lines ~102, ~173)
  • Description: Query methods like GetMachines, GetProjects, GetMinimapFrom, and Search declare their return slices as var slice []Type. If
    there are no rows, these remain nil and serialize to JSON as null, which causes frontend UI crashes when properties like .length are evaluated.
  • Recommendation: Initialize these slices with empty literals (e.g., machines := []string{}) to ensure they always serialize to JSON arrays (
    []).

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tlmaloney
Copy link
Contributor Author

Picking up the conversation from #108, I asked Claude to compare against roborev:

Here's a comparison of the two implementations:

## Architecture

| Aspect | agentsview | roborev |
|--------|-----------|---------|
| **Direction** | Push-only (SQLite → PG) | Bidirectional (push + pull) |
| **Read-only PG server** | Yes (--pg-read flag) | No — all access through daemon |
| **PG schema** | agentsview.* | roborev.* |
| **Connection** | database/sql + pgx/stdlib | pgxpool (native pgx pool) |

## Sync Strategy

| Aspect | agentsview | roborev |
|--------|-----------|---------|
| **Change detection** | Watermark timestamp (last_push_at) + content fingerprint heuristic | synced_at vs updated_at per-row |
| **Push granularity** | Full session replacement (delete-then-insert messages/tool_calls) | Row-level upsert with ON CONFLICT DO UPDATE |
| **Batching** | 1 session at a time, all its messages/tool_calls in one tx | 25 items per batch, loops until empty |
| **Default interval** | 15 min | 1 hour |
| **Shutdown behavior** | No final push | Final push loop before exit |

## Schema & PKs

| Aspect | agentsview | roborev |
|--------|-----------|---------|
| **Session PK** | id TEXT (simple) | uuid UUID (unique) + serial id |
| **Machine tracking** | machine column on sessions (metadata only) | source_machine_id on every entity, used for push/pull filtering |
| **Schema migrations** | Advisory lock + versioned DDL | schema_version table, 7 versions, non-destructive |

## Multi-Machine

| Aspect | agentsview | roborev |
|--------|-----------|---------|
| **Pull support** | None — uses read-only PG server mode instead | Full pull with cursor-based pagination |
| **Conflict resolution** | Last-write-wins (upsert whole session) | Last-write-wins with per-field COALESCE |
| **Cross-machine visibility** | Via the PG-backed web UI | Via pull into each machine's local SQLite |
| **Echo prevention** | Not needed (push-only) | source_machine_id != local_machine_id filter on pull |

## Key Differences in Approach

1. **roborev is truly bidirectional** — each machine pulls others' data into its own SQLite, so every machine has a complete local copy. agentsview takes a simpler approach: push to PG, then view PG data through a read-only web server.

2. **roborev tracks sync state per-row** (synced_at on each job/review/response). agentsview uses a single watermark timestamp and content fingerprints to detect changes — simpler but less precise (can miss changes if timestamps aren't monotonic).

3. **roborev does row-level upserts** with ON CONFLICT DO UPDATE preserving specific fields via COALESCE. agentsview does delete-then-insert for messages and tool_calls — simpler but less efficient for large sessions with small changes.

4. **roborev has a final push on shutdown** ensuring nothing is lost. agentsview doesn't — if the process dies between syncs, recent changes wait for the next push.

5. **roborev's test suite is more comprehensive** for multi-machine scenarios (multiplayer, offline/reconnect, same-commit contention). agentsview's PG tests focus on single-machine push correctness.

### What agentsview could adopt from roborev:
- **Per-row sync tracking** (synced_at) for more precise change detection
- **Final push on shutdown** to avoid data loss
- **Connection pooling** via pgxpool instead of database/sql
- **Exponential backoff** on connection failures (roborev does 1s → 5m)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (7f85f09)

Summary: The PR introduces several critical issues in
the new PostgreSQL paths, including a build break, broken pagination, a SQL scanning error, and potential credential leakage risks.

High

  • db.Store Interface Mismatch
    The new db.Store interface is not wired through consistently, which appears to introduce a build break.
    internal/db/store.go
    adds GetStats/GetProjects/GetAgents/GetMachines methods that take excludeOneShot bool, and internal/server /server.go switches Server.db to db.Store. However, the server handlers still call the old one-argument forms in internal/server/events .go. The same mismatch exists with the var _ Store = (*DB)(nil) assertion versus the unchanged *DB method signatures.
    Suggested fix: Keep the store
    interface signatures aligned with existing callers, or plumb the extra boolean through every implementation and call site.
  • Pagination Bug in PG Sessions
    In [internal/pgdb/sessions.go:348](/home/roborev/.roborev/clones/wesm/agentsview
    /internal/pgdb/sessions.go#L348), ListSessions orders and paginates by a fallback expression (COALESCE(NULLIF(ended_at,''), NULLIF(started_at,''), created_at)), but when building NextCursor it initializes ea to "" instead of last.CreatedAt. If a session lacks both StartedAt and EndedAt, the cursor value remains empty. Subsequent paginated queries will evaluate COALESCE(..., created_at) < "" and return 0 rows, prematurely ending pagination.
    Suggested
    fix:
    Build the cursor from the exact same fallback expression used in ORDER BY (e.g., change ea := "" to ea := last.CreatedAt).
  • Unsafe Scan of Nullable Timestamp
    In [internal/pgdb/analytics.go:1210](/home/
    roborev/.roborev/clones/wesm/agentsview/internal/pgdb/analytics.go#L1210) (queryVelocityMsgs), the SQL query selects timestamp (a nullable column) without COALESCE, but rows.Scan attempts to
    map it into a primitive string variable. If any message has a NULL timestamp, scanning will fail with a conversion error and break the velocity analytics endpoint.
    Suggested fix: Change timestamp to COALESCE(timestamp, '') in the SELECT clause, matching the pattern
    used in GetMessages.

Medium

  • Credential Leakage via Connection Errors
    Unredacted PostgreSQL connection errors can leak credentials into logs and CLI output. The new PG connect paths wrap and surface raw driver errors from sql.Open/Ping without redaction. If the DSN
    contains secrets, parse/auth/connect failures can expose them.
    Suggested fix: Redact DSNs before wrapping/logging connection errors.
    Locations:
    • [internal/pgdb/pgdb.go:17](/home/roborev/.roborev/
      clones/wesm/agentsview/internal/pgdb/pgdb.go#L17)
    • [internal/pgsync/pgsync.go:51](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync
      /pgsync.go#L51)
    • cmd/agentsview/main.go:686

cmd/agentsview/sync.go:141

  • Incorrect Analytics Under Time Filters
    In internal/pg db/analytics.go, GetAnalyticsTopSessions can return incorrect results when hour/day filters are active. The query applies LIMIT 200 before
    the later client-side timeIDs/local-date filtering. If many of the top 200 rows are filtered out afterwards, matching sessions beyond 200 are never considered, resulting in an incomplete or incorrect "top 10".
    Suggested fix: Push the time-of-day filter into
    SQL, or remove the SQL limit and apply limiting only after the client-side filtering/sort.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (ca93484)

Summary: This PR adds PostgreSQL push sync and a read-only server mode, but introduces several medium-severity logic and security issues in the new paths.

Medium

  • **Credential leakage in PostgreSQL error paths
    **
    Files: internal/pgdb/pgdb.go:18, internal/pgdb/pgdb.go:2 8, [internal/pgsync/pgsync.go:16](/home/roborev/.roborev/clones/
    wesm/agentsview/internal/pgsync/pgsync.go#L16), [internal/pgsync/pgsync.go:55](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync/pgsync.go#
    L55)
    Description: The new redactDSN() helpers rely on url.URL.Redacted(), which only masks credentials in the URL authority, not in query parameters. DSNs like postgres://...?...&password=${PGPASS} will expose the database password in terminal
    output or collected logs upon connection failure.
    Suggested Fix: Do not include the DSN in error messages, or explicitly scrub sensitive query parameters (e.g., password, passfile) before logging.

  • Status check mutates database schema
    File: cmd/agents view/sync.go
    Description: runPGSync() makes -pg-status a mutating operation by unconditionally calling ps.EnsureSchema(ctx ) before checking the command mode. This causes agentsview sync -pg-status to fail for read-only credentials and creates schema objects as a side effect.
    Suggested Fix: Only run EnsureSchema on the -pg push path. Make the status path tolerate an uninitialized schema by
    reporting "not initialized" or "never"/0/0.

  • Flawed skip heuristic in push sync
    File: [internal/pgsync/push.go](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync/
    push.go)
    Description: pushMessages() skips updates based only on message count, content lengths, and tool-call counts. This misses real changes when content is rewritten without changing lengths, or when metadata-only tool-call changes occur (e.g., tool_name, input_ json).
    Suggested Fix: Compare a stable hash/fingerprint over actual message content plus tool-call metadata, or fall back to a full re-push whenever the session fingerprint changes.

  • Read-only mode skips standard server startup
    File: cmd/agentsview/main .go
    Description: runServePGRead() routes PGReadURL through a separate path that returns before the normal serve startup flow, starting the
    server directly with http.ListenAndServe. This drops normal browser-open behavior and bypasses shared listener/proxy setup.
    Suggested Fix: Factor the common HTTP/proxy/browser startup into one shared path and use it for both SQLite and PG read-only modes.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (9164577)

Verdict: The PR implements core PostgreSQL sync
and read-only server functionality, but requires fixes for several medium-severity issues spanning read-only enforcement bypasses, missing query parity, and incomplete state synchronization.

Medium

1. PG read-only mode bypasses normal serve bootstrap

  • Location: cmd/agentsview/main.go#L
    183
    , [main.go#L235](/home/roborev/.roborev/clones/wesm/agentsview/cmd
    /agentsview/main.go#L235), main.go#L342, main.go#L68
    5
  • Description: The early return in main.go jumps straight into runServePGRead, which means the later remote-
    access setup and managed Caddy startup never run for -pg-read. In practice, remote_access: true will not auto-rebind or provision auth, and proxy.mode: caddy will validate but never actually start Caddy.
  • Suggested Fix: Route PG-backed serving
    through the same common startup path as SQLite, or duplicate the remote/proxy bootstrap inside runServePGRead. Add tests for pg-read + remote_access and pg-read + proxy.mode=caddy.

2. PG session list misses IncludeChildren parity

Location: internal/pgdb/sessions.go#L78, [internal/db/sessions.go#L299](/home
/roborev/.roborev/clones/wesm/agentsview/internal/db/sessions.go#L299)

  • Description: The PostgreSQL session list does not implement IncludeChildren, so PG read mode changes sidebar/session filtering semantics versus SQLite. The PG path hard
    codes root-only predicates, while the SQLite path has explicit IncludeChildren logic. Since the frontend always requests include_children=true, subagent/fork sessions will disappear in PG mode and grouped ancestry behavior will regress.
  • Suggested Fix: Port the SQLite IncludeChildren query semantics to
    PGDB.ListSessions and add parity tests mirroring the existing SQLite filter tests.

3. Incomplete read-only enforcement for session uploads

  • Location: [internal/server/upload.go:174](/home/roborev/.roborev/clones/wesm
    /agentsview/internal/server/upload.go), internal/server/upload.go:201
  • Description: The new PostgreSQL read-only mode
    does not actually make POST /api/v1/sessions/upload read-only. The handler saves the attacker-controlled file to disk and parses it before it ever hits the db.ErrReadOnly check. In read-only PG mode, an authenticated remote client can still create or overwrite files under
    ~/.agentsview/uploads/<project>/<filename>, breaking the intended authorization boundary for read-only deployments and potentially allowing for persistent file drop or disk consumption.
  • Suggested Fix: Reject uploads before any filesystem or parser work when s.db.ReadOnly() is true, or unregister /api /v1/sessions/upload entirely in PG read-only mode. The guard needs to happen before saveUpload is called.

4. Incorrect substring search behavior in PG read mode

  • Location: internal/pgdb/messages.go:142-154

Description: In Search(), the query passes the escapeLike-escaped string ($1) to POSITION(LOWER($1) IN ...). Since POSITION searches for literal substrings (not LIKE patterns), searching for terms with metacharacters (e.g., \ ) will look for the escaped version (e.g., \\) and fail to center the snippet on the match.

  • Suggested Fix: Pass the unescaped searchTerm as a separate bind parameter (e.g., $2) for use in the POSITION calls.

5
. Permanently deleted sessions are not tracked or synced to PG

  • Location: internal/db/sessions.go:1141, internal/pgsync/push.go:61
  • Description: If a user permanently deletes a session locally (e.g., Empty
    Trash) before a PG sync cycle runs, the row is removed from SQLite. ListSessionsModifiedBetween won't return it, meaning PG sync is never notified of the deletion and the session remains fully active indefinitely in PostgreSQL.
  • Suggested Fix: Use a tombstone table to track permanently deleted IDs, or document this
    edge-case limitation if acceptable for the push-only model.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (bdfb4fe)

Verdict: The PR successfully introduces PostgreSQL sync and read-only modes,
but contains a few medium-severity logic and routing issues that should be addressed before merging.

Medium

  • internal/pgdb/sessions.go: buildPG SessionFilter's IncludeChildren branch does not actually restrict the direct-match arm to root sessions. relationship_type NOT IN ('subagent', 'fork') is removed from basePreds, but never added back to rootMatch, so child sessions can satisfy the filter on their own
    instead of only being included through a matching parent. That contradicts the comment and changes filter semantics. Suggested fix: make the direct-match side explicitly root-only, e.g. (<root predicate> AND <rootMatch>) OR parent_session_id IN (...).
  • internal/pgsync
    /push.go
    :
    pushMessages can incorrectly skip sessions whose tool-call data changed. The skip heuristic only compares message count/length aggregates plus tool-call
    COUNT(*) and SUM(result_content_length), so changes to tool_name, category, input_json, skill_name, subagent_session_id, or same-length result content are invisible. That leaves stale PG tool-call rows after parser changes or corrected imports
    . Suggested fix: compute a stable fingerprint across all synced tool-call fields, or disable the skip heuristic when tool calls are present.
  • [cmd/agentsview/main.go](/home/roborev/.roborev/clones/wesm/agentsview/cmd/agentsview/
    main.go):
    the new cfg.PGReadURL early return sends read-only mode into runServePGRead, which reimplements startup and calls http.ListenAndServe directly. That bypasses the remainder of runServe, so shared serve initialization later in that function will not apply to PG
    -read mode. In this codebase that is risky because serve behavior includes more than just DB selection. Suggested fix: route PG-read through the same startup path as normal serve, or explicitly reject incompatible options when pg_read is enabled.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (369b351)

Verdict: Several medium
-to-high severity issues found, primarily concerning the security and validation of the new PG-read serve mode, and edge cases in the sync engine.

High Severity

  • [cmd/agentsview/main.go:684](/home/roborev/.roborev/clones/
    wesm/agentsview/cmd/agentsview/main.go#L684)

    The new pg-read serve path explicitly ignores remote_access and managed proxy settings, but it still binds to arbitrary cfg.Host and starts a normal HTTP listener. That creates an
    unauthenticated exposure path for PostgreSQL-backed mode if the operator uses --host 0.0.0.0, a non-loopback config value, or a reverse proxy/tunnel in front of loopback. This violates the repo’s stated security model that any remote-access deployment must require bearer-token
    auth.
    Suggested fix: In pg-read mode, either hard-reject any non-loopback bind and any proxy/remote-access configuration, or preserve the normal remote-access auth path and require bearer-token enforcement before serving requests. At minimum, add validation that pg-read may only run on loopback unless authenticated remote access is fully supported.

Medium Severity

runServe still calls validateServeConfig(cfg) before it branches into runServePGRead, but runServePGRead then says remote_access, proxy, and pg_sync are ignored in PG-read mode. That means an otherwise-ignored invalid proxy/remote-
access config can still prevent PG-read startup.
Suggested fix: Decide on PG-read mode before full serve validation, or add a PG-read-specific validation path that strips ignored settings first.

agentsview sync -pg-status never calls EnsureSchema, while PGSync.Status() immediately queries agentsview.sessions and agentsview.messages. Against a fresh PostgreSQL database, status will fail instead of reporting an empty state.
Suggested fix: Call EnsureSchema on
the status path too, or make Status treat a missing schema as zero sessions/messages.

  • internal/pgsync/push.go, internal
    /db/messages.go

    The message-sync skip heuristic is too weak. It only compares message count, message content-length aggregates, tool-call count, and summed
    result_content_length. Real changes with identical lengths still get skipped, and the watermark advances, so PG can stay permanently stale. Tool metadata changes are especially vulnerable here.
    Suggested fix: Use a stronger fingerprint or hash over message/tool-call content and metadata, or store a per-session
    content/version hash in SQLite and compare that instead.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (e97baea)

Summary Verdict: The PR successfully implements the PostgreSQL push-sync and read-only serving modes, but contains medium-severity issues related to background sync configuration, schema migration back
filling, and insecure transport defaults.

Medium

  • Background PG sync fails to start with documented config
    In cmd/agentsview/main.go#L253, server startup
    only enables periodic PG sync when pgCfg.Enabled && pgCfg.PostgresURL != "". However, internal/config/config.go#L259 only sets Enabled if the config
    explicitly contains "enabled": true. This contradicts the documentation in README.md#L180 where pg_sync is shown without enabled and is said to auto-run.
    Suggested Fix: Treat
    postgres_url presence as sufficient to enable server-side PG sync, or default Enabled to true inside ResolvePGSync() when a URL is present. Add test coverage for automatic startup.

  • Schema migration missing backfill for created_at
    In internal/pgsync/schema.
    go#L92
    , created_at TEXT NOT NULL DEFAULT '' is added to existing agentsview.sessions without populating it from older columns. The new PG reader uses this field for pagination and
    sorting in internal/pgdb/sessions.go#L27 and [internal/pgdb/sessions.go#L301](/tmp/agentsview-review/internal/pgdb/
    sessions.go#L301). For historical rows, created_at='' until re-pushed, causing them to sort incorrectly and fall out of date-filtered views in PG read mode.
    Suggested Fix: Backfill created_at during migration using COALESCE( NULLIF(started_at,''), NULLIF(ended_at,''), updated_at) or force a one-time full re-push. Add an upgrade test from the pre-created_at schema.

  • Insecure PostgreSQL transport allowed and documented
    The new sync and read
    -only modes accept any connection string without validating transport security, and examples explicitly recommend sslmode=disable (internal/pgsync/pgsync.go:54, internal/pgdb/pgdb.go:27, [README.md:181](/home/roborev/.roborev/cl
    ones/wesm/agentsview/README.md), README.md:203, [CLAUDE.md:100](/home/roborev/.roborev
    /clones/wesm/agentsview/CLAUDE.md)). For non-local deployments, this exposes credentials and session data to MITM or sniffing attacks.
    Suggested Fix: Reject sslmode=disable for non-loopback/non-Unix-socket targets unless an explicit
    allow_insecure_pg override is set, or emit a prominent startup warning. Update documentation to prefer sslmode=require or sslmode=verify-full, and document insecure mode as local-dev-only.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (10b059c)

The PR successfully implements PostgreSQL sync and read-only modes, but introduces high-severity authentication configuration issues and medium-severity functional and transport security concerns that need addressing.

High

  • Authentication
    Configuration Mismatch in PG Read Mode

    remote_access is logged as “ignored” in PG read mode, but the flag is never cleared before constructing the server. [cmd/agentsview/main.go#L690](/home/roborev/.roborev/clones/wes
    m/agentsview/cmd/agentsview/main.go#L690) passes the unchanged cfg into server.New, and [internal/server/auth.go#L53](/home/roborev/.roborev/clones/wesm/agentsview
    /internal/server/auth.go#L53) still keys auth behavior off cfg.RemoteAccess. In practice, a config with remote_access: true can make serve -pg-read require a bearer token on localhost, or 403 every API request if no token
    is configured, despite the startup warning saying the setting is ignored.
    Suggested fix: Explicitly zero out cfg.RemoteAccess (and any related auth state) inside runServePGRead, or bypass auth middleware for this mode. Add a server test for -pg-read with remote _access: true.

Medium

  • Missing Live Updates in PG Read Mode
    PG read mode cannot emit live session updates, so viewers won’t refresh when another machine pushes new data. [internal/pgdb/pgdb.go#L158](/home/roborev/.
    roborev/clones/wesm/agentsview/internal/pgdb/pgdb.go#L158) hardcodes GetSessionVersion() to (0, 0, false), and [internal/server/events.go#L55](/home/robore
    v/.roborev/clones/wesm/agentsview/internal/server/events.go#L55) short-circuits the session monitor when s.engine == nil. That leaves /watch as a heartbeat-only no-op in the new shared PG viewer mode, which
    is a functional regression for active sessions.
    Suggested fix: Implement a PG-backed version signal (message_count, max ordinal, or updated_at) or return a clear 501 so the client can fall back instead of silently pretending to watch.

  • Unenforced
    TLS for Remote PostgreSQL Connections

    The new PostgreSQL sync and pg-read paths accept arbitrary connection strings without enforcing TLS for non-local hosts. With pgx/libpq-style parsing, an omitted sslmode uses prefer, which can fall back to an unencrypted connection if TLS is
    unavailable. That can expose database credentials and mirrored session contents to network interception when the PostgreSQL server is remote. See [internal/pgsync/pgsync.go:55](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync/pgsync.
    go) and internal/pgdb/pgdb.go:26.
    Suggested remediation: Parse the DSN before connecting and reject remote/
    non-loopback hosts unless sslmode is require, verify-ca, or verify-full. Only allow disable for loopback or Unix-socket connections, and fail closed otherwise.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (7254492)

Verdict: The implementation of PostgreSQL push sync and read-only mode looks solid overall, but there are a couple of medium-severity issues regarding schema migrations and API response consistency in the new PG-backed paths.

Medium

  • Missing Schema Migration on Startup

    • Files: cmd/agentsview/main.go, [internal/pgsync/schema.go](/
      home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync/schema.go), [internal/pgsync/time.go](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync/time
      .go), internal/pgdb/sessions.go
    • Description: PG read-only startup does not run the shared PG schema migration path, which
      can cause breakage on databases produced by older sync builds. runServePGRead opens PG via pgdb.New(...) and starts serving immediately. Since PG read queries now require columns like created_at, a pre-upgrade PG schema will fail unless a separate writer runs EnsureSchema.
    • **
      Recommendation**: Run pgsync.EnsureSchemaDB(...) or a read-only schema validation/migration step before serving in -pg-read mode.
  • Inconsistent JSON Serialization for Empty Slices

    • File: [internal/pgdb/pgdb.go](/home/
      roborev/.roborev/clones/wesm/agentsview/internal/pgdb/pgdb.go)
    • Description: The read-only PG store returns nil slices for endpoints like ListPinnedMessages and ListInsights. These encode as null in
      JSON instead of [], which can break frontend assumptions (e.g., .length or .map checks). Existing SQLite-backed handlers typically return empty arrays.
    • Recommendation: Consistently return empty slices ([]db.PinnedMessage{}, []db.Insight{}) rather than
      nil.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (daa1b4f)

Code Review Summary: 4 Medium severity findings related to search logic
, sync heuristics, and connection security.

Medium

  • Empty search queries can return all messages
    Files: [internal/pgdb/messages.go#L114](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgdb
    /messages.go#L114)
    stripFTSQuotes() can turn a user query of "" into an empty string, and Search() then executes ILIKE '%' || $1 || '%', which matches every message. The server’s non-empty check happens before this
    normalization, so PG read mode can return a full-database search result set for a syntactically “empty” quoted query.
    Suggested fix: Validate again after stripFTSQuotes()/TrimSpace() and reject or short-circuit empty normalized terms. Add a regression test for q =%22%22.

  • SSE change detection misses message-only updates
    Files: [internal/pgdb/pgdb.go#L218](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgdb/
    pgdb.go#L218), internal/pgsync/push.go#L333
    PG-read SSE
    change detection only keys off sessions.updated_at, but pushMessages() can delete/reinsert messages and tool calls without updating the session row when session metadata is unchanged. That means message-only corrections pushed with -full (or any future parser/data repair that keeps counts stable) will not trigger
    session_updated, so an open PG-read client can keep showing stale content.
    Suggested fix: Bump sessions.updated_at whenever message/tool-call rows are rewritten, or include message/tool-call state in GetSessionVersion(). Add a watch/SSE regression test
    for message-only updates.

  • Inaccurate heuristic for unchanged sessions
    Files: [internal/pgsync/push.go#L381](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync/push.go#
    L381)
    The “unchanged session” heuristic only compares message count, tool-call count, and a few aggregate length stats. It will miss real changes when content is rewritten to the same byte lengths, or when timestamps/flags/tool metadata change without affecting those aggregates. In those cases,
    PG stays stale even though SQLite changed.
    Suggested fix: Replace the aggregate heuristic with a stable hash/fingerprint over normalized message and tool-call rows, or at minimum include timestamps/flags/tool metadata in the comparison. Tests are missing for same-length rewrites and metadata-only changes.

  • Non-local PostgreSQL connections allowed without enforced TLS
    Files: internal/pgdb/pgdb.go#L32, internal/pgsync/pgsync.go#L31
    Both PG read mode and PG push sync only log a warning when
    a non-loopback DSN uses sslmode=disable, prefer, or allow. That means session data and database credentials can still be sent over an unencrypted or downgradeable channel if the user points the tool at a remote PostgreSQL host directly. Given these new features are explicitly for remote/shared PostgreSQL use,
    this is a real confidentiality risk.
    Suggested fix: Reject non-loopback PostgreSQL DSNs unless sslmode is at least require (preferably verify-full), or require an explicit allow_insecure_pg opt-in before permitting insecure modes.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@roborev-ci
Copy link

roborev-ci bot commented Mar 13, 2026

roborev: Combined Review (a0ddf35)

Verdict: The PostgreSQL integration and db.Store abstraction look solid overall, with a couple of
medium-severity issues regarding connection timeouts and configuration validation that need addressing.

Medium

  • internal/pgdb/pgdb.go:
    pgdb.New() uses pg.Ping() with no timeout. A bad -pg-read DSN or a blackholed network path can hang startup indefinitely. pgsync.New() already uses PingContext with a 10s timeout; the read-only path should do
    the same, and ideally pass a bounded context through the follow-on schema probe/migration calls too.
  • [cmd/agentsview/sync.go](/home/roborev/.roborev/clones/wesm/agentsview/cmd/agentsview/sync.go#L1
    23): agentsview sync -pg hard-fails if pg_sync.interval is invalid, even though a one-shot push does not use the periodic interval at all. That makes the manual recovery path fail for a misconfigured background setting. Suggested fix: parse/validate interval only
    for periodic server startup, or pass 0 for one-shot CLI pushes.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@wesm
Copy link
Owner

wesm commented Mar 13, 2026

I'll get to this within a few days, just bear with me

@tlmaloney
Copy link
Contributor Author

I'll get to this within a few days, just bear with me

No rush, I'm still iterating and trying to converge with a passing roborev verdict. I'll follow up here and tag you when ready.

@roborev-ci
Copy link

roborev-ci bot commented Mar 14, 2026

roborev: Combined Review (ee0025b)

Summary Verdict: The PR successfully introduces PostgreSQL sync and read-only serving, but contains medium-severity logical flaws in the push synchronization boundary and watermark handling that could lead to redundant pushes
or permanently missed session updates.

Medium

  • internal/pgsync/push.go in Push/finalizePushState/writePushBoundaryState

    When any session fails, finalizeCutoff is reset to the previous lastPush, but the boundary-state writer only records sessions whose marker equals the cutoff it is given. The successfully pushed sessions in the current (lastPush, cutoff] window therefore do not get fingerprinted, so the next run
    re-pushes them all again. That contradicts the comment and can keep causing redundant full re-pushes as long as one session keeps failing. Suggested fix: persist the boundary fingerprints against the current window cutoff separately from last_push_at, or otherwise keep enough state to suppress already-successful sessions when the
    watermark is intentionally not advanced.

  • internal/pgsync/push.go and [internal/db/sessions.go](/home/roborev/.
    roborev/clones/wesm/agentsview/internal/db/sessions.go)
    in Push/ListSessionsModifiedBetween
    The push watermark is computed before querying candidates, and the lower bound is strict > since. If another local update lands after candidate selection but rounds to the
    same millisecond as cutoff, it is excluded from the current push and then excluded again on the next push because its marker is == last_push_at. Boundary replay only helps sessions that were already pushed, not late arrivals at the same cutoff. That can permanently miss a session update until it changes again.
    Suggested fix: use a consistent snapshot/transaction for selection plus watermarking, or make the next run replay the full cutoff boundary set, not just previously pushed sessions.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tlmaloney
Copy link
Contributor Author

Finding 1 (boundary fingerprints): Fixed in df26d91. writePushBoundaryState now stores fingerprints for ALL pushed sessions, not just boundary ones. Before the push loop, all candidates are checked against stored fingerprints and unchanged sessions are skipped. This prevents redundant re-pushes when the watermark is held back due to errors.

Finding 2 (late arrival at cutoff): Already handled by the boundary replay mechanism. When last_push_at = T1, a session modified at T1 after the query has marker == T1, which is not > T1 so it's excluded from the main query. But the boundary replay queries (T1-epsilon, T1], finds the session, sees no stored fingerprint for it (it was never pushed), and adds it to the candidate set. Verified by tracing the control flow.

@roborev-ci
Copy link

roborev-ci bot commented Mar 14, 2026

roborev: Combined Review (df26d91)

Verdict: The PR successfully implements PostgreSQL push-sync and read-only serve modes, but introduces medium-severity issues related to transport security, connection timeouts, and push sync efficiency.

Medium

  • Insecure default transport for remote
    PostgreSQL connections

    • Files: internal/pgutil/pgutil.go:24, internal/pgsync/pgsync.go
      :56
      , [internal/pgdb/pgdb.go:18](/home/roborev/.roborev/clones/
      wesm/agentsview/internal/pgdb/pgdb.go#L18)
    • Description: The new PostgreSQL sync and pg-read paths only log a warning for insecure remote DSNs, then proceed to connect anyway. This is a fail-open security posture. For non-
      loopback hosts, an unset sslmode or sslmode=prefer/allow can fall back to a non-TLS connection, which exposes PostgreSQL credentials and synced session data to interception or MITM on untrusted networks.
    • Remediation: Reject insecure sslmode values for
      non-loopback hosts by default. At minimum, require sslmode=require; preferably require verify-full for remote hosts. If insecure operation must remain possible, gate it behind an explicit opt-in such as allow_insecure_pg: true and fail startup otherwise.
  • **
    Missing connection timeout during startup**

    • File: internal/pgdb/pgdb.go
    • Description: New() uses pg .Ping() with no timeout. In serve -pg-read, an unreachable or blackholed PostgreSQL endpoint can hang startup indefinitely instead of failing fast.
    • Remediation: Use PingContext with a 10s timeout, matching the existing behavior in pgsync.New(). Add
      a test for startup against a non-responsive PG endpoint.
  • Inefficient partial failure retry logic in push sync

    • File: [internal/pgsync/push.go](/home/roborev/.roborev/clones/wesm/agentsview/internal/pgsync
      /push.go)
    • Description: The "don't re-push successful sessions after a partial failure" logic is gated on last_push_at != "". On the first push, if one session fails, last_push_at stays empty. Consequently, the next run
      ignores the saved boundary fingerprints and re-uploads every successful session again, contradicting the intended behavior and potentially causing repeated large re-pushes until a clean run occurs.
    • Remediation: Read boundary state independently of whether the watermark is empty, or store the last attempted cutoff separately. Add a test for
      an initial push with one failing session.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

tlmaloney pushed a commit to tlmaloney/agentsview that referenced this pull request Mar 14, 2026
…h boundary state

Address three findings from PR wesm#167 combined review:

1. CheckSSL rejects insecure sslmode for non-loopback PG hosts at
   startup (pgdb.New and pgsync.New). AllowInsecurePG config option
   falls back to the existing warning for local dev setups.

2. pgdb.New now uses PingContext with a 10s timeout, matching
   pgsync.New.

3. readPushedFingerprints reads boundary state fingerprints without
   cutoff validation, so partial failures on the first push (when
   lastPush is empty) are still deduplicated on the next cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link

roborev-ci bot commented Mar 14, 2026

roborev: Combined Review (113e32f)

Verdict:
The PR introduces optional PostgreSQL sync and read-only mode, but contains medium-severity issues regarding SSL enforcement bypass and test compilation failures.

Findings

[Medium] PostgreSQL TLS enforcement bypass in DSN parsing

  • Files: internal/pgutil/pgutil.go (lines 2
    2, 56), internal/pgsync/pgsync.go (line 47), internal/pgdb/pgdb.go (line 20)
  • Description: CheckSSL and WarnInsecureSSL use a custom string parser to determine if a remote
    PostgreSQL connection may use sslmode=disable. This parser only handles a narrow subset of DSN syntax. Connection inputs that pgx can resolve through other mechanisms—such as libpq-style keyword strings, repeated parameters, or service-file indirection—can cause host == "" or the wrong host to
    be inspected. In these cases, the code fails open and permits an unencrypted connection to a non-loopback server, exposing synced session data in transit and breaking the intended "non-loopback PG must use TLS" invariant.
  • Suggested Fix: Stop parsing DSNs manually for security decisions. Use
    the same library the driver uses to parse the config, inspect the resolved configuration, and enforce TLS on the effective destination after all indirection is applied. If impractical, fail closed on any DSN format the local parser cannot fully and unambiguously interpret.

[Medium] Test suite compilation failures due to signature changes

Files: internal/pgdb/pgdb_test.go, internal/pgsync/integration_test.go, internal/pgsync/pgsync_test.go

  • Description: The PostgreSQL test suites still call the old constructors after pgdb.New and pgsync. New gained the allowInsecure bool parameter. Calls like pgdb.New(pgURL) and pgsync.New(pgURL, local, "...", time.Hour) will cause compilation failures in the pgtest packages.
  • Suggested Fix: Update all call sites in the tests to
    pass the new allowInsecure boolean parameter, and add a small test covering this new parameter path.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

@tlmaloney
Copy link
Contributor Author

Addressed: Combined Review (113e32f)

[Medium] PostgreSQL TLS enforcement bypass in DSN parsing

Fixed in 39ceaba. Replaced the custom DSN parser with pgconn.ParseConfig from the pgx driver itself. This eliminates all format-specific bypasses (service files, env var indirection, exotic key-value syntax) since we now use the exact same parser the driver uses.

CheckSSL now inspects TLSConfig on both the primary config and all fallback configs, rejecting any connection chain that permits plaintext for non-loopback hosts. This correctly rejects sslmode=disable, allow, and prefer (which have plaintext fallback paths).

The custom ParseSSLParams and KVParam functions have been removed entirely (-154 lines, +55 lines).

[Medium] Test suite compilation failures

False positive. The pgtest-tagged test files (pgdb_test.go, integration_test.go, pgsync_test.go) use sql.Open("pgx", url) directly — they do not call pgdb.New or pgsync.New. Verified with grep and confirmed all tests compile and pass with CGO_ENABLED=1 go test -tags fts5 ./....

@roborev-ci
Copy link

roborev-ci bot commented Mar 14, 2026

roborev: Combined Review (53e5e8b)

Synthesis unavailable. Showing individual review outputs.

codex — default (done)

Reviewing the 58-commit range by tracing the new PostgreSQL sync/read-only paths and the sidebar machine filter changes. I’m checking for behavioral regressions first, then gaps in tests around the new state/watermark logic and PG-backed server mode.
I’ve narrowed this to a few areas where regressions are most likely: PG-backed session queries versus the SQLite behavior they replace, and the sync watermark/fingerprint path that tries to skip redundant pushes after partial failures. I’m pulling exact line references from the repo now.
The working tree doesn’t contain these new files, so I’m treating the pasted diff as the review source of truth and using the existing repo only for baseline comparisons where needed. I’m checking the current SQLite behavior to verify whether the new PG implementations actually match it.
One likely regression is in the PG read-mode watcher path. I’m verifying whether the new DB-only polling code still notifies the client when a watched session disappears or is soft-deleted, because the local path had extra file/sync fallbacks around that.
I’m checking the server-side write endpoints next. Read-only mode added stubs in the new db.Store, but only some handlers were updated to translate ErrReadOnly into a clean 501. Missing one would turn a normal read-only request into an internal error.
I’m checking whether the resolved PostgreSQL DSN ever gets reflected back through the HTTP config endpoints. The serve path now expands env vars into the in-memory config before constructing the server, which could accidentally expose credentials if the server returns that struct anywhere.
I’ve got the concrete findings. I’m pulling the current file locations for the existing code paths so I can attach precise references where the regressions land, then I’ll summarize the branch as a whole.
Summary

This range adds optional PostgreSQL support in two directions: push-only sync from SQLite to PG, and a PG-backed read-only server mode. It also adds a machine filter in the sidebar and ports a large part of the query/analytics surface to a new PG store.

Findings

  • Medium: agentsview sync -pg now fails on an invalid pg_sync.interval even though the one-shot push path never uses the interval. This makes the recovery/manual-export command unusable if someone has a bad background-sync interval in config. The regression is in cmd/agentsview/sync.go inside runPGSync, where time.ParseDuration(pgCfg.Interval) is gated on cfg.PG instead of on “periodic mode”. Suggested fix: only parse/validate the interval when starting periodic sync from serve, not for one-shot sync -pg. Add a CLI test for sync -pg with a malformed interval.

  • Medium: the new machine filter can offer values that can never produce visible results, because the machine list is sourced from an unfiltered distinct-machine query. In SQLite, internal/db/sessions.go GetMachines is just SELECT DISTINCT machine FROM sessions, so deleted sessions, zero-message sessions, and child-only sessions can leak into the dropdown; the new PG implementation mirrors the same mismatch. Since the session list itself filters those rows out, selecting one of these machines yields an empty list and looks broken. Suggested fix: align GetMachines with the same visibility predicate used for sidebar sessions (message_count > 0, deleted_at IS NULL, root-session filtering, and one-shot handling). Add a test that deleted/child-only machines are excluded.

  • Low: PG read-only serve mode drops the normal browser-open behavior. In normal serve, cmd/agentsview/main.go prints the URL and calls openBrowser(url) unless -no-browser is set. The new runServePGRead path prints the URL but never opens the browser, so default UX regresses and -no-browser becomes effectively irrelevant there. Suggested fix: mirror the normal serve path and call openBrowser(srvURL) when !cfg.NoBrowser.


codex — security (done)

Reviewing the PG sync and PG read-mode changes for security-impacting behavior. I’m tracing the new network exposure, auth boundaries, DSN handling, and SQL construction paths, then I’ll reduce that to concrete findings only if there’s an actual security issue.
I’ve narrowed the review to the new trust boundaries: PostgreSQL DSN parsing/TLS checks, PG-backed read-only serving, and dynamic SQL in the PG store. Next I’m validating exact file locations and checking whether any write-capable or unauthenticated paths remain reachable in the new mode.
The working tree doesn’t appear to contain these new PG files yet, so I’m using the supplied diff as primary evidence and checking the repo state only to anchor line references where possible. I’m verifying whether this is a pre-merge review against a patch rather than the checked-out filesystem.
I found one area that may matter: the new TLS guard relies on parsed DSN host information, and this class of code is historically easy to get wrong with PostgreSQL hostaddr semantics. I’m checking whether the chosen parser field actually represents the real network destination before I decide whether this is a concrete issue or a false alarm.
Summary: I did not identify any concrete security issues in the provided diff. The new PostgreSQL paths appear to preserve the repo’s stated threat model: PG read mode is forced to loopback-only, write-capable handlers are blocked in read-only mode, SQL construction in the new PG store is parameterized or restricted to compile-time trusted fragments, and non-loopback PostgreSQL connections are rejected unless the operator explicitly opts into insecure transport.

No issues found.

Residual risk: I could not execute the new PostgreSQL code paths in this workspace because the checked-out tree does not contain the added PG files and the Go toolchain is unavailable here, so this review is based on static analysis of the supplied diff only. The main area worth keeping an eye on in follow-up testing is DSN edge-case handling in internal/pgutil/pgutil.go for unusual PostgreSQL connection-string forms.


gemini — default (done)

Summary: Implement PostgreSQL push sync and read-only serving mode with comprehensive schema handling, chunked query execution, and robust sync state tracking.

Review Findings:

No issues found.

@tlmaloney
Copy link
Contributor Author

Addressed the combined review (53e5e8b) findings:

Fixed:

  • Medium (sync -pg interval): One-shot sync -pg no longer validates pg_sync.interval — the interval is only used by periodic sync, so a bad config value no longer blocks manual export/recovery.
  • Medium (GetMachines visibility): GetMachines now filters message_count > 0 in both SQLite and PG implementations, matching the session list visibility predicate. Added test for zero-message exclusion.

Not applicable:

  • Low (PG read-only browser open): False positive — the normal serve path also does not call openBrowser. The NoBrowser config field is parsed but not referenced in the serve code. No regression.

@wesm
Copy link
Owner

wesm commented Mar 14, 2026

@tlmaloney since this is a large PR it's going to be hard to get a clean review, so it might make sense to pause here so I can have a chance to review some of the high level details

@tlmaloney
Copy link
Contributor Author

@tlmaloney since this is a large PR it's going to be hard to get a clean review, so it might make sense to pause here so I can have a chance to review some of the high level details

Agreed, I am not having luck converging on a passing verdict. I'll pause and happy to squash / resubmit / or whichever direction makes sense.

@wesm
Copy link
Owner

wesm commented Mar 15, 2026

I'm going to cut 0.14.0 before getting to this because there's a good amount of work to do here, mainly around making sure the design / ergonomics are right. I have a lot going on right now so just bear with me!

@tlmaloney
Copy link
Contributor Author

I'm going to cut 0.14.0 before getting to this because there's a good amount of work to do here, mainly around making sure the design / ergonomics are right. I have a lot going on right now so just bear with me!

No worries, I went down a bit of a review-fix rabbit hole and should have stopped earlier to align on approach. Happy to wait for you to free up.

@tlmaloney
Copy link
Contributor Author

@wesm If it's any easier, I can split this PR over a few, here's a suggested partition. By the first commit, there was already 7k new additions. Either way, we need to make sure the approach is right.

PR 1: db.Store interface + SQLite plumbing (~580 lines)
Files:

internal/db/store.go — the Store interface that both SQLite and PG implement
internal/db/db.go, sessions.go, messages.go, schema.sql — sync_state table, local_modified_at, GetAllSessionFingerprints, GetMessageHashes, GetMachines
internal/db/db_test.go, filter_test.go
Why first: Everything else depends on this interface. Zero-risk — adds columns, queries, and an interface that nothing calls yet. Fast to review, fast to merge.

PR 2: PG push sync engine (~2,600 lines)
Files:

internal/pgutil/ — shared DSN parsing, SSL enforcement
internal/pgsync/ — schema DDL, push logic, lifecycle, time utils, all tests
internal/config/config.go — PG sync config fields only
internal/config/config_test.go
go.mod/go.sum (pgx)
Why second: This is the core feature and where all the bugs lived. Depends on PR 1's Store interface and fingerprint queries. At ~2,600 lines it's still large, but it's a single cohesive package with clear boundaries. All the review attention should go here.

If even this feels too big, split it further:

PR 2a: pgutil/ + pgsync/schema.go + pgsync/time.go (~600 lines) — pure utilities, no state
PR 2b: pgsync/push.go + pgsync/pgsync.go + config + tests (~2,000 lines) — the actual engine
PR 3: PG read-only store (pgdb) (~3,460 lines)
Files:

internal/pgdb/ — all files
Why separate: This is the biggest component (3,460 lines), but it's mechanically straightforward — it reimplements the db.Store interface against PostgreSQL. It has no dependency on pgsync at all; it only needs the Store interface from PR 1 and the PG schema from PR 2.

The 1,912-line analytics.go is the elephant. It's largely a port of SQLite analytics queries to PG SQL. Boring to write, boring to review, but big. If you want to shrink this PR further, analytics could be a follow-up PR 3b.

PR 4: PG read-only server mode + CLI (~510 lines)
Files:

cmd/agentsview/main.go — runServePGRead(), pg-read startup
cmd/agentsview/sync.go, sync_test.go — -pg, -pg-status, -full flags
internal/server/ — ReadOnly() guards, pg-read SSE polling, route gating
internal/server/events.go — pollDBOnly path
Why last: This is the glue that wires PR 2 and PR 3 into the running application. Small, focused on plumbing. Depends on everything above.

PR 5: Machine filter (~100 lines)
Files:

frontend/src/lib/components/sidebar/SessionList.svelte
frontend/src/lib/stores/sessions.svelte.ts
frontend/src/lib/stores/sessions.test.ts
internal/db/sessions.go (GetMachines if not already in PR 1)
Why separate: Completely independent feature. Could have merged on day 1 while the PG work was still in progress.

PR 6: CI/docs/cleanup (~100 lines)
.github/workflows/ changes
README.md, CLAUDE.md updates
Makefile, desktop/package.json

@wesm
Copy link
Owner

wesm commented Mar 17, 2026

I'll get there, just please bear with me! I am working on a bunch of parallel projects and want to make sure this PR doesn't get rushed. I have a postgres db that I already use for roborev so I can go through the e2e testing motion

- feat: add push-only PostgreSQL sync
- fix: address review findings for pg push sync
- fix: remove hardcoded PG credentials from integration tests
- feat: add PostgreSQL read-only server mode
- fix: address review findings for PG read-only mode
- feat: add machine filter to session list sidebar
- fix: address PR review findings for PG sync
- fix: address second round of PR review findings
- fix: address third round of PR review findings
- fix: address fourth round of PR review findings
- fix: warn about ignored options in pg-read mode
- fix: add missing space in PG schema updated_at DEFAULT clause
- fix: address sixth round of PR review findings
- docs: add PostgreSQL sync and read-only mode documentation
- fix: address seventh round of PR review findings
- fix: address eighth round of PR review findings
- test: add missing coverage for PG sync config and error handling
- fix: address ninth round of PR review findings
- fix: address tenth round of PR review findings
- fix: improve PG session version signal and DSN parsing
- fix: run schema migration in pg-read startup
- fix: return empty slices instead of nil in PG read-only stubs
- fix: make schema migration non-fatal in pg-read mode
- fix: only suppress read-only PG errors in pg-read migration
- fix: add schema compat probe and typed SQLSTATE detection for pg-read
- fix: close rows in CheckSchemaCompat to prevent connection leak
- refactor: extract shared DSN utilities into internal/pgutil
- fix: gate local-only handlers in pg-read mode and remove dead code
- fix: address branch review findings for pg-read mode
- fix: add ESCAPE clause to ILIKE and fix top sessions time filtering
- fix: use E-string syntax for ILIKE ESCAPE clause
- fix: address third round of PR review findings
- fix: compute duration ranking in Go instead of SQL ::timestamp cast
- fix: use exact duration for ranking, skip unparseable timestamps
- test: add unit tests for top sessions duration ranking
- fix: add missing created_at column to pgdb test DDL
- fix: drop and recreate tables in pgdb test DDL
- fix: check DROP TABLE errors in pgdb test DDL
- fix: use DROP SCHEMA CASCADE in pgdb test DDL
- fix: address full-branch review findings
- fix: add test coverage and make push message errors per-session
- fix: exclude failed sessions from push boundary state
- fix: do not advance last_push_at when sessions fail
- fix: make pushSession errors per-session and add ctx check in analytics
- fix: make all push loop errors per-session skip
- fix: surface PushResult.Errors in CLI and periodic sync output
- fix: defer push error exit until after pg-status output
- fix: reject empty search term in PG ILIKE search
- fix: fingerprint all pushed sessions to avoid redundant re-pushes
- fix: enforce SSL for non-loopback PG, add ping timeout, fix first-push boundary state
- fix: address review findings for SSL enforcement and push dedup
- fix: hostaddr bypass, fingerprint merge on repeated failures, error propagation
- test: add coverage for fingerprint merge and loopback hostaddr override
- fix: use pgconn.ParseConfig for DSN parsing in SSL enforcement
- test: re-add default sslmode test case for remote PG hosts
- fix: add missing allowInsecure arg to pgtest-tagged test files
- fix: add missing allowInsecure arg to remaining pgsync test calls
- fix: add missing allowInsecure arg to TestNewRejectsEmptyURL
@wesm
Copy link
Owner

wesm commented Mar 17, 2026

I'm working on refactoring this to make some structural changes, so don't push any more work. We need containerized integration tests so I'm working on that too

@tlmaloney
Copy link
Contributor Author

I'm working on refactoring this to make some structural changes, so don't push any more work. We need containerized integration tests so I'm working on that too

Sounds good! Thank you. I spun up a Dockerized postgres, but didn't add that to the repo.

wesm and others added 4 commits March 17, 2026 16:29
- Use min() builtin instead of if-clamp in messages.go
- Use maps.Copy instead of manual loop in push.go

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- pg.url (was postgres_url) in bare env var warning
- pg.machine_name (was pg_sync.machine_name) in reserved name error
- allow_insecure under [pg] (was allow_insecure_pg) in SSL error
- Rename misleading "legacy" test names to describe actual behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Force UTC timezone on all PG connections via DSN parameter,
  and use AT TIME ZONE 'UTC' in DATE() filters for defense in depth
- Replace COALESCE(timestamp, NOW()) with NULLS LAST for stable
  search result ordering
- Use COALESCE(updated_at, created_at) instead of NOW() for
  stable session version detection
- Disable remote_access in pg serve (read-only, has own --host flag)
- Fix --full flag help text to describe both local resync and PG push

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- CLAUDE.md: update config description (TOML, not legacy migration)
- CLAUDE.md: architecture diagram says "pg serve" not "pg-read mode"
- README.md: pg serve examples use CLI flags, not positional DSN arg
- README.md: clarify loopback is a default, not enforced

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link

roborev-ci bot commented Mar 17, 2026

roborev: Combined Review (007b031)

Summary Verdict: All agents agree the code is clean.

No Medium, High, or Critical issues were identified by any of the reviewing agents. (Note: Some agents experienced environment restrictions that limited their ability to fully inspect the local diff
, but no actionable findings were reported based on the accessible context).


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

wesm and others added 3 commits March 17, 2026 17:19
- When pg serve binds to a non-loopback host, enable remote access
  and auto-generate an auth token (printed to stdout). Keeps
  remote access off for localhost binding.
- Add session_id and ordinal tie-breakers to search ORDER BY for
  deterministic pagination.
- Update README to document pg serve auth behavior accurately.
- CLAUDE.md: mention JSON migration in config description.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reuse the existing isLoopbackHost helper (which uses
net.ParseIP().IsLoopback()) instead of comparing three string
literals. Handles 127.0.0.2 and other loopback addresses correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Write SchemaVersion (currently 1) to sync_metadata on
EnsureSchema. Only bumps forward, never downgrades. Add
GetSchemaVersion helper for future migration detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@roborev-ci
Copy link

roborev-ci bot commented Mar 17, 2026

roborev: Combined Review (e772a02)

Summary: Refactors PostgreSQL sync functionality, migrates configuration to TOML, and updates host binding and remote access checks
.

All agents agree the code is clean; no issues were found.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

wesm and others added 7 commits March 17, 2026 20:19
pg serve now accepts --base-path (e.g. --base-path /agentsview) for
deployment behind a reverse proxy on a subpath. When set:

- http.StripPrefix removes the prefix before routing, so all
  existing routes work unchanged
- index.html is rewritten at serve time to prefix asset paths
  (href="/assets/..." -> href="/agentsview/assets/...") and
  inject a <base href="/agentsview/"> tag
- Frontend derives API base from document.baseURI instead of
  hardcoding /api/v1, so API calls go through the proxy path

Hash-based SPA routing (#/sessions, #/analytics) is unaffected
since the hash fragment never hits the server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Group sessions into batches of 50 per transaction instead of
  one transaction per session
- Use multi-row VALUES for message inserts (100 per batch) and
  tool_call inserts (50 per batch) instead of row-by-row
- Fix batch error handling: track per-batch pushed/message counts
  separately and undo correctly on rollback or commit failure
- Add integration tests for batch push (75 sessions across 2
  batches) and bulk message insert (250 messages with tool calls)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Guard StripPrefix with full path-segment check so /basepathFOO
  returns 404 instead of being handled as a sibling path
- Only derive frontend API base from document.baseURI when a real
  <base href> tag exists; fall back to /api/v1 otherwise so SPA
  fallback pages on non-root URLs produce correct API paths
- Add TestBasePath_RejectsSiblingPath test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rite

- Extract pushBatch method from inline batch loop
- On batch failure, retry each session individually so one bad
  session doesn't block up to 49 neighbors in the same batch
- Add explicit UPDATE sessions SET updated_at = NOW() after
  pushMessages rewrites rows, ensuring updated_at is bumped even
  when pushSession's upsert is a no-op (metadata unchanged)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
buildAllowedHosts only read from public_url (singular), so
public_origins whitelisted CORS but the host check middleware
still rejected requests with the origin's Host header. This
broke reverse-proxy deployments (e.g. Caddy on a tailnet) where
the proxy forwards the external hostname.

Now buildAllowedHosts also iterates over PublicOrigins and calls
addHostHeadersFromOrigin for each, so any origin trusted for
CORS is also trusted as a Host header value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PostgreSQL strictly enforces valid UTF-8. Tool call content from
parsed sessions can contain null bytes (0x00) and truncated
multi-byte sequences (e.g. an ellipsis character cut after 1 of
3 bytes), causing bulk insert failures on 8 sessions.

Three fixes:
- Add sanitizePG() that strips null bytes and replaces invalid
  UTF-8 via strings.ToValidUTF8; apply it to all text fields
  going into PG (nilStr, nilIfEmpty, message content)
- Fix truncate() in parser and truncateStr() in export to
  operate on rune boundaries instead of byte offsets, preventing
  creation of invalid UTF-8 at truncation points
- Clarify watermark behavior: advance only on zero errors,
  keep at lastPush on errors so failed sessions are retried
  (already-pushed sessions are fingerprint-skipped cheaply)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Review #10271: pushBatch no longer increments result.Errors — errors
are only counted for sessions that fail their individual retry after
a batch rollback. Extracted batchResult struct to separate concerns.

Review #10277: HTTPS host header test now covers both bare hostname
(viewer.example.test) and explicit port (:443) since browsers omit
the default port.

Review #10279: nilStr and nilIfEmpty now sanitize before checking
emptiness, so inputs like "\x00" correctly return nil instead of "".
Sanitization scoped to content-carrying fields (project, message
content, nullable text) — IDs, machine names, roles, and agent
names are tool-generated ASCII.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@wesm
Copy link
Owner

wesm commented Mar 18, 2026

Getting close here. I ran into a cascade of problems pushing my agent session data to postgres and running agentsview pg serve behind a caddy proxy on my tailnet, so I'm fixing those issues now

@tlmaloney
Copy link
Contributor Author

Getting close here. I ran into a cascade of problems pushing my agent session data to postgres and running agentsview pg serve behind a caddy proxy on my tailnet, so I'm fixing those issues now

Thank you!! Looking at the first commits is super informative, I see the spec you created and refined. I am curious about your process.

I ran the integration tests and all pass. I also ran roborev branch reviews with Claude/Codex and they surfaced some high/medium issues. Is CI just a HEAD review? I'll share more tomorrow as you're still working on the frontend issues and it's getting late.

wesm and others added 4 commits March 17, 2026 21:20
When the local watermark says we've pushed before but PG has zero
sessions for this machine, the PG side was reset (schema dropped,
DB recreated, etc.). Push now detects this mismatch and
automatically forces a full resync instead of only pushing
recently modified sessions.

Add TestPushDetectsSchemaReset integration test that pushes a
session, drops and recreates the PG schema, then verifies an
incremental push auto-detects the reset and re-pushes everything.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add sanitizePG to tool_name, category, and tool_use_id in
  bulkInsertToolCalls to handle null bytes and invalid UTF-8
- Clear schemaDone memoization in coherence check so Push can
  recreate the schema after it has been dropped externally
- Call normalizeSyncTimestamps after schema reset detection to
  ensure timestamps are set up before schema recreation
- Fix TestPushDetectsSchemaReset to exercise the real reset path
  instead of manually calling EnsureSchema (which was memoized)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When Push(ctx, true) is called after the PG schema has been dropped,
schemaDone was still memoized from the initial push, causing
EnsureSchema to skip and inserts to fail against missing tables.

Clear schemaDone and re-run normalizeSyncTimestamps when full is
true, not only in the coherence check path.

Add TestPushFullAfterSchemaDropRecreatesSchema regression test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
UpsertSession takes no context parameter. Fixes compile error in
TestPushFullAfterSchemaDropRecreatesSchema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@wesm
Copy link
Owner

wesm commented Mar 18, 2026

I see the spec you created and refined. I am curious about your process.

I took a look at the API and the code structure and proposed some changes that I preferred, and used obra/superpowers to create and execute the spec, then used roborev to address bugs and problems left behind by the superpowers implementation pass (because superpowers doesn't do much review aside from evaluating spec conformance, so they two systems actually work very well together).

the CI reviews are just branch reviews with roborev (2 reviews with codex and 1 with gemini)

@roborev-ci
Copy link

roborev-ci bot commented Mar 18, 2026

roborev: Combined Review (9f3fa5b)

Summary: The PR implements significant PostgreSQL and configuration updates but introduces high-severity issues with absolute asset paths that will break reverse-proxy sub
path deployments.

High

  • frontend/index.html:6
    • Problem: Absolute path used for the favicon (href="/favicon.svg"). When the application is deployed under a reverse-proxy subpath, this will resolve to the domain root and result in a 4
  • Fix: Change to a relative path (e.g., href="./favicon.svg") or have the backend dynamically inject a <base href="..."> tag based on the --base-path flag.

  • frontend/index.html:11

    • Problem: Absolute path used for the module script (src="/src/main.ts"). This will similarly break the application when served from a dynamic runtime subpath, as Vite will output absolute paths without relative base configuration.
    • Fix: Update to a relative path (src="./src /main.ts") and ensure the Vite build configuration (vite.config.ts) uses a relative base (e.g., base: './'), or inject the runtime base path into the HTML.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

- Use relative paths in index.html (./favicon.svg, ./src/main.ts) so
  assets resolve correctly under reverse-proxy subpath deployments
- Set Vite base to "./" so built output uses relative asset paths
- CI stub: create index.html (not stub.html) so base-path tests that
  serve the SPA root find a valid HTML document
- Add unreachable return after t.Fatal in openclaw_test.go and
  statefile_test.go to silence staticcheck SA5011 false positives

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@wesm
Copy link
Owner

wesm commented Mar 18, 2026

Merging this! thanks @tlmaloney

@wesm wesm merged commit 891cb62 into wesm:main Mar 18, 2026
9 of 10 checks passed
@wesm
Copy link
Owner

wesm commented Mar 18, 2026

I'll plan a release and start working on documentation for this tomorrow (I had my agent do a write-up on how I deployed this behind caddy on my tailnet at a path like http://$MACHINE/agentsview)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants