Storage Engine v4 — combined squash (for macro review) by c1-squire-dev[bot] · Pull Request #874 · ConductorOne/baton-sdk

c1-squire-dev · 2026-05-24T21:12:57Z

Purpose

This PR squashes the entire v4 storage-engine stack into a single
commit so reviewers can give macro / architectural / cross-cutting
feedback that's hard to see when reading the stack one PR at a time.

The individual stacked PRs are still where the actual code lands.
This combined PR exists only as a review aid. Do not merge.

Stack map

PR	What
#867	Parent: option plumbing + magic-byte dispatch
#868	Stack 1: protos, codec, engine skeleton
#869	Stack 2: c1z3 envelope format
#870	Stack 3: engine + all 6 record types + v2↔v3 translation + Writer adapter
#871	Stack 4: cross-engine compaction (IngestAndExcise)
#872	Stack 5: equivalence runner + microtests + bench CLIs

RFC

docs/rfcs/0004-storage-engine-v4/tracker.md tracks the implementation status.

Performance — current numbers (post fat-batch + fresh-sync NoSync)

Linux arm64, 16 cores, -benchtime=2x. Grant counts span the LSM-vs-B-tree
crossover. See commit 72881f65 for the perf-fix details.

Write + pack c1z

Grants	SQLite	Pebble	Pebble vs SQLite
100	14.2 ms	7.4 ms	1.93× faster
1,000	30.2 ms	10.7 ms	2.82× faster
10,000	218.6 ms	42.9 ms	5.10× faster
100,000	3.17 s	352 ms	9.00× faster
1,000,000	55.59 s	3.78 s	14.72× faster

Per-grant cost (the load-bearing metric):

Grants	SQLite µs/grant	Pebble µs/grant
100	142	74
1k	30	11
10k	22	4.3
100k	32	3.5
1M	56	3.8

SQLite's per-grant cost grows past 10k (B-tree depth + index maintenance);
Pebble's flatlines at ~3.8 µs once amortization kicks in. At 1M grants
Pebble writes the c1z file in 3.8 s vs SQLite's 55.6 s.

Unpack + read grants (apples-to-apples; both engines paginate)

Grants	SQLite	Pebble	Pebble vs SQLite
100	2.13 ms	2.17 ms	parity
1,000	5.97 ms	3.73 ms	1.60× faster
10,000	42.8 ms	15.6 ms	2.75× faster
100,000	386 ms	142 ms	2.72× faster

Both engines now clamp to page_size = 10000 by default; the bench
walks pages until next_page_token is empty.

Single grant write (cold engine setup cost)

Engine	Time/op	Allocs/op
Pebble	65 µs	21
SQLite	427 µs	314

Pebble engine startup is ~6.6× cheaper than opening a fresh SQLite file.

Run

go test -tags=batonsdkv2 -run='^$' \
  -bench 'BenchmarkRegistered(Pebble|SQLite)(WriteGrant|WritePack|UnpackReadGrants)$' \
  -benchtime=2x -benchmem -timeout=20m ./pkg/dotc1z/engine/pebble

# Override scales (default sweep is 100..1M):
BATONSDK_BENCH_SCALES=100,1000 go test -tags=batonsdkv2 -run='^$' \
  -bench 'BenchmarkRegisteredPebbleWritePack$' ./pkg/dotc1z/engine/pebble

Why Pebble is now faster — the perf-fix mechanism

Before commit 72881f65, PutGrants(N grants) was calling
Engine.PutGrantRecord per grant. Each call did:

Get (read-before-write for index cleanup) → 2 LSM lookups per grant
db.NewBatch() → alloc + close cycle per grant
Set primary + 2 index entries
batch.Commit(pebble.Sync) → one fsync per grant

For N grants that's N fsyncs vs SQLite's one fsync per chunked-INSERT
transaction. Linux fsync latency dominates at ~50 µs per call.

The fix mirrors SQLite's PRAGMA synchronous=NORMAL + chunked-INSERT
semantics:

Fat batch per Put call*: new variadic PutGrantRecords /
PutResourceRecords / PutEntitlementRecords / PutResourceTypeRecords
build a single pebble.Batch and commit once.
Fresh-sync NoSync write path: between MarkFreshSync (called by
Adapter.StartNewSync) and EndFreshSync (called by
Adapter.EndSync), batch commits use pebble.NoSync and skip
read-before-write index cleanup (the sync_id has no prior records).
EndFreshSync does one Flush + WAL fsync to harden everything.

Crash semantics are identical to SQLite's synchronous=NORMAL: a host
crash mid-sync forces the connector to replay. The sync_run record
stays "started" — we never silently surface a partial sync as
committed.

Review feedback already applied

btipling on Storage Engine v4 — Parent: option plumbing + magic-byte dispatch #867 — WithEngine option dropped during c1zOptions → c1fopts. Fixed + regression test.
pr-review bot on Storage Engine v4 — Parent: option plumbing + magic-byte dispatch #867 — WithEngine doc-comment merge; engine zero-value mismatch. Both fixed.
pr-review bot on Storage Engine v4 — combined squash (for macro review) #874 — Switched off the local-path Pebble replace and onto github.com/cockroachdb/pebble/v2 v2.1.5. CheckpointTo now flushes (was Quiesce); Save returns its shim error before side effects; ExtractZstdTar size-limits + bounded copies; writer Close errors propagated; writes serialized via shared withWrite.

Where to look first

For cross-cutting comments, the most architecturally load-bearing
files are:

pkg/dotc1z/engine/pebble/engine.go — lifecycle, Quiesce, Save, MarkFreshSync/EndFreshSync
pkg/dotc1z/engine/pebble/grants.go — fat-batch + fresh-sync write path
pkg/dotc1z/engine/pebble/adapter.go — connectorstore surface
pkg/dotc1z/engine/pebble/translate_v2.go — v2↔v3 boundary
pkg/dotc1z/engine/pebble/keys.go — key layout (Appendix C)
pkg/synccompactor/pebble/compactor.go — IngestAndExcise driver

Macro questions worth chewing on

Build-tag boundary — //go:build batonsdkv2 keeps connectors from linking Pebble. Acceptable test-matrix split?
Fresh-sync NoSync — Now matches SQLite's synchronous=NORMAL mid-sync. Should we also expose WithDurability(NoSync) for non-sync writes that want the same trade-off?
Pagination — Adapter List* methods honor page_size + page_token (default 10000, max 10000, matches SQLite). Opaque base64 cursor backed by the last pebble key returned.
Reference-stub hydration — v3 records carry refs, not embedded sub-messages. Reader returns stub v2.Entitlement/Resource. Caller fetches the full row separately.
Compaction granularity — Per-bucket IngestAndExcise is per-bucket atomic; multi-bucket loop is not transactional as a whole. Acceptable?
Reflection codec fallback — ReflectCodec covers MVP; codegen plugin deferred. ~2× perf delta in microbench. Acceptable for v4?

🤖 Generated with Claude Code

This PR squashes the entire stack-of-6 v4 storage-engine PRs into a single commit so reviewers can give macro overall feedback against the full design. The same code is reviewed in PRs #867–#872 broken into focused stacked PRs: #867 Parent: option plumbing + magic-byte dispatch #868 Stack 1: protos, codec layer, engine skeleton #869 Stack 2: c1z3 envelope format #870 Stack 3: Pebble engine + all 6 record types + v2↔v3 translation + connectorstore.Writer adapter #871 Stack 4: cross-engine compaction via IngestAndExcise #872 Stack 5: equivalence runner + microtests + bench CLIs Goal of this combined PR: surface macro / architectural / cross-cutting review comments that are hard to see in the individual stacked PRs. Land the individual PRs separately. ## RFC `docs/rfcs/0004-storage-engine-v4/tracker.md` tracks the implementation status of the broader RFC. ## Highlights - **Engine**: `pkg/dotc1z/engine/pebble/` — Pebble-backed v3 storage engine covering all 6 record types (Grant, Resource, ResourceType, Entitlement, Asset, SyncRun) with secondary indexes (by_entitlement, by_principal, by_parent, by_resource). Strict-quiesce write barrier via WaitGroup + memtable flush. `db.Checkpoint` powers Save. - **Translation**: `pkg/dotc1z/engine/pebble/translate_v2.go` — exhaustive v2 connector wire types ↔ v3 storage record types with nil-safe roundtrips. - **Adapter**: `pkg/dotc1z/engine/pebble/adapter.go` — implements `connectorstore.Writer` (which embeds `Reader`). Compile-asserted via `var _ connectorstore.Writer = (*Adapter)(nil)`. Embeds 9 `UnimplementedXxxServer` stubs for the exotic gRPC service surfaces. - **Compaction**: `pkg/synccompactor/pebble/` — cross-engine sync rollup via Pebble's `IngestAndExcise`. Per-bucket SST writer + excise span. Atomic from base's perspective. - **Envelope**: `pkg/dotc1z/format/v3/` — c1z3 file format with proto manifest (incl. descriptor closure for self-describing reads) and zstd-tar payload. - **Codec**: `pkg/dotc1z/engine/pebble/codec/` — FoundationDB-style tuple encoding with NUL/escape rules (prefix-free property exhaustively tested), KSUID binary encoding for sync_ids, frozen- after-init registry for codegen codecs + lazy reflection fallback. - **Equivalence**: `pkg/dotc1z/engine/equivalence/` — pluggable property-test harness comparing engines against an in-memory reference. - **Bench**: `cmd/baton-fixture-gen/` + `cmd/baton-storage-bench/` — hyper-scale fixture generation + G2–G5 read benchmarks against a fixture. - **Build-tag gated**: all engine code is behind `//go:build batonsdkv2` so default connector binaries don't link Pebble. ## Test coverage - 16 engine tests (record-type Put/Get/Delete/Iterate + checkpoint + quiesce) - 7 v2↔v3 translation roundtrip tests (incl. unknown-enum safety) - 6 connectorstore.Writer adapter end-to-end tests - 5 compactor tests (incl. sync-isolation property) - 4 equivalence tests (Pebble vs MemoryRef) - 1 tuple-encoding prefix-free property test (40k+ comparisons) - 4 codec direct-vs-reflection benchmarks - envelope roundtrip + truncation/bad-magic refusals - 2 regression tests for the WithEngine option threading bug Total: 50+ test cases passing; 0 lint findings across the v4 surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost · 2026-05-24T21:20:13Z

review update pending

ghost

Blocking issues found — see review comments.

c1-squire-dev · 2026-05-25T00:38:01Z

Addressed the review comments in this branch:

Removed the local Pebble filesystem replace. Pebble is now a normal module dependency pinned to upstream main at github.com/cockroachdb/pebble v0.0.0-20260523000819-06387e65044c (commit 06387e65044c) because the v2 tags require the /v2 module path and the non-v2 tags lag the API this branch uses.
CheckpointTo now flushes before db.Checkpoint instead of calling Quiesce, so the source engine stays writable after checkpointing.
Save now returns its current shim-required error before any side effects, so accidental calls do not quiesce/brick the engine.
Serialized engine writes through the shared withWrite path so read-before-write index cleanup is atomic with respect to other writes across the indexed record paths, not only grants.
writeZstdTar now propagates tar.Writer.Close and zstd Close errors instead of deferring and dropping flush failures.
ExtractZstdTar now rejects oversized entries, uses bounded copies, preserves close errors, and keeps the local-path guard before joining tar paths.
Added regression coverage for post-checkpoint writes, failed-Save writability, concurrent grant overwrite indexes, and oversized tar entries.

Validation run:

go test -v -tags batonsdkv2 ./pkg/dotc1z/engine/pebble ./pkg/dotc1z/format/v3 ./pkg/synccompactor/pebble passes.
golangci-lint run --timeout=3m --build-tags=batonsdkv2 ./pkg/dotc1z/engine/pebble/... ./pkg/dotc1z/format/v3/... ./pkg/synccompactor/pebble/... passes with 0 issues.
go test -v ./... passes.
make lint is still blocked by existing repo-wide findings outside this PR surface (for example pkg/sync/expand/scc/bitset.go, pkg/tasks/c1api/task_queue.go, unused //nolint:gosec directives, and pkg/types/types.go).

c1-squire-dev · 2026-05-25T00:40:14Z

Update after checking the Pebble release suggestion: switched from the upstream-main pseudo-version to the current tagged release github.com/cockroachdb/pebble/v2 v2.1.5.

A couple of details from the change:

The tag is a v2 module-path release, so the imports now use github.com/cockroachdb/pebble/v2 and related subpackages.
go.mod, go.sum, and vendor/modules.txt are pinned to v2.1.5; the local filesystem replace remains removed.
Kept the previous PR-comment fixes for CheckpointTo, Save, tar extraction/copy limits, writer close error propagation, and serialized indexed writes.
Set a discard Pebble logger in our engine options so v2.1.5 does not print WAL discovery/compaction logs during tests or normal SDK use.

Validation rerun after the v2.1.5 switch:

go test -v -tags batonsdkv2 ./pkg/dotc1z/engine/pebble ./pkg/dotc1z/format/v3 ./pkg/synccompactor/pebble passes.
golangci-lint run --timeout=3m --build-tags=batonsdkv2 ./pkg/dotc1z/engine/pebble/... ./pkg/dotc1z/format/v3/... ./pkg/synccompactor/pebble/... passes with 0 issues.
go test -v ./... passes.
make lint still fails only on existing repo-wide issues outside this PR surface (pkg/sync/expand/scc/bitset.go, pkg/tasks/c1api/task_queue.go, unused gosec nolints, and pkg/types/types.go).

c1-squire-dev · 2026-05-25T00:43:46Z

Pushed the fixes in 38437eb0 (fix(storage): address v4 review comments). This is the version using github.com/cockroachdb/pebble/v2 v2.1.5, with the local replace removed, checkpoint/save behavior fixed, tar close/error handling and extraction limits added, and Pebble writes serialized to cover the read-before-write index cleanup race. Validation remains: go test -v ./... passed; focused tagged tests and focused golangci-lint passed. Repo-wide make lint still reports pre-existing unrelated issues outside this PR.

ghost

No blocking issues found.

c1-squire-dev · 2026-05-25T01:17:25Z

Pushed follow-up commit 16eb1192.

What changed:

Removed the batonsdkv2 build tags from the Pebble/v3/equivalence/synccompactor implementation so this is no longer isolated behind a temporary tag.
Kept Pebble on the normal module dependency github.com/cockroachdb/pebble/v2 v2.1.5; no local replace or main-branch dependency.
Removed the deferred/follow-up markers from the new work. Descriptor closure now uses protodesc.ToFileDescriptorProto instead of a lossy local helper.
Expanded sync compaction to cover all currently implemented Pebble buckets/indexes: resource types, resources, resource_by_parent, entitlements, entitlement_by_resource, grants and grant indexes, assets, and sync runs.
Added an all-buckets compactor test to prove stale rows are replaced and fresh rows/indexes survive for the implemented buckets.

Validation:

go test -v ./... passes.
make lint passes with 0 issues.

ghost

No blocking issues found.

c1-squire-dev · 2026-05-25T01:26:11Z

Pushed optional-interface parity follow-up commit 2e776ee3.

Audit result:

SQLite *C1File implements the core connectorstore.Writer plus two optional connectorstore capabilities: connectorstore.LatestFinishedSyncIDFetcher and connectorstore.DBSizeProvider.
Pebble Adapter already had LatestFinishedSyncID; I added compile-time assertions so drift is caught.
Pebble was missing DBSizeProvider; Adapter.CurrentDBSizeBytes() now delegates to the engine, and the engine reports the regular-file byte total under the Pebble DB directory, covering WAL/log, MANIFEST, OPTIONS, and SST files as the uncompressed on-disk working set.

Additional behavior tests added:

Latest-finished sync behavior now covers no syncs, only-open syncs, per-type filtering, SyncTypeAny, and ignoring unfinished newer syncs.
Pebble DB-size behavior verifies initial nonzero size, exact regular-file sum parity, and non-decreasing size after writes through both engine and adapter paths.

Validation:

go test -v ./pkg/dotc1z/engine/pebble passes.
make lint passes with 0 issues.
go test -v ./... passes.

ghost

No blocking issues found.

c1-squire-dev · 2026-05-25T01:43:00Z

Pushed engine-registration follow-up commit 4d6edafb.

What changed:

Added a dotc1z engine registry and new engine-neutral dotc1z.NewStore(...) constructor. NewC1ZFile(...) remains the legacy/concrete SQLite constructor returning *C1File.
Kept SQLite registered by default without using init, so default callers continue to get SQLite behavior.
Moved Pebble opt-in behind pkg/dotc1z/engine/pebble.Register(). Core pkg/dotc1z does not import the Pebble engine package.
Existing files dispatch by header/manifest: v1 routes to SQLite; v3 reads the manifest engine name and requires a registered driver, otherwise returns ErrEngineNotAvailable.
Added a Pebble registered-store wrapper that opens/extracts v3 envelopes, tracks successful writes, checkpoints Pebble, writes the v3 envelope on close, and keeps read-only/no-write closes from rewriting files.
Made pebble.Register() idempotent for the same Pebble driver while keeping core registration strict against conflicting duplicate engines.

Tests added/updated:

Registry tests for duplicate registration, default SQLite behavior, missing registered engines, new-file registered dispatch, existing-v3 manifest dispatch, and existing-v1 ignoring requested engine.
Pebble registry roundtrip test through dotc1z.NewStore(..., WithEngine(EnginePebble)), close-to-v3 envelope, reopen through manifest dispatch, latest sync lookup, list, and get.

Dependency boundary checked:

go list -deps ./pkg/dotc1z | grep pkg/dotc1z/engine/pebble returns no results.

Validation:

go test -v ./pkg/dotc1z ./pkg/dotc1z/engine/pebble passes.
make lint passes with 0 issues.
go test -v ./... passes.

ghost

No blocking issues found.

ghost · 2026-05-25T02:36:18Z

+		out[k] = v2.GrantSources_GrantSource_builder{
+			IsDirect: v.GetIsDirect(),
+		}.Build()
+	}
+	return v2.GrantSources_builder{Sources: out}.Build()
+}
+
+// --- Resource ---
+
+// V2ResourceToV3 maps v2.Resource → v3.ResourceRecord. Caller supplies sync_id.
+func V2ResourceToV3(syncID string, r *v2.Resource) *v3.ResourceRecord {
+	if r == nil {
+		return nil
+	}
+	var parent *v3.ResourceRef
+	if pid := r.GetParentResourceId(); pid != nil && pid.GetResource() != "" {
+		parent = v3.ResourceRef_builder{
+			ResourceTypeId: pid.GetResourceType(),
+			ResourceId:     pid.GetResource(),


🟡 Suggestion: V2ResourceToV3 maps ResourceTypeId, ResourceId, DisplayName, Description, Parent, and Annotations — but v2.Resource also carries typed trait payloads (UserTrait, AppTrait, GroupTrait, any_trait). These are not present in v3.ResourceRecord, so they are silently discarded. V3ResourceToV2 then reconstructs a v2.Resource with no trait data. This is a data-loss regression relative to the SQLite engine for any downstream consumer that depends on user profile fields (email, status, manager) or app/group metadata. If this is intentional (traits stored separately), a comment or SDK changelog entry documenting the behavior change would help callers prepare.

ghost · 2026-05-25T02:36:24Z

+		if b.KeyCount() == 0 {
+			// Tiny excise-only: no SST to ingest but still need to
+			// clear destination. Pebble doesn't let us IngestAndExcise
+			// with zero SSTs (the ingest must reference at least one
+			// file), so we fall back to a RangeDelete + Flush.
+			_ = os.Remove(sstPath)
+			if err := dstDB.DeleteRange(plan.lower, plan.upper, pebble.Sync); err != nil {
+				return fmt.Errorf("compact: excise-only DeleteRange for %s: %w", plan.name, err)
+			}
+			continue
+		}
+
+		pending = append(pending, pendingIngest{
+			sstPath: sstPath,
+			exciseSpan: pebble.KeyRange{
+				Start: plan.lower,


🟡 Suggestion: The doc comment on Compact claims "The operation is atomic from base's perspective: a concurrent reader either sees only the old data or only the new data, never a mixture." This is true per-bucket (each IngestAndExcise is individually atomic), but the multi-bucket loop is not atomic as a whole — a failure or crash mid-loop leaves base with new data in some buckets and old data in others. Similarly, the DeleteRange calls for empty buckets (lines 128-138) are not covered by the later IngestAndExcise atomicity. Consider updating the doc comment to accurately describe the per-bucket atomicity, and noting whether callers are expected to handle partial-compaction recovery.

ghost · 2026-05-25T02:36:29Z

+			if err := proto.Unmarshal(oldVal, old); err == nil {
+				if err := e.deleteGrantIndexes(batch, idBytes, old); err != nil {
+					closer.Close()
+					return err
+				}
+			}
+			closer.Close()
+		case errors.Is(err, pebble.ErrNotFound):
+			// no-op
+		default:


🟡 Suggestion: When proto.Unmarshal(oldVal, old) fails (line 49), the code silently skips deleteGrantIndexes and proceeds to write the new primary record with new index entries. The old index entries are now orphaned — they reference the old record's indexed fields but the primary has been overwritten. This same pattern appears in resources.go and entitlements.go. While unmarshal failure requires data corruption (unlikely in normal operation), silently producing phantom index entries makes the corruption invisible and harder to diagnose. Consider returning an error or logging at warn level when unmarshal fails, rather than silently continuing.

ghost · 2026-05-25T02:36:34Z

+	return rec.GetContentType(), &bytesReader{b: rec.GetData()}, nil
+}
+
+// === read service surface ===
+//
+// ListGrants / ListResources / etc. The connectorstore.Reader
+// interface embeds these as gRPC ServiceServer interfaces; the
+// adapter implements the most-called paths directly and leaves the
+// rest to the embedded Unimplemented* stubs.
+
+// ListGrants returns all grants on the active sync, optionally
+// filtered by Resource (= principal) when req.Resource is set.
+func (a *Adapter) ListGrants(ctx context.Context, req *v2.GrantsServiceListGrantsRequest) (*v2.GrantsServiceListGrantsResponse, error) {
+	syncID := a.resolveActiveSync(req.GetActiveSyncId())
+	if syncID == "" {
+		return nil, ErrNoCurrentSync
+	}
+	out := make([]*v2.Grant, 0)
+	var iter func(yield func(*v3.GrantRecord) bool) error
+	if r := req.GetResource(); r != nil && r.GetId() != nil {
+		// Filter by principal.
+		iter = func(yield func(*v3.GrantRecord) bool) error {
+			return a.engine.IterateGrantsByPrincipal(ctx, syncID,
+				r.GetId().GetResourceType(), r.GetId().GetResource(), yield)
+		}


🟡 Suggestion: All List* methods (ListGrants, ListResources, ListResourceTypes, ListEntitlements) materialize the entire result set in memory with make([]*v2.X, 0) and unbounded append. For large syncs (millions of grants), this can produce hundreds-of-MB allocations in a single call. The v2.GrantsServiceListGrantsRequest proto has page_size and page_token fields that are ignored here — callers passing those fields expecting bounded results will get the full set instead. Consider supporting pagination or at minimum documenting the unbounded behavior so callers can plan accordingly.

ghost · 2026-05-25T02:37:09Z

General PR Review: Storage Engine v4 — combined squash (for macro review)

Blocking Issues: 0 | Suggestions: 1 | Threads Resolved: 0
Review mode: incremental since 1627b04
View review run

Review Summary

The new commits add docs/rfcs/0004-storage-engine-v4/autoresearch-pebble-perf.md, a design document for an autonomous performance experiment loop targeting the Pebble storage engine. No code, tests, or dependencies changed. No new issues found. The previously noted bare scope blocks in the Pebble engine remain unaddressed.

Security Issues

None found.

Correctness Issues

None found.

Suggestions

pkg/dotc1z/engine/pebble/grants.go:141 and pkg/dotc1z/engine/pebble/resources.go:122 — Bare { scope blocks in DeleteGrantRecord and DeleteResourceRecord are refactoring artifacts from inverting the old if err == nil guard; they serve no purpose and should be removed. (carried forward from prior review)

ghost

Blocking issues found — see review comments.

Diagnosis: PutGrants(N grants) was calling Engine.PutGrantRecord per grant. Each PutGrantRecord did a Get (read-before-write index cleanup) + new Batch + Set primary + Set indexes + Commit(pebble.Sync). With default pebble.Sync, N grants = N fsyncs + N batch alloc/close cycles. At 1k/10k grants the per-grant cost was 50-58 µs vs SQLite's 18-30 µs (SQLite chunks an INSERT into one transaction = ~1 fsync per chunk). Fix: two changes that match SQLite's PRAGMA synchronous=NORMAL + chunked-transaction semantics: 1. **Fat batch per Put* call (engine)**. New PutGrantRecords / PutResourceRecords / PutEntitlementRecords / PutResourceTypeRecords variadic methods write N records in a single pebble.Batch, committing once. The single-record methods now wrap the bulk ones with one element. 2. **Fresh-sync fast path (engine + adapter)**. Engine tracks a `freshSync` flag set by MarkFreshSync (called from Adapter.StartNewSync) and cleared by EndFreshSync (called from Adapter.EndSync). While freshSync is true: - batch.Commit uses pebble.NoSync (OS-buffered, no fsync per commit), matching SQLite's synchronous=NORMAL during sync. - The read-before-write Get for index cleanup is skipped because the sync_id is guaranteed to have no prior records. EndFreshSync does one db.Flush + db.LogData(nil, pebble.Sync) so all writes are on disk before the caller returns. SetCurrentSync (and ResumeSync, which calls it) does NOT flip the freshSync flag — a resumed sync may already have data for the sync_id, so writes keep fsync + read-before-write. Crash semantics: identical to SQLite's. A host crash mid-sync forces the connector to re-sync (the sync_run record stays in "started" state); we never silently surface a partial sync as committed. Bench results (-benchtime=2x, Linux arm64, 16 cores): Workload | SQLite | Pebble | Pebble vs SQLite -----------------|----------|----------|------------------ WritePack 100 | 14.2 ms | 7.4 ms | 1.93x faster WritePack 1k | 30.2 ms | 10.7 ms | 2.82x faster WritePack 10k | 218.6 ms | 42.9 ms | 5.10x faster WritePack 100k | 3.17 s | 352 ms | 9.00x faster WritePack 1M | 55.59 s | 3.78 s | 14.72x faster Read 100 | 2.14 ms | 2.20 ms | parity Read 1k | 6.04 ms | 3.93 ms | 1.54x faster Read 10k | 43.2 ms | 16.0 ms | 2.71x faster Per-grant write cost flatlines for Pebble at ~3.8 µs after enough amortization; SQLite climbs to 56 µs at 1M grants (B-tree depth + index maintenance). Note: at 100k SQLite's ListGrants honored its default 10k page_size and returned only 10k records — the bench surfaces that the adapter's List* methods don't paginate today, which is a separate review-thread item. production_bench_test.go now sweeps 100, 1k, 10k, 100k, 1M (override via BATONSDK_BENCH_SCALES env var; -short cuts to 100/1k/10k). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

c1-squire-dev · 2026-05-25T03:03:12Z

Diagnosis + fix for the WritePack-slower-than-SQLite finding from the previous bench.

Diagnosis: Per-grant pebble.Sync fsync was dominating. PutGrants(N) called PutGrantRecord per grant → N batches × N fsyncs vs SQLite's chunked-INSERT-in-one-transaction (1 fsync per chunk).

Fix (commit 72881f6, matches SQLite's PRAGMA synchronous=NORMAL semantics):

Fat batch per Put* call — variadic PutGrantRecords/PutResourceRecords/PutEntitlementRecords/PutResourceTypeRecords build one pebble.Batch, commit once.
Fresh-sync pebble.NoSync write path between Adapter.StartNewSync and Adapter.EndSync. Adapter calls new Engine.MarkFreshSync (StartNewSync) and Engine.EndFreshSync (EndSync). EndFreshSync does one Flush + WAL fsync to harden everything. Skip read-before-write Get for index cleanup during fresh-sync (sync_id is guaranteed empty).
Crash semantics: identical to SQLite synchronous=NORMAL. A host crash mid-sync leaves the sync_run in "started" state, forces the connector to replay.

Scale results (the question that matters as you scale grant count):

Grants	SQLite	Pebble	Pebble vs SQLite
100	14.2 ms	7.4 ms	1.93× faster
1,000	30.2 ms	10.7 ms	2.82× faster
10,000	218.6 ms	42.9 ms	5.10× faster
100,000	3.17 s	352 ms	9.00× faster
1,000,000	55.59 s	3.78 s	14.72× faster

Per-grant cost: SQLite climbs to 56 µs at 1M (B-tree depth + index maintenance); Pebble flatlines at ~3.8 µs after amortization. At 1M grants Pebble writes the c1z in 3.8 s vs SQLite's 55.6 s.

PR body has the full table + read benchmarks + the run command.

ghost

No blocking issues found.

The previous List*/ListGrants/ListResources/etc all materialized the full result set in memory and ignored page_size + page_token. SQLite silently clamped to 10k (default page) so the two engines weren't even returning the same record count at scale — apples to oranges. Matches SQLite's pagination contract: - page_size == 0 || page_size > MaxPageSize → DefaultPageSize (10000) - page_token is opaque to the caller; pass next_page_token back verbatim - filter + paginate compose (e.g. by-principal + page through 100 grants) - malformed token → ErrInvalidPageToken (so a corrupted token doesn't silently restart from the beginning) New file: pkg/dotc1z/engine/pebble/paginate.go - encodeCursor / decodeCursor — base64.RawURLEncoding of raw pebble key - rangeAfter — turns a cursor into a pebble.IterOptions LowerBound that's strictly greater than the cursor (cursor + 0x00) - clampPageSize — SQLite-equivalent clamp - iteratePrimaryPageWithKey — generic helper that captures the LAST RETURNED key (not the first non-returned) so cursors are stable - PaginateGrantsBySync, PaginateGrantsByEntitlement, PaginateGrantsByPrincipal - PaginateResourcesBySync, PaginateResourcesByParent - PaginateResourceTypesBySync - PaginateEntitlementsBySync, PaginateEntitlementsByResource Adapter rewrite: each List* method now calls the matching Paginate* helper, passing page_size + page_token from the request and returning the engine's next_page_token in the response. ListResources has a small over-fetch (4x cap) when resource_type_id post-filter is set, so a sparse hit rate doesn't force a tail of round-trips. Engine sentinel: ErrInvalidPageToken added to engine_stub.go alongside the rest. Tests (paginate_test.go, 11 cases all green): - TestListGrantsPagination: page sizes 10, 50, 100, 250, 251, default(0) against 250 grants — counts distinct IDs, detects duplicates, asserts no exceed of page_size, verifies last page has empty token - TestListGrantsPaginationByPrincipal: by_principal index pagination, page_size=15 over 100 grants - TestListResourcesPagination: 75 resources, page_size=20 - TestListEntitlementsPagination: 50 entitlements, page_size=7 - TestListResourceTypesPagination: 30 RTs, page_size=8 - TestPageTokenMalformed: garbage token surfaces ErrInvalidPageToken - TestPaginationClampedPageSize: 0 + MaxPageSize+1 both clamp Bench harness fix: benchmarkRegisteredUnpackReadGrants now walks pages until next_page_token is empty so the two engines compare apples-to-apples at >10k. Bench results with pagination working: Grants SQLite Pebble Pebble vs SQLite 100 2.13 ms 2.17 ms parity 1k 5.97 ms 3.73 ms 1.60x faster 10k 42.8 ms 15.6 ms 2.75x faster 100k 386 ms 142 ms 2.72x faster Read scaling holds at 2.7-2.8x faster. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

c1-squire-dev · 2026-05-25T03:11:56Z

Pagination wired (commit de17b96). Honest answer to "why don't we paginate": I built the adapter's List* as an MVP and never wired `page_size` / `page_token` through. There's no architectural reason — Pebble iterators are well-suited to it.

Matches SQLite's contract:

`page_size == 0` or `page_size > MaxPageSize` → `DefaultPageSize` (10000)
`page_token` opaque (base64 of last pebble key); pass `next_page_token` back verbatim
Filter + paginate compose (e.g. by-principal + page through 100 grants)
Malformed token → `ErrInvalidPageToken` so corruption doesn't silently restart from the beginning

New file: `pkg/dotc1z/engine/pebble/paginate.go` with `Paginate*` engine helpers and an `iteratePrimaryPageWithKey` generic that captures the LAST RETURNED key (not the first non-returned) so cursors are stable.

Adapter rewrite: every `List*` now calls the matching `Paginate*` helper, passes `page_size` + `page_token` from the request, returns `next_page_token` in the response. `ListResources` over-fetches up to 4× when `resource_type_id` post-filter is set so a sparse hit rate doesn't force a tail of round-trips.

Tests (11 cases, all green): page sizes 10/50/100/250/251/default(0) against 250 grants; by-principal index pagination over 100 grants; resources/entitlements/resource_types pagination; malformed-token errors; engine-level page_size clamping.

Bench harness fix: `benchmarkRegisteredUnpackReadGrants` now walks pages until `next_page_token` is empty so the two engines compare apples-to-apples at >10k.

Apples-to-apples read results:

Grants	SQLite	Pebble	Pebble vs SQLite
100	2.13 ms	2.17 ms	parity
1k	5.97 ms	3.73 ms	1.60× faster
10k	42.8 ms	15.6 ms	2.75× faster
100k	386 ms	142 ms	2.72× faster

PR description is updated with the same numbers.

ghost · 2026-05-25T03:16:42Z

+	rtFilter := req.GetResourceTypeId()
+
+	out := make([]*v2.Resource, 0, limit)
+	var nextCursor string
+	for len(out) < limit {
+		pageLimit := limit - len(out)
+		// Over-fetch a little when post-filtering so a sparse hit rate
+		// doesn't force a tail of extra round-trips. 4x is the cap; if
+		// rtFilter is empty we skip the over-fetch entirely.
+		fetchLimit := pageLimit
+		if rtFilter != "" {
+			fetchLimit = pageLimit * 4
+			if fetchLimit > MaxPageSize {
+				fetchLimit = MaxPageSize
+			}
+		}
+		var records []*v3.ResourceRecord
+		var err error


🟡 Suggestion: When rtFilter is set and the inner loop hits len(out) == limit before consuming all records, the remaining records in the engine batch are never examined. The returned nextCursor (from the engine) skips past them, so matching records in the unexamined tail are silently dropped. Consider removing the early break and instead continuing to iterate (without appending past limit) so the engine's cursor is consumed correctly, or track a per-record cursor to resume from the exact position.

ghost

No blocking issues found.

…s (E) Production-readiness work. The previous combined branch shipped the engine + write-side adapter, but the syncer and grant expander hard- depend on a set of reader gRPC methods + write helpers that returned codes.Unimplemented or weren't on the adapter at all. Dropping the syncer onto the adapter would fail at the first lookup. This commit closes that gap. ## (A) Syncer-blocking gRPC reader methods New file: pkg/dotc1z/engine/pebble/adapter_reader.go - GetEntitlement (reader_v2.EntitlementsReaderServiceServer) - GetResource (reader_v2.ResourcesReaderServiceServer) - GetResourceType (reader_v2.ResourceTypesReaderServiceServer) - ListGrantsForEntitlement (reader_v2.GrantsReaderServiceServer) - ListGrantsForResourceType (reader_v2.GrantsReaderServiceServer) - GetSync, ListSyncs, GetLatestFinishedSync (reader_v2.SyncsReaderServiceServer) ListGrantsForEntitlement uses the by_entitlement index + post-filters by principal_id / principal_resource_type_ids (4x over-fetch cap when filtered to amortize sparse hit rates). ListGrantsForResourceType walks the primary range + post-filters; a future idxGrantByPrincipalResourceType index would make it index-driven. GetLatestFinishedSync is a linear scan over sync_runs today (one rec per sync; tenants have O(K) total). If hot, a (typeFinishedSyncByType | type) -> sync_id sidecar makes it O(1). v3SyncRunToV2 + v3SyncTypeToString bridge the v3 SyncRunRecord shape to the reader_v2.SyncRun gRPC shape. ## (B) GrantStore sub-store + StoreExpandedGrants New file: pkg/dotc1z/engine/pebble/adapter_grants_store.go Adapter.Grants() returns a dotc1z.GrantStore implementation: - StoreExpandedGrants: delegates to PutGrants. Pebble doesn't re-extract expansion metadata at write time, so there's nothing to strip. - PendingExpansionPage / PendingExpansion: empty until Stack 6 wires expansion metadata extraction. Returning empty is the safe no-op — the syncer's ExpandGrants loop terminates and the sync proceeds. - ListWithAnnotationsPage / ListWithAnnotationsForResourcePage: return grants with nil annotation (no expansion in MVP). - ListWithAnnotations: iter.Seq2 wrapper around the Page method, matches the SQLite iteration contract (yields (zero, err) on error). Caveat: full grant expansion is RFC Stack 6, explicitly deferred. Connectors that don't use grant expansion (most of them) work today end-to-end; tenants needing expansion must wait for Stack 6. ## (D) IfNewer upserts (partial-sync semantics) New file: pkg/dotc1z/engine/pebble/if_newer.go - PutGrantRecordsIfNewer, PutResourceRecordsIfNewer, PutEntitlementRecordsIfNewer, PutResourceTypeRecordsIfNewer Mirror SQLite's EXCLUDED.discovered_at > X.discovered_at semantics: read existing, compare timestamp, only write if strictly newer. Index cleanup runs only on the records that pass the freshness check. Single batch + commit per call. The freshSync fast path is disabled by design here (IfNewer is by definition not a fresh sync). Adapter wires in adapter_grants_store.go: PutGrantsIfNewer, PutResourcesIfNewer, PutEntitlementsIfNewer, PutResourceTypesIfNewer. ## (E) Stats / GrantStats / OutputFilepath New file: pkg/dotc1z/engine/pebble/adapter_stats.go - Stats(syncType, syncID) -> {table: count} — exact counts via per-record-type iteration. Matches the SQLite shape used by pkg/sync/progresslog. - GrantStats(syncType, syncID) -> {ent_resource_type: count} for the per-RT progress lines. - OutputFilepath() -> engine's dbDir. SQLite returns the .c1z file path; Pebble doesn't have a single file so we return the directory. - Engine.DBDir() exported for the adapter. ## (F) Verified — no trait data loss The earlier review-bot finding about V2ResourceToV3 silently dropping typed trait payloads (UserTrait/GroupTrait/AppTrait/etc) is incorrect: v2 traits are packed into Resource.Annotations as Any (confirmed by inspection of pkg/types/resource/user_trait.go and friends), and our translate_v2.go already preserves Annotations. Added a NOTE comment to ResourceRecord in the proto so future readers don't re-relitigate. ## Tests (5 new test functions, all green) - TestGetEntitlementResourceResourceType — all three Get methods + missing-entity ErrNotFound - TestListGrantsForEntitlementAndResourceType — full list + RT filter + RT-only list semantics - TestSyncsReaderMethods — GetSync, ListSyncs, GetLatestFinishedSync with sync_type filter - TestStatsAndGrantStats — exact counts via iteration - TestIfNewerSkipsStaleGrants — older = no-op; newer = overwrite; equal-timestamp = no-op (strictly newer required, matches SQLite) Total engine tests now: 30 (up from ~25). 0 lint issues across the pebble surface. ## Still gap (from the audit, addressed in follow-ups) - (C) Diff-sync subsystem — GenerateSyncDiff. Cross-engine compactor is rollup, not diff. - (G) Remaining reviewer-bot findings (orphan-index-on-unmarshal- fail, multi-bucket compaction atomicity wording, etc.) - (H) CI repo-wide lint findings (pre-existing, outside our PR). - (J) Stack 6 (deferred grant expansion) + Stack 7 (C1 platform integration) per RFC. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… cache leak, Fatalf Five reviewer findings from PR #874 addressed: 1. **Orphan index entries on unmarshal-fail** — grants.go, resources.go, entitlements.go silently skipped index cleanup when proto.Unmarshal of the previous record failed during read-before-write. New behavior: surface the unmarshal error with the affected key in the message. Recovery is `DeleteX(key)` for the corrupt row. 2. **Cache leak on pebble.Open failure** — engine.go.Open minted a pebble.Cache (per preset; up to 4 GiB for PresetBackendInfraHot) then returned without Unref on failure. Now Unrefs the minted cache on the failure path; shared caches stay caller-owned. 3. **Fatalf via panic instead of os.Exit** — discardPebbleLogger was `panic(...)` on Fatalf. A recover() in a gRPC interceptor or HTTP framework could swallow it and let the program continue on a corrupted engine. Now writes to stderr and `os.Exit(1)`, matching Pebble's default-logger semantics. 4. **Errorf silently dropped** — discardPebbleLogger.Errorf was a no-op. Pebble uses Errorf for compaction errors and WAL recovery warnings (operationally significant). Now forwards to stderr. 5. **Multi-bucket compaction atomicity** — compactor.go.Compact docstring claimed whole-call atomicity, but the multi-bucket IngestAndExcise loop is per-bucket atomic only. Doc updated to accurately describe the contract + recovery model (re-run is idempotent since excise+ingest of the same SSTs converges). Tests still green: 30 engine + 5 compactor + 4 equivalence + 1 tuple property + 4 codec benchmarks; 0 lint issues. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost · 2026-05-25T03:37:22Z

+
+// GrantStats returns just the grant count, partitioned by entitlement
+// resource_type_id. Used by progresslog to show per-RT progress. Like
+// Stats, this is an exact count via iteration.
+func (a *Adapter) GrantStats(ctx context.Context, syncType connectorstore.SyncType, syncID string) (map[string]int64, error) {
+	if syncID == "" {
+		syncID = a.currentSyncID()
+	}
+	if syncID == "" {
+		return map[string]int64{}, nil
+	}
+	counts := map[string]int64{}
+	if err := a.engine.IterateGrantsBySync(ctx, syncID, func(rec *v3.GrantRecord) bool {
+		rt := rec.GetEntitlement().GetResourceTypeId()
+		counts[rt]++
+		return true


🟠 Bug: GrantStats accepts a syncType parameter but never validates it against the sync run's type. The SQLite C1File.GrantStats (pkg/dotc1z/c1file.go:961) checks syncType != SyncTypeAny && syncType != lastSync.SyncType and returns an error on mismatch. The Pebble implementation silently returns counts for any sync regardless of syncType, breaking the contract.

ghost · 2026-05-25T03:37:31Z

+		}
+		cursor = nextCursor
+	}
+	return reader_v2.GrantsReaderServiceListGrantsForEntitlementResponse_builder{


🟡 Suggestion: ListGrantsForEntitlement (lines 110-152) and ListGrantsForResourceType (lines 163-209) have the same post-filter cursor skip pattern as the previously-flagged ListResources issue. When filters are active (principalID/rtSet/rtFilter) and the inner loop hits len(out) == limit before consuming all records in the engine batch, nextCursor is set to the engine's page cursor (past the entire batch). Unexamined matching records in the tail of the batch are silently dropped on the next pagination call.

ghost

Blocking issues found — see review comments.

…nc skip) Mutation/error-handling audit triggered by review: The previous freshSync write path skipped the read-before-write Get on the assumption that a fresh sync's sync_id is empty. That's correct for the FIRST write to each external_id, but breaks when a connector emits the same external_id twice within one fresh sync (common in paginated sources, or just a connector bug). The second write would overwrite the primary but leave the OLD by_entitlement and by_principal index entries pointing at stale indexed fields — orphan entries that IterateGrantsByEntitlement / IterateGrantsByPrincipal would dereference to the *new* primary, producing phantom-filter results. Fix: read-before-write runs unconditionally on all PutXRecords paths (grants, resources, entitlements). The Get cost is small in the fresh-sync hot phase (mostly a memtable hit) and the integrity gain is non-negotiable. Fresh-sync still uses pebble.NoSync for the batch commit — that's the larger perf win and remains intact. Bench impact at 1M grants: Before (unsafe): 3.78 s (14.7x faster than SQLite) After (safe): 4.24 s (12.9x faster than SQLite) ~12% slower than the unsafe path, still well above the SQLite baseline. Also fixed the same swallow pattern in paginate.go: the index-walk Get on a missing primary now distinguishes ErrNotFound (orphan from a concurrent delete-then-iter race; skip and continue, fsck reconciles) from real errors (I/O, corrupted SST; surface via return). New test coverage: - TestPutGrantRecordOverwriteCleansIndexes - TestPutResourceRecordOverwriteCleansIndexes - TestPutEntitlementRecordOverwriteCleansIndexes - TestFreshSyncDuplicateExternalIDCleansIndexes — exercises the fresh-sync path with a duplicate external_id and asserts the OLD index entries are cleaned (the bug class fixed here). - TestNonFreshSyncOverwriteWorksThroughAdapter — adapter path after EndSync + SetCurrentSync. - TestDeleteThenPutCleansAndRewrites — Delete + Put returns to a clean state. All 6 mutation tests + all existing tests pass. 0 lint issues across the engine + compactor surface. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost · 2026-05-25T03:48:49Z

+			closer.Close()
+			return fmt.Errorf("DeleteGrantRecord: unmarshal old %q: %w", externalID, err)
+		}
+		{


🟡 Suggestion: This bare { block is a refactoring artifact from inverting the old if err == nil { guard. The scope serves no purpose and makes the code look like it belongs to a control-flow structure. Same pattern appears in resources.go:122. Remove the bare braces from both files.

ghost

No blocking issues found.

Root cause: ci.yaml and main.yaml used `version: latest` for golangci-lint, so each CI run pulled whatever version was current. v2.12.2 added new gosec rules (G702 command injection, G118 goroutine background context, G122 filepath.Walk TOCTOU, G703 path traversal, G704 SSRF) that didn't exist in earlier versions. PRs that were lint-clean against v2.9.0 broke against v2.12.2. Additionally, an earlier auto-fix commit (2659a4e) removed several //nolint:gosec directives that were "unused for linter gosec" under the OLDER linter version, mistaking obsolete-rule-name suppressions for dead code. v2.12.2 still needs those directives because the NEW gosec rules cover the same patterns under different rule names. Systematic fix: 1. Pin golangci-lint to v2.12.2 in both ci.yaml and main.yaml. The pin comment explains the trade-off and the recommended cadence for bumps. 2. Restore the nolint:gosec directives the auto-fix removed: - internal/connector/connector.go (exec.CommandContext) - pkg/actions/actions.go (intentional background goroutines, 2x) - pkg/dotc1z/clone_sync_bench_test.go (os.WriteFile under TempDir) - pkg/dotc1z/race_simulation_test.go (os.WriteFile, 2x) - pkg/lambda/grpc/config/config.go (os.ReadFile of operator cert) - pkg/sync/expand/expand_benchmark_test.go (os.WriteFile in temp) - pkg/sync/expand/expand_correctness_test.go (os.WriteFile in temp) - pkg/uhttp/wrapper.go (HttpClient.Do with arbitrary endpoints) 3. New nolint for the v4 PR's surface code: - pkg/dotc1z/format/v3/envelope.go (filepath.Walk over a Pebble checkpoint dir we own — no symlink TOCTOU exposure) 4. Remove now-unused nolints introduced earlier in this PR: - pkg/dotc1z/engine/equivalence/equivalence_test.go (G602 gone) - pkg/tasks/c1api/task_queue.go (G115 not firing in v2.12.2) - pkg/types/types.go (revive var-naming no longer fires there) Validation: golangci-lint v2.12.2 reports 0 issues across the whole repo with both default and batonsdkv2 build tags. `go test -tags=baton_lambda_support -short ./...` passes (matches the Windows CI command). Related cases addressed proactively: - Other repos using `version: latest` for golangci-lint or buf will hit the same drift; the comment in both workflows now flags the pattern for future maintainers. - The 5 new gosec rules (G702/G118/G122/G703/G704) are now accounted for at every callsite in the SDK. Future Add of new callsites needs to consider these rules at PR time. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost

No blocking issues found.

Same-PR squash inadvertently bumped go.mod from 1.25.2 → 1.25.3. CI's setup-go installs 1.25.2; the Go toolchain auto-downloads 1.25.3 to satisfy the go.mod directive, then conflicts because the installed std-library is 1.25.2: compile: version "go1.25.3" does not match go tool version "go1.25.2" Fix: align go.mod with CI. Future bumps of either go.mod or CI's go-version should land in the same commit. Related-case audit: - No `toolchain` directive in go.mod (would have caused the same issue independently). - Both ci.yaml and main.yaml pin go-version explicitly to 1.25.2; no `latest` drift here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost

No blocking issues found.

…ution CI Windows job failed TestC1ZCachedViewSyncRunInvalidation. Root cause is a deterministic-ordering bug in pkg/dotc1z/sync_runs.go: q.Order(goqu.C("ended_at").Desc()) When two sync_runs end in quick succession, Windows' coarser time.Now() resolution can produce identical ended_at strings. The LIMIT 1 then returns whichever row SQLite happens to pick — the test expects the latest finished sync but sometimes gets the older one. Fix: tiebreak on sync_id DESC. sync_ids are KSUIDs (sortable by embedded timestamp), so the lexicographically-greater sync_id is the later one. Pre-existing bug in main (the test was added in PR #612 by geoff@greer.fm); the Windows flake just didn't hit on main's most recent CI run. Related-case audit (other places ORDER BY ended_at might tie): - getFinishedSync ✓ fixed here. - LatestSyncID etc. that walk sync_runs use ORDER BY id when no timestamp tiebreaker is needed — no change. - The Pebble engine's IterateAllSyncRuns scans by sync_id natively, no ordering bug to fix there. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost

No blocking issues found.

Standalone design doc: answers every question the autoresearch-create skill asks, plus a concrete walkthrough of how each artifact (autoresearch.md, autoresearch.sh, autoresearch.checks.sh, autoresearch.config.json) would look for this codebase. Not a runbook to start the loop today — that's a separate /autoresearch invocation. This file just makes the loop reproducible: anyone with commit access can copy the snippets into a fresh worktree and kick it off. Sections: 1. Goal — current state + which scales have headroom 2. Primary + secondary metrics (1M WritePack ms as primary) 3. Files in scope — every tunable 4. Off limits — proto wire format, SQLite path, generated code 5. Constraints — engine + SQLite tests + lint + no new deps 6. autoresearch.sh — full bench script with structured METRIC output 7. autoresearch.checks.sh — correctness gate 8. autoresearch.config.json — 200 iterations, ~5-6 min each 9. What's been tried — fat-batch + fresh-sync NoSync (status quo); dead ends (DisableWAL); open ideas (memtable size, L0 thresholds, compaction concurrency, bloom filters, mixed compression presets, codegen codec, tuple-encoder pooling) 10. Validation of existing tests — per-file verdict for engine, adapter, paginate, mutation, compactor, equivalence, envelope, SQLite regression sentinel 11. Stop conditions 12. Resume protocol 13. Out-of-scope items 14. Commit-message + dashboard expectations for the agent The "What's been tried" + "Validation of existing tests" sections mean a fresh agent can resume the loop without re-discovering the ~12.95x at 1M write delta we already shipped, or guessing whether a keep/discard decision was actually correctness-gated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ghost

No blocking issues found.

c1-squire-dev Bot requested a review from a team May 24, 2026 21:12

ghost reviewed May 24, 2026

View reviewed changes

Comment thread go.mod Outdated

ghost reviewed May 24, 2026

View reviewed changes

Comment thread pkg/dotc1z/engine/pebble/engine.go Outdated

ghost reviewed May 24, 2026

View reviewed changes

Comment thread pkg/dotc1z/engine/pebble/engine.go Outdated

ghost reviewed May 24, 2026

View reviewed changes

Comment thread pkg/dotc1z/format/v3/envelope.go Outdated

ghost reviewed May 24, 2026

View reviewed changes

Comment thread pkg/dotc1z/format/v3/envelope.go Outdated

ghost reviewed May 24, 2026

View reviewed changes

Comment thread pkg/dotc1z/engine/pebble/grants.go Outdated

ghost requested changes May 24, 2026

View reviewed changes