Migrate duplicate-base, record-read, lookup-search-index onto lifecycle drivers #64
Merged
duplicate-base becomes the second member of duplicate-lifecycle.ts (after duplicate-table), delegating via a thin spec: prepare a populated source base (its own "prepare" measurement parked on the fixture), assert readiness, run one measured base operation (duplicate / duplicate-stream / export-stream via executeBaseOperation), then drop the created copy and the source unless it is a reusable cached seed. The shared driver is byte-unchanged, so duplicate-table is unaffected and needs no re-verification. buildResult plus the seed/verify/cleanup helpers are reused unchanged, so the artifact is byte-for-byte equivalent — G1 clean over both duplicate-base cases and export-base/...-stream, each on v1 and v2. diff-artifacts.mjs masks duplicate-base's run-to-run-volatile generated values (each proven volatile by the baseline A vs B noise check): the created copy / export base id and name (baseId, baseName), the duplicated main table id echoed as linkFieldForeignTableId, and the export preview URL + hash file name; plus the hash-derived seedBaseName under details.sourceBase.cache (seedHash-family, present only in G1 after the migration changes the seed code hash). Co-Authored-By: Claude <noreply@anthropic.com>
record-read is the first member of a new read-lifecycle.ts driver: seed (or restore) a host table plus the source table its lookups read through, assert the full 50-field projection is readable, run the measured paged getRecords scan (optionally versus a no-query baseline for the overhead variant) and verify it, then drop the host + source tables unless they are a reusable cached seed. The read family's signature — and what makes this its own driver rather than a copy of duplicate-lifecycle — is that the measured read is non-destructive: it creates nothing to clean up, so the driver OWNS the cleanup policy (drop the seed tables the fixture declares, only when they are not a reusable cached seed and the execute DB is not the throwaway isolated copy). The runner just declares seedTableIds + isReusableSeed and writes no cleanup boilerplate. seedReady is computed outside the diagnostic try (a readiness failure throws raw, as before), and the optional baseline + measured scan + verify live entirely in the opaque runPrimary, so buildResult and all routing/verification evidence are reused unchanged — G1 byte-equivalent over both record-read cases on v1 and v2. diff-artifacts.mjs masks details.queryVariant.overheadRatio, the queryMs / baselineMs timing quotient that varies run-to-run on unchanged code (proven by the record-read baseline A vs B diff); the *Ms timings and threshold-metric value are already masked. No seedHash mask is needed: record-read nests its seed-cache key under details.seed.cache, already covered by the cache rule. Co-Authored-By: Claude <noreply@anthropic.com>
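The cleanup policy described in this commit can be sketched roughly as below. This is a minimal illustration, not the actual read-lifecycle.ts API: `ReadFixture`, `ReadSpec`, `DriverEnv`, `runReadLifecycle`, `isIsolatedCopy` and `dropTable` are hypothetical names; only `seedTableIds`, `isReusableSeed`, `seedReady` and `runPrimary` come from the message above.

```typescript
// Hypothetical shapes illustrating the policy; not the real driver types.
interface ReadFixture {
  seedTableIds: string[];  // tables the fixture declares as its seed
  isReusableSeed: boolean; // true when the seed is a reusable cached seed
}

interface ReadSpec<R> {
  prepare(): Promise<ReadFixture>;                     // seed or restore
  assertReady(fixture: ReadFixture): Promise<boolean>; // readiness check
  runPrimary(fixture: ReadFixture): Promise<R>;        // opaque measured read + verify
}

interface DriverEnv {
  isIsolatedCopy: boolean; // execute DB is the throwaway isolated copy
  dropTable(tableId: string): Promise<void>;
}

async function runReadLifecycle<R>(spec: ReadSpec<R>, env: DriverEnv): Promise<R> {
  const fixture = await spec.prepare();
  // Readiness is computed outside the try, so a failure here throws raw.
  const seedReady = await spec.assertReady(fixture);
  if (!seedReady) throw new Error("seed not ready");
  try {
    return await spec.runPrimary(fixture);
  } finally {
    // The measured read creates nothing, so the driver owns the cleanup:
    // drop declared seed tables only when they are neither a reusable
    // cached seed nor living in the throwaway isolated copy.
    if (!fixture.isReusableSeed && !env.isIsolatedCopy) {
      for (const id of fixture.seedTableIds) await env.dropTable(id);
    }
  }
}
```

Under these assumptions a runner only declares its seed tables and whether they are reusable; it never writes a drop call itself.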
lookup-search-index becomes the second member of read-lifecycle.ts (after record-read): it measures global aggregation/search-index reads over a seeded source + dual host (index-off / index-on) table set. It rides the same driver — seed (or restore) the read fixture, assert readiness, run the measured read workload, and (per the driver's non-destructive read cleanup policy) drop nothing because the seed is always a reusable cached seed, matching the pre-migration runner which had no cleanup at all. Two member-specific shapes ride in the spec: prepare carries its per-stage seed sub-measurements on the fixture and emits no "prepare" phase, and the measured primary is a keyword x sample loop whose p95 is the threshold metric, expressed entirely in the opaque runPrimary. buildResult is reused unchanged, so the artifact is byte-for-byte equivalent — G1 clean over both search cases on v1 and v2. Having a real second member proves the read driver generic across the family. diff-artifacts.mjs masks: the per-keyword summarizeDurations maxMs (a timing value, scoped to details.keywords.* so the threshold maxMs stays visible; proven volatile by the baseline A vs B diff); and, present only in G1, the index-off / index-on host table + view ids and the bare details.seedCache seedHash family (emitted spread, not nested under a `cache` object, so the existing cache rule does not reach it). Co-Authored-By: Claude <noreply@anthropic.com>
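As an illustration of the keyword x sample loop, the sketch below times each search, keeps a per-keyword maxMs, and takes p95 over all samples. Several things here are assumptions rather than facts from the commit: the nearest-rank percentile definition, taking p95 across all keywords rather than per keyword, and the names `percentile`, `measureKeywords`, `search` and `now`. It is not the runner's `summarizeDurations`.

```typescript
// Nearest-rank percentile over an ascending-sorted array (assumed definition).
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(rank, 1) - 1];
}

interface KeywordLoopResult {
  p95Ms: number;                             // threshold metric over all samples
  keywords: Record<string, { maxMs: number }>; // per-keyword summary (volatile timing)
}

// Hypothetical loop: run `samples` timed searches for each keyword.
function measureKeywords(
  keywords: string[],
  samples: number,
  search: (keyword: string) => void,
  now: () => number = () => Date.now(),
): KeywordLoopResult {
  const all: number[] = [];
  const perKeyword: Record<string, { maxMs: number }> = {};
  for (const keyword of keywords) {
    let maxMs = 0;
    for (let i = 0; i < samples; i++) {
      const start = now();
      search(keyword);
      const elapsed = now() - start;
      all.push(elapsed);
      maxMs = Math.max(maxMs, elapsed);
    }
    perKeyword[keyword] = { maxMs };
  }
  all.sort((a, b) => a - b);
  return { p95Ms: percentile(all, 95), keywords: perKeyword };
}
```

Injecting the clock through `now` keeps the loop deterministic in tests; a real run would use a monotonic timer.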
duplicate-base (duplicate-lifecycle 2nd member), record-read (read-lifecycle 1st member) and lookup-search-index (read-lifecycle 2nd member) move to Migrated; the read-lifecycle.ts driver is new this round. Co-Authored-By: Claude <noreply@anthropic.com>
Migrates three more legacy runner kinds onto lifecycle drivers, taking the
tracker from 26/35 → 29/35 runner kinds (36/55 → 43/55 cases). One
self-contained commit per runner (incl. the diff-artifacts masks it needs).
Driver decisions (reuse → extend → new)
duplicate-base: duplicate-lifecycle.ts, driver byte-unchanged; runPrimary. Because the shared driver is untouched, duplicate-table is unaffected and was not re-verified.
record-read: read-lifecycle.ts (1st member); duplicate-lifecycle, which always drops a created copy.
lookup-search-index: read-lifecycle.ts; runPrimary, seed always reusable so cleanup drops nothing. A real 2nd member proves the new driver generic.
Every migrated runner reuses its unchanged buildResult and seed/verify helpers, so the artifact is byte-for-byte equivalent (G1).
G1 artifact equivalence — baseline (legacy) ↔ candidate (migrated), per case × engine
Local methodology: pinned teable-ee, seed cache on, all measured runs are cache hits (baseline seeds warm; candidate seeds warmed once before measuring) so there is no build-vs-restore asymmetry. Baseline A↔B clean (14/14) before editing; baseline↔candidate clean (14/14) after.
All result=pass, error=null. Routing preserved: duplicate-base and record-read assert routing (routeMatched=true; duplicate-base v1 x-teable-v2=false / v2=true); export-base and lookup-search-index assert no routing, as before. traceRefCount preserved vs baseline for every case × engine (28/28, 1/1, 10/10, 20/20, 270/270; locally savedTraceCount=0 with no Jaeger; CI is where saved==ref / failed=0 holds).
Negative tests (comparator teeth), per runner
For duplicate-base, record-read and lookup-search-index:
(details.duplicate.operation / details.operation / details.tableIndexMode)
(baseId / queryVariant.overheadRatio / seedCache.seedHash)
(details.duplicate.status / queryVariant.config.filterFieldName / seedCache.seedNamePrefix)
9/9 as expected. Mask necessity also confirmed: with the pre-edit diff script, baseline A↔B fails on exactly the fields the new rules cover (and record-read pages surfaces nothing, correctly needing no new mask).
Mask deltas (scripts/diff-artifacts.mjs)
duplicate-base: baseId (GENERATED_ID_KEYS), baseName (GENERATED_NAME_KEYS), linkFieldForeignTableId, details.duplicate.exportResult/doneEvent previewUrl/fileName/id/name (run-to-run echoes of the created copy / export); seedBaseName added to the existing cache rule (seedHash-family, G1-only).
record-read: details.queryVariant.overheadRatio (timing ratio). No seedHash mask needed: it nests under details.seed.cache, already covered.
lookup-search-index: off/on TableId/ViewId (GENERATED_ID_KEYS), details.keywords.*.summary.maxMs (timing), and the bare details.seedCache seedHash family (not nested in cache).
Each mask carries a justifying comment; the volatility ones are proven by the baseline A↔B noise check, the seedHash-family ones by absent-in-A↔B / present-in-G1.
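To make the masking idea concrete, here is a minimal sketch of a mask-then-compare check with key-based and path-scoped rules. It is an assumption-laden illustration, not the contents of scripts/diff-artifacts.mjs: the `Json` type, `MASKED_PATHS`, `pathMatches`, `mask` and `artifactsEquivalent` are hypothetical, and the rule lists hold only a few of the fields named above.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical rule sets, loosely modeled on the masks listed above.
const GENERATED_ID_KEYS = new Set(["baseId", "linkFieldForeignTableId"]);
// Path-scoped rules; "*" matches exactly one path segment.
const MASKED_PATHS = [
  "details.queryVariant.overheadRatio",
  "details.keywords.*.summary.maxMs",
];

function pathMatches(pattern: string, path: string[]): boolean {
  const parts = pattern.split(".");
  return parts.length === path.length && parts.every((p, i) => p === "*" || p === path[i]);
}

// Replace every masked value with a fixed placeholder, recursively.
function mask(value: Json, path: string[] = []): Json {
  if (Array.isArray(value)) return value.map((v, i) => mask(v, [...path, String(i)]));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      const childPath = [...path, key];
      const masked =
        GENERATED_ID_KEYS.has(key) || MASKED_PATHS.some((p) => pathMatches(p, childPath));
      out[key] = masked ? "<masked>" : mask(child, childPath);
    }
    return out;
  }
  return value;
}

// Compares masked artifacts; key order matters, since JSON.stringify preserves it.
function artifactsEquivalent(a: Json, b: Json): boolean {
  return JSON.stringify(mask(a)) === JSON.stringify(mask(b));
}
```

Scoping the maxMs rule to a path rather than a bare key is what keeps a top-level threshold maxMs visible to the comparator, so a mutation there would still be reported.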
🤖 Generated with Claude Code