feat: multi-store write path (sync / bucket switch), adapting to Codex 26.609, PR2 #17
Closed
Wangnov wants to merge 2 commits into
This was referenced Jun 13, 2026
Wangnov added a commit that referenced this pull request on Jun 13, 2026:
PR3 of the multi-store adaptation (builds on #17).

Before writing, the multi-store path now waits (bounded) for each store's Codex startup-backfill to finish, so threadripper never races Codex's rebuild.

- wait_for_store_backfill polls backfill_state (read-only): no table or a `complete` status is ready immediately; otherwise it waits up to backfill_wait (default 10s) and then reports the store busy.
- A busy store is reported StoreOutcome::Skipped, and neither its rollout targets nor its DB are touched.
- Because rewriting the shared rollout JSONL would race a running backfill's reads (the rollout files are its source of truth), a rollout-rewriting scope (AllRows) combined with any in-progress backfill skips the whole round; --sqlite-only (RolloutScope::None) still writes its ready stores.
- main returns Partial(2) / Failed(1) accordingly; the --sqlite-only App warning now fires only when the App store was actually updated.

Reviewed in parallel by Codex and a Claude subagent. Codex caught that the first cut still rewrote the shared rollout JSONL (racing the backfill) even while skipping busy stores' DBs; fixed with the whole-round skip above, plus a regression test asserting that a shared rollout and both DBs stay untouched while a backfill runs.

53 tests, clippy and fmt clean.
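The bounded wait and the whole-round skip described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `BackfillStatus` and `Readiness` types, the `probe` closure (standing in for a read-only query of `backfill_state`), the poll interval, and both function names are assumptions made for the example. Only the rules themselves (no table or `complete` means ready, otherwise wait up to a limit and report busy; `AllRows` plus any busy store skips the round) come from the text above.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical snapshot of what a read-only look at `backfill_state` found.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackfillStatus {
    NoTable,    // table absent: nothing to wait for
    Complete,   // backfill finished
    InProgress, // Codex is still rebuilding
}

/// Hypothetical result of the bounded wait for one store.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Readiness {
    Ready,
    Busy,
}

/// Scopes named in the commit message (MismatchedRows omitted here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RolloutScope {
    AllRows,
    None,
}

/// Poll `probe` until it reports a ready state or `max_wait` elapses.
fn wait_for_backfill<F>(mut probe: F, max_wait: Duration, poll_every: Duration) -> Readiness
where
    F: FnMut() -> BackfillStatus,
{
    let deadline = Instant::now() + max_wait;
    loop {
        match probe() {
            BackfillStatus::NoTable | BackfillStatus::Complete => return Readiness::Ready,
            BackfillStatus::InProgress => {}
        }
        let now = Instant::now();
        if now >= deadline {
            return Readiness::Busy;
        }
        // Never sleep past the deadline.
        thread::sleep(poll_every.min(deadline - now));
    }
}

/// A rollout-rewriting scope combined with any busy store skips the whole round.
fn skip_whole_round(scope: RolloutScope, stores: &[Readiness]) -> bool {
    scope == RolloutScope::AllRows && stores.iter().any(|r| *r == Readiness::Busy)
}
```

With `RolloutScope::None` (the `--sqlite-only` case), `skip_whole_round` returns false even when one store is busy, so the ready stores can still be written while the busy one is reported as skipped.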
PR2 of the multi-store adaptation (builds on #16).

The one-shot write commands now reconcile every discovered store plus the shared rollout JSONL, instead of a single resolved DB.

- sync.rs: reconcile_all_stores_with_backup discovers all stores, rewrites the shared rollout JSONL once (deduped by canonical path, before any DB row is flipped), then backs up and reconciles each store's SQLite. A per-store failure is reported (StoreOutcome::Failed) without aborting healthy stores.
- A store whose rollout targets cannot be read is marked Failed and its DB is left untouched, so a DB is never flipped while its rollouts stay stale.
- Backups are namespaced per store: <db_parent>/backups/<slug>/.
- main returns ExitCode: Full -> 0, Partial -> 2, Failed -> 1.
- --sqlite-only warns that App-store SQLite-only edits may be reverted by Codex's rollout backfill (rollout JSONL is the source of truth).
- watch still uses the single-store path (MismatchedRows followup); a debug_assert guards the multi-store path against MismatchedRows misuse.
- Remove the now-unused single-store backup helpers.

Reviewed in parallel by Codex and a Claude subagent. Codex caught a real consistency hole (a rollout-collection failure was swallowed while the DB write could still succeed, reporting Full while the rollout stayed stale); fixed by failing such stores and skipping their DB write, with a regression test.

50 tests, clippy and fmt clean.
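One way to picture how per-store results could fold into the Full / Partial / Failed exit codes is the sketch below. It is an illustration under stated assumptions: `StoreOutcome::Failed` is named in the commit message, but the `Updated` variant, the `String` payload, the `RunStatus` type, both function names, and the aggregation rule itself (all stores updated gives Full, some gives Partial, none gives Failed, with an empty list treated as Failed) are guesses made for the example. Only the numeric mapping 0 / 2 / 1 is taken from the text.

```rust
use std::process::ExitCode;

/// Per-store result. `Updated` and the reason string are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
enum StoreOutcome {
    Updated,
    Failed(String),
}

/// Hypothetical overall status of one run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunStatus {
    Full,
    Partial,
    Failed,
}

/// Assumed aggregation: a failing store downgrades the run but does not hide
/// the stores that succeeded.
fn summarize(outcomes: &[StoreOutcome]) -> RunStatus {
    let updated = outcomes
        .iter()
        .filter(|o| matches!(o, StoreOutcome::Updated))
        .count();
    if updated == 0 {
        RunStatus::Failed
    } else if updated == outcomes.len() {
        RunStatus::Full
    } else {
        RunStatus::Partial
    }
}

/// Numeric mapping stated in the commit message: Full -> 0, Partial -> 2, Failed -> 1.
fn exit_code_value(status: RunStatus) -> u8 {
    match status {
        RunStatus::Full => 0,
        RunStatus::Partial => 2,
        RunStatus::Failed => 1,
    }
}

/// Convert to the type a `main` returning `ExitCode` would hand back.
fn to_exit_code(status: RunStatus) -> ExitCode {
    ExitCode::from(exit_code_value(status))
}
```

Keeping the numeric mapping in `exit_code_value` separate from the `ExitCode` conversion makes the mapping easy to assert on in tests.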
Owner / Author: ✅ Landed together with the rest of this stack via a fast-forward merge.
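Background
PR1 (#16) added read-only multi-store discovery and a multi-store `status`. PR2 makes the write path of the one-shot write commands `sync` / `bucket switch` multi-store: it normalizes the provider in both coexisting `state_5.sqlite` databases (CLI and Codex App) and rewrites the shared rollout JSONL in a single pass. The agreed design is in #14.
Changes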
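- `reconcile_all_stores_with_backup`: discover all stores → collect rollout targets across stores (deduplicated by canonical path) and rewrite them once → then, per store, back up and run `reconcile_sqlite_in_place`. A per-store failure is reduced to `StoreOutcome::Failed` and does not abort healthy stores.
- Backups are namespaced per store: `<db_parent>/backups/<slug>/state_5.<ts>.bak`. CLI and App backups never overwrite each other, and the backup is always taken before the write.
- `main` now returns `ExitCode`: Full→0 / Partial→2 / Failed→1.
- `--sqlite-only` with an App store: prints a durability warning (the rollout is the source of truth, so SQLite-only changes may be reverted by backfill).
- `watch` still uses the single-store `reconcile_once` (its MismatchedRows followup semantics are handled separately, in PR3); the multi-store path adds `debug_assert!(scope != MismatchedRows)` to pin down that precondition.
- Removed 3 obsolete single-store backup functions; no dead code remains.
Parallel review (Codex × Claude subagent)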
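- Finding (Codex): a rollout-collection failure was swallowed while `UPDATE model_provider` could still succeed → the run reported Full although the rollout was unchanged and the DB was changed, violating "rollout is the source of truth".
- Fix: `reconcile_rollouts_for_stores` returns the list of failed stores; the caller marks those stores `Failed` and skips their DB write. A new regression test, `reconcile_all_stores_fails_store_when_rollout_targets_unreadable`, fails before the fix and passes after it.
Tests / quality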
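- `cargo test`: 50 passed (3 new write-path integration tests: both stores fully updated with the shared rollout deduplicated to a single rewrite; a bad store reports Partial; a store whose rollout read fails reports Failed and its DB is left untouched).
- `cargo clippy --all-targets`: 0; `cargo fmt --check`: clean.
- A manual `sync` run against `CODEX_HOME` exited with code 1; the read-only `status` regression against the real databases passed.
Known minor issue (not introduced by this PR)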
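Backup names use a millisecond timestamp, so a repeated `sync` on the same store within 1 ms could in theory overwrite the earlier backup (existing behavior that predates PR1). Left for a separate small fix.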
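To make the backup layout and the 1 ms collision concrete, here is a small sketch of how a path of the form `<db_parent>/backups/<slug>/state_5.<ts>.bak` might be built. The function name `backup_path`, the slugs `cli` and `app`, and the use of raw epoch milliseconds as `<ts>` are assumptions for illustration; the PR only states the path pattern and that the timestamp has millisecond resolution.

```rust
use std::path::{Path, PathBuf};

/// Build `<db_parent>/backups/<slug>/state_5.<ts>.bak` for one store.
/// `ts_millis` is assumed to be a millisecond-resolution timestamp.
fn backup_path(db_path: &Path, slug: &str, ts_millis: u128) -> PathBuf {
    // Fall back to the current directory if the DB path has no parent.
    let parent = db_path.parent().unwrap_or_else(|| Path::new("."));
    parent
        .join("backups")
        .join(slug)
        .join(format!("state_5.{ts_millis}.bak"))
}
```

Under these assumptions, two stores with different slugs get different paths even for the same timestamp, while two backups of the same store taken in the same millisecond map to the same path, which is the overwrite case noted above.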