fix(fork): address Codex pre-release review #4
Merged
Resolves the four MAJORs and two non-cosmetic MINORs from the release-gate review of merged dev (commit 0482711).

MAJOR — /latest-version still leaked to upstream:
- internal/api/handlers/management/config_basic.go: latestReleaseURL was api.github.com/repos/router-for-me/CLIProxyAPI/releases/latest. The "check for update" affordance in the management UI compared zmh-v0.1.0 against upstream's v6.x and would have told users to "update to upstream". Retargeted to Z-M-Huang/CLIProxyAPI's release feed.

MAJOR — Dockerfile mandatorily COPY'd a gitignored file:
- Dockerfile: dropped the COPY static/management.html and the MANAGEMENT_STATIC_PATH env. The management asset auto-updater already (a) runs once on startup via StartAutoUpdater and (b) fetches synchronously on the first /management.html request when the local file is missing (server.go:692). The bake was an optional first-load latency optimisation, not a correctness requirement — removing it lets a clean checkout build without first running npm-build + cp.

MINOR — backend README sponsor blocks were upstream-attributed:
- README.md / README_CN.md / README_JA.md: replaced the inline sponsor table (z.ai / PackyCode / AICodeMirror / BmoPlus / Poixe / VisionCoder, all pointing at upstream's affiliate codes) with a short "this fork doesn't solicit sponsorship; see upstream README for sponsors" paragraph. Sponsor PNG assets removed alongside.

MINOR — hygiene:
- prompt_rules_openai.go: trim trailing blank line at EOF (was flagged by `git diff --check`).

Pre-conditions for tagging zmh-v0.1.0 are unchanged from the plan; the operational ones (clean working tree, frontend Actions enabled on the fork, push the tag explicitly so the local v0.1.0 tag — now deleted — can't accidentally ship) remain the user's call.

Codex review: pre-release gate, /home/ubuntu/.claude/plans/we-are-in-a-nested-emerson.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fallback-removal comment block still claimed the Dockerfile bakes the bundle, but the previous commit (7d2125a) dropped that. Update to describe the actual safety net: server.go's serveManagementControlPanel re-runs EnsureLatestManagementHTML synchronously on the first /management.html request when the local asset is missing, so a fresh container without a baked bundle still recovers on first hit. Codex re-review MINOR (post-fix-up). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Z-M-Huang added a commit that referenced this pull request on May 4, 2026
* test(bench): add Phase A benchmark harness for logging, cache, amp
Stage 1 Phase A backend baseline. Three new *_bench_test.go files
exercising the hot paths Phase C will refactor:
- internal/cache/signature_cache_bench_test.go — Get hit/miss, set,
Gemini sentinel miss, parallel hit, mixed read-heavy parallel
- internal/logging/request_logger_bench_test.go — synchronous LogRequest
on the hot path, plus disabled fast-path for the toggle
- internal/api/modules/amp/amp_bench_test.go — ForceModelMappings read
(single + parallel), restrict-read parallel; exposes 3-RWMutex
contention before the Phase C atomic snapshot refactor
scripts/baseline-backend.sh runs all three packages with -benchmem and
writes raw output to bench/baseline/backend.txt + capture metadata to
bench/baseline/backend-meta.json so before/after comparisons just diff
the text file.
Initial baseline highlights (AMD Ryzen 9 5950X, Go 1.26):
- Logging LogRequest sync: 236,258 ns/op, 42 KB/op, 103 allocs/op
- AMP ForceModelMappings: 3.7 ns/op single -> 27.85 ns/op parallel
(7.4x slowdown under contention — primary Phase C target)
- Cache Get_Hit: 242 ns/op single -> 365 ns/op parallel
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
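For orientation, the harness's hot-path benchmarks follow the standard Go parallel-benchmark shape. A minimal sketch, assuming a benchMap stand-in (the real files drive internal/cache's exported API, not a bare sync.Map):

```go
package cache

import (
	"sync"
	"testing"
)

// benchMap stands in for the real signature cache in this sketch.
var benchMap sync.Map

func BenchmarkGet_Hit_Parallel(b *testing.B) {
	benchMap.Store("model-a", "sig")
	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			if v, _ := benchMap.Load("model-a"); v == nil {
				b.Error("expected hit") // Error (not Fatal) is safe from RunParallel goroutines
			}
		}
	})
}
```

Run with `go test -bench . -benchmem` to capture the ns/op and allocs/op columns quoted above; the baseline script simply redirects that output into bench/baseline/backend.txt.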
* test(cache): pre-write semantics tests + injectable clock for Phase C
Stage 1 Phase A safety net for the Phase C LRU swap. Adds an
injectable 'clock' package var (defaults to time.Now in production)
so TTL-passage tests don't sleep, and a new semantics test file
covering the four guarantees the LRU swap must preserve:
1. Sliding TTL on read (refresh-on-access). Verified by writing at
t=0, reading at t=2h to refresh, then reading at t=4h: hit
(2h < 3h TTL gap from refresh) instead of miss.
2. Gemini miss sentinel ("skip_thought_signature_validator")
returned for empty / miss / expired keys when the model group
resolves to "gemini". Other groups return empty.
3. ClearSignatureCache(modelName) clears only the matching group,
leaves the others intact.
4. ClearSignatureCache("") full clear of every group.
Production behavior unchanged. The clock var is unexported; only
in-package tests can override it. Phase C swap is gated by these
tests passing.
Existing tests + race detector still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
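A minimal sketch of the injectable-clock pattern described above; only the `clock` package var mirrors the commit, while the entry type and ttl constant are illustrative:

```go
package cache

import "time"

// clock is the unexported, test-injectable time source. Production code
// never overrides it, so the time.Now default keeps runtime behavior
// identical.
var clock = time.Now

const ttl = 3 * time.Hour

type entry struct {
	value    string
	lastSeen time.Time
}

// expired measures the sliding TTL from the last access via clock(),
// so a test can fake 4h of elapsed time without sleeping:
//
//	base := time.Now()
//	clock = func() time.Time { return base.Add(4 * time.Hour) }
func (e *entry) expired() bool {
	return clock().Sub(e.lastSeen) > ttl
}
```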
* test(amp): pre-write race + captured-mapper-staleness tests
Stage 1 Phase A safety net for the Phase C atomic snapshot refactor of
internal/api/modules/amp/amp.go.
Race tests (under go test -race):
- ForceModelMappings_ToggleUnderConcurrentReads
- RestrictToLocalhost_ToggleUnderConcurrentReads
- ModelMappings_UpdateUnderConcurrentReads
- LastConfig_OnConfigUpdated_Concurrent (full mix)
- ParallelObservedBothStates (sanity that the parallel path actually
flips, so we know the contention path is exercised)
Staleness tests (the captured-pointer correctness invariant):
- CapturedMapperSeesUpdates — exact mappings
- CapturedMapperSeesRegexUpdates — regex mappings
The staleness tests pin the closure-capture concern Codex flagged in
the second-pass review: routes.go and fallback_handlers.go capture
m.modelMapper at registration, and the current code holds because
UpdateMappings mutates in place (pointer stays stable). After Phase C,
if the snapshot refactor swaps modelMapper pointers along with the
routing table, captured handler closures go stale. These tests catch
that regression and gate the refactor.
Mapper internals (GetMappings, mapper.regexps) are read directly
rather than via MapModel because MapModel filters via
util.GetProviderName which depends on per-process provider state not
set up here. The staleness invariant is purely about the mapper's
internal table reflecting updates.
go test -race ./internal/api/modules/amp passes cleanly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(executor): pre-write SSE chunk-boundary tests for codex_executor
Stage 1 Phase A: pin the streaming-pipeline correctness invariants the
Phase C streaming/logging refactors must preserve.
Two scenarios:
1. ChunkBoundary_MidLineSplitFlushes — server sends a single SSE event
in four mid-line splits with explicit Flush() between them. The
executor's bufio.Scanner must reassemble the line and emit the same
downstream chunks as the same event sent in one write. Compares
fragmented run vs single-write reference run for byte-identical
downstream output.
2. ChunkBoundary_MultipleEventsSingleWrite — server writes multiple
complete SSE events in one upstream write. The executor must emit
a non-empty chunk batch and at least one chunk must carry the
completion marker. Catches collapse-events-to-one regressions.
Note: two pre-existing tests in this package
(TestEnsureAccessToken_WarmTokenLoadsCreditsHint,
TestUpdateAntigravityCreditsBalance_LoadCodeAssistUserAgent) fail
against upstream/dev independently of these changes — they make real
requests to daily-cloudcode-pa.googleapis.com and the fixture host
resolution fails in this sandbox. Not introduced by this commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
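A reduced illustration of the mid-line-split idea, assuming nothing about codex_executor's real API: the server fragments one SSE line across writes with explicit flushes, and a client-side bufio.Scanner must still reassemble it.

```go
package executor

import (
	"bufio"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
)

func TestSSE_MidLineSplit(t *testing.T) {
	// One SSE event, deliberately split mid-line into four writes.
	fragments := []string{"data: {\"delta\":", "\"hel", "lo\"}\n", "\n"}
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		f := w.(http.Flusher)
		for _, frag := range fragments {
			fmt.Fprint(w, frag)
			f.Flush() // force each fragment onto the wire separately
		}
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()

	sc := bufio.NewScanner(resp.Body)
	var lines []string
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if len(lines) == 0 || !strings.HasPrefix(lines[0], "data: ") {
		t.Fatalf("fragmented SSE line not reassembled: %q", lines)
	}
}
```

The real tests additionally compare the fragmented run against a single-write reference run for byte-identical downstream output.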
* test(api): pre-write hot-reload race test for Phase C atomic.Pointer
Stage 1 Phase A safety net for the Phase C atomic.Pointer[Config] +
clone-modify-persist-swap refactor of internal/api/Server.
Verified the race exists today: with t.Skip removed, go test -race
./internal/api fires the race detector on s.cfg field reads racing
against s.cfg = newCfg in UpdateClients. The test body uses two
reader probes that mirror real cfg-reading code paths today:
1. Direct field reads (s.cfg.AuthDir, .Debug, .LoggingToFile),
mirroring patterns at server.go:408,422,436,450,672,814.
2. /management.html serving (server.go:671 reads s.cfg directly).
Plus a writer goroutine calling UpdateClients with rotating Config
values to drive the swap. Race trips reproducibly within ~80 ms.
Test currently t.Skip()'d so -race CI stays green during Stage 1
Phase B/C/D. Phase C must:
- Move config ownership into Server behind atomic.Pointer.
- Provide an atomic accessor used by all readers (mgmt handlers,
OAuth callbacks, request handlers, /management.html).
- Apply clone-modify-persist-swap in mgmt handler write paths
(PutDebug, PutAmpUpstreamURL, PutAmpModelMappings, etc.).
- Delete the t.Skip line and validate the test passes -race.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf(cache): replace unbounded sync.Map with bounded LRU + sliding TTL
Stage 1 Phase C item — fix internal/cache/signature_cache.go's
unbounded growth.
Per-group cache is now a doubly-linked-list LRU (capacity 10,000) with
sliding TTL preserved on read, gated by the Phase A semantic tests:
- sliding TTL on read (timestamp refresh + move-to-front)
- Gemini miss sentinel (empty/miss/expired)
- ClearSignatureCache(modelName) group-clear
- ClearSignatureCache("") full clear
Periodic background purge still runs every 10 minutes to reclaim
groups that nobody reads.
Bench delta vs Phase A baseline (AMD Ryzen 9 5950X, Go 1.26):
Get_Hit 242.5 → 226.5 ns/op (-6.6%)
Get_Miss 18.25 → 18.62 ns/op (noise)
Get_GeminiSentinelMiss 23.73 → 24.40 ns/op (noise)
Set 1222 → 569.2 ns/op (-53%, biggest win)
Get_Hit_Parallel 365.3 → 325.1 ns/op (-11%)
Mixed_ReadHeavy_Parallel 484.2 → 464.2 ns/op (-4%)
Set is much faster because the previous map-grew-without-bound path
paid a delete cost on each set when groups got large; the LRU's O(1)
tail eviction replaces that.
Allocations: Set 4 → 5 allocs/op (LRU node is now a heap-allocated
struct vs the previous value-type SignatureEntry in the map). +24
bytes/op. Acceptable given the perf gain and the bounded-memory fix.
go test -race ./internal/cache passes. The Phase A
signature_cache_semantics_test.go file (sliding TTL, Gemini sentinel,
group clear, full clear) and the existing happy-path tests all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
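A compact sketch of the per-group structure described above, using container/list for brevity; names and locking granularity are illustrative, not the actual internal/cache code:

```go
package cache

import (
	"container/list"
	"sync"
	"time"
)

type lruEntry struct {
	key      string
	value    string
	lastSeen time.Time
}

// lruGroup: a map for O(1) lookup plus a recency-ordered list
// (front = most recently used).
type lruGroup struct {
	mu    sync.Mutex
	cap   int
	items map[string]*list.Element
	order *list.List
}

func newLRUGroup(capacity int) *lruGroup {
	return &lruGroup{cap: capacity, items: make(map[string]*list.Element), order: list.New()}
}

// Get refreshes the sliding TTL on hit: timestamp update + move-to-front.
func (g *lruGroup) Get(key string, ttl time.Duration) (string, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	el, ok := g.items[key]
	if !ok {
		return "", false
	}
	e := el.Value.(*lruEntry)
	if time.Since(e.lastSeen) > ttl {
		g.order.Remove(el)
		delete(g.items, key)
		return "", false
	}
	e.lastSeen = time.Now() // sliding TTL: reads push expiry out
	g.order.MoveToFront(el)
	return e.value, true
}

// Set inserts or updates, evicting the list tail in O(1) at capacity.
func (g *lruGroup) Set(key, value string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if el, ok := g.items[key]; ok {
		e := el.Value.(*lruEntry)
		e.value, e.lastSeen = value, time.Now()
		g.order.MoveToFront(el)
		return
	}
	if g.order.Len() >= g.cap {
		tail := g.order.Back()
		g.order.Remove(tail)
		delete(g.items, tail.Value.(*lruEntry).key)
	}
	g.items[key] = g.order.PushFront(&lruEntry{key: key, value: value, lastSeen: time.Now()})
}
```

The O(1) tail eviction in Set is what replaces the old delete-on-set cost and drives the -53% Set improvement quoted above.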
* perf(logging): async request logger with priority lane for forced errors
Stage 1 Phase C item — take file-I/O off the request path.
Adds internal/logging/async_emitter.go: a single-worker pipeline with
two channels — a normal-priority queue (drop-newest on overflow) and a
priority queue for forced-error logs that NEVER drops. The worker
drains priority first, then normal. Close() signals stop and drains
both queues before the worker exits.
FileRequestLogger.enabled becomes atomic.Bool (was a plain bool, racy
under hot-reload). Adds DroppedLogs() counter, Flush() for tests, and
Close() for graceful shutdown.
LogRequest / LogRequestWithOptions now enqueue and return immediately;
the actual file write happens off-path. Forced-error logs that find a
full priority queue fall back to a synchronous write so observability
is preserved — they never drop, even in extreme pressure.
Bench delta vs Phase A baseline (AMD Ryzen 9 5950X):
LogRequest sync: 236,258 ns/op → 34.18 ns/op (-99.99%, 6,910x)
42 KB/op → 6 B/op
103 allocs → 0 allocs
Disabled fast-path: 7.027 ns → 18.21 ns (atomic.Load
overhead, +11ns)
The Disabled-path slowdown is the cost of swapping the racy plain-bool
for an atomic.Bool; absolute cost is still ~18ns and the path is a
guard, not a hot loop.
Tests added (all -race clean):
- NormalPath_WritesFile
- ForcedPath_WritesEvenWhenDisabled (normal call drops, forced writes)
- NormalQueueOverflow_IncrementsDropped
- ForcedFallsBackToSyncWhenPriorityFull (verifies the no-drop guarantee)
- AtomicEnabledToggle_RaceFree (4 readers + 4 writers under -race)
- CloseDrainsPendingTasks (64 queued tasks all written after Close)
Phase A's existing tests still pass; the bench harness from Phase A
still runs (it now exercises the async path and shows the 6,910x win).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
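A sketch of the two-lane worker shape (names are illustrative, not the emitter's actual API):

```go
package logging

import "sync/atomic"

type asyncSketch struct {
	normal   chan func()
	priority chan func()
	closeCh  chan struct{}
	dropped  atomic.Int64
}

func (e *asyncSketch) worker() {
	for {
		// Drain priority first so forced-error logs never wait
		// behind normal traffic.
		select {
		case task := <-e.priority:
			task()
			continue
		default:
		}
		select {
		case task := <-e.priority:
			task()
		case task := <-e.normal:
			task()
		case <-e.closeCh:
			for { // drain both queues before exiting
				select {
				case task := <-e.priority:
					task()
				case task := <-e.normal:
					task()
				default:
					return
				}
			}
		}
	}
}

// enqueue never blocks the request path: normal tasks drop on overflow
// (counted); forced tasks signal the caller to write synchronously so
// the never-drop guarantee holds.
func (e *asyncSketch) enqueue(task func(), forced bool) (syncFallback bool) {
	if forced {
		select {
		case e.priority <- task:
			return false
		default:
			return true
		}
	}
	select {
	case e.normal <- task:
	default:
		e.dropped.Add(1)
	}
	return false
}
```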
* perf(amp): replace 3 RWMutexes with atomic.Pointer[routingTable] snapshot
Stage 1 Phase C item — kill mutex contention on the AMP routing read path.
The AmpModule held three separate RWMutexes (proxyMu, restrictMu, configMu)
to guard hot-reloadable state used on every Amp request. Under load each
read paid the RWMutex.RLock atomic CAS + memory barrier on three different
locks. Replaced with a single atomic.Pointer[routingTable] snapshot
containing {enabled, proxy, restrictToLocalhost, secretSource, lastConfig}
plus a sync.Mutex (writeMu) that serializes Register / OnConfigUpdated /
setters. Readers now do a single atomic load and use the captured snapshot;
writers clone-modify-swap.
modelMapper is intentionally NOT in the snapshot. Its pointer is stable
for the module's lifetime (set once in Register), and DefaultModelMapper
guards its mapping table with its own RWMutex via UpdateMappings. This
preserves the captured-pointer pattern in routes.go and
fallback_handlers.go — handlers that captured m.modelMapper at
registration time continue to see hot-reload updates without re-fetching.
The Phase A TestAmpStaleness_CapturedMapperSeesUpdates test pins this
invariant.
Bench delta vs Phase A baseline (AMD Ryzen 9 5950X, sink-protected):
ForceModelMappings_Read (single-thread): 3.744 ns → 1.715 ns (-54%)
ForceModelMappings_Read_Parallel-32: 27.85 ns → ~1.8 ns (-94%)
RestrictRead_Parallel-32: 23.72 ns → ~1.3 ns (-94%)
Both parallel paths comfortably beat the plan's -80% mutex contention
target. The single-thread read also halves because the atomic load is
one instruction vs RWMutex.RLock's CAS-with-barrier sequence.
Behavior is preserved:
- Hot-reload of upstream URL, API key, model mappings, force flag,
localhost restriction, per-client API key mappings — all still
observable to in-flight requests.
- Captured-pointer staleness: handlers see modelMapper updates because
UpdateMappings still mutates in place.
- Failed proxy creation in Register still leaves lastConfig +
restrictToLocalhost visible (matching the original ordering).
Tests:
- All Phase A race + staleness tests pass under -race
(TestAmpRace_*, TestAmpStaleness_CapturedMapperSeesUpdates,
TestAmpStaleness_CapturedMapperSeesRegexUpdates).
- All upstream functional tests pass (TestAmpModule_*, TestRegister*,
TestLocalhostOnlyMiddleware_*).
- Upstream tests that constructed AmpModule via struct literal with
moved fields (`enabled`, `restrictToLocalhost`, `lastConfig`,
`secretSource`) updated to use snapshot accessors; assertions
unchanged.
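A sketch of the snapshot pattern, with an illustrative routingTable reduced to three fields; the constructor is assumed to store an initial table before any reader runs:

```go
package amp

import (
	"net/http/httputil"
	"sync"
	"sync/atomic"
)

type routingTable struct {
	enabled             bool
	proxy               *httputil.ReverseProxy
	restrictToLocalhost bool
}

type moduleSketch struct {
	table   atomic.Pointer[routingTable]
	writeMu sync.Mutex // serializes writers only; readers never lock
}

func newModuleSketch() *moduleSketch {
	m := &moduleSketch{}
	m.table.Store(&routingTable{}) // seed so readers never see nil
	return m
}

// Read path: one atomic load, then use the captured snapshot.
func (m *moduleSketch) restricted() bool {
	return m.table.Load().restrictToLocalhost
}

// Write path: clone-modify-swap under writeMu.
func (m *moduleSketch) setRestrict(v bool) {
	m.writeMu.Lock()
	defer m.writeMu.Unlock()
	next := *m.table.Load() // shallow clone of the current snapshot
	next.restrictToLocalhost = v
	m.table.Store(&next)
}
```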
* perf(api): atomic.Pointer[Config] in Server + clone-modify-persist-swap mgmt writes
Stage 1 Phase C item — kill the hot-reload data race the Phase A test caught.
Server.cfg was a plain *config.Config replaced by UpdateClients via direct
pointer assignment, with concurrent readers (OAuth callbacks, /management.html
serving, request handlers) reading fields off the shared pointer without any
synchronization. Mgmt PUT/PATCH/DELETE handlers also mutated config fields in
place while readers iterated. Both raced under -race; the Phase A
TestServer_ConfigHotReload_NoRaceUnderConcurrentReads test reproduced the
former.
This commit:
internal/api/server.go
- Replace `cfg *config.Config` field with `cfgPtr atomic.Pointer[config.Config]`.
- Add `Server.Config() *config.Config` accessor used by all readers.
- All s.cfg.X reads (OAuth callback handlers, TLS init, /management.html
serving) go through s.Config(); UpdateClients ends with cfgPtr.Store(cfg).
- Wire a config-commit hook into the mgmt handler at server construction
that delegates to s.UpdateClients, so a successful clone-modify-persist
triggers the full fan-out (log level, request logger toggle, AMP module
OnConfigUpdated, signature cache reconfig, etc.) before the response
returns to the caller.
internal/api/handlers/management/handler.go
- Replace `cfg *config.Config` field with `cfgPtr atomic.Pointer[config.Config]`.
- Add `cfg() *config.Config` accessor returning the loaded snapshot.
- Add `applyConfigChange(c, fn)` helper: clones the snapshot under h.mu,
applies fn to the clone, persists to disk, atomic-stores into cfgPtr,
invokes the optional commit hook, and writes the response.
- Add `SetConfigCommitter` setter so Server can wire UpdateClients fan-out.
- SetConfig becomes a lockless atomic.Store (no longer needs h.mu).
- cloneConfigSnapshot uses yaml round-trip — heavier than a manual copy
but resilient to new config.Config fields without forgotten clones, and
mgmt writes already pay disk I/O so the marginal cost is irrelevant.
- Drop persist / persistLocked: legacy mutate-in-place is fundamentally
incompatible with snapshot semantics. All callers migrate to
applyConfigChange.
internal/api/handlers/management/config_basic.go, config_lists.go,
quota.go, handler_test.go, api_tools_test.go,
config_lists_delete_keys_test.go
- Helper signatures updateBoolField / updateIntField / updateStringField
take *config.Config; callbacks mutate the clone instead of h.cfg().
- Generic helpers putStringList / patchStringList / deleteFromStringList
take a `func(*config.Config) *[]string` accessor that resolves to a
slice pointer in the clone, and call applyConfigChange internally.
- Every Put*/Patch*/Delete* handler that previously did
`h.cfg.X = v; h.persist(c)` (or the locked variant) is rewritten to
`h.applyConfigChange(c, func(cfg *config.Config) { cfg.X = v })`.
Behavior preserved including 404 short-circuits before clone, sanitize
invariants, and per-key delete semantics.
- PutConfigYAML still writes the raw YAML to disk and reloads, but the
reload now goes through cfgPtr.Store + the commit hook.
- Reads of h.cfg.X across all mgmt files become h.cfg().X.
- Test files updated where they constructed Handler with `cfg:` field
literal (now an atomic.Pointer); they use `&Handler{}` + h.SetConfig
to seed initial state. Test assertions unchanged.
internal/api/server_hotreload_race_test.go
- Un-skip the Phase A gating test. Reader pool now reads via Server.Config()
(the new accessor); writer pool calls Server.UpdateClients with fresh
Config allocations. Passes under -race.
Behavior preserved:
- All API request/response shapes unchanged.
- File-watcher reload path unchanged: still loads YAML from disk and calls
Server.UpdateClients which atomic-swaps + fans out.
- Mgmt PUT response no longer mutates a shared snapshot; the response
body waits for the commit hook (Server.UpdateClients) to complete fan-out
before returning, matching the prior in-place semantics.
- Logs/auth/AMP/cache reconfig fires on every mgmt write that takes effect
(previously fired only when the file watcher saw the disk change).
Pre-existing race in internal/api/handlers/management is unchanged: gin's
SetMode/IsDebugging globals race across t.Parallel() tests in upstream code.
This is Codex/Phase C exit material to flag — out of scope for #23.
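A sketch of the applyConfigChange flow described above, reduced to a one-field Config; the persist stub elides the YAML write, and per the round-2 fix later in this PR, h.mu is held through the commit hook:

```go
package management

import (
	"sync"
	"sync/atomic"
)

// Config stands in for config.Config in this sketch.
type Config struct{ Debug bool }

type handlerSketch struct {
	mu     sync.Mutex
	cfgPtr atomic.Pointer[Config]
	commit func(*Config) // fan-out hook (Server.UpdateClients in the real code)
}

func (h *handlerSketch) applyConfigChange(fn func(*Config)) error {
	h.mu.Lock() // serializes writers, held through commit
	defer h.mu.Unlock()
	clone := *h.cfgPtr.Load() // 1. clone the snapshot (real code deep-clones via yaml round-trip)
	fn(&clone)                // 2. mutate only the clone
	if err := persist(&clone); err != nil {
		return err // 3. a failed disk write never swaps the snapshot
	}
	h.cfgPtr.Store(&clone) // 4. atomic swap: readers see old or new, never torn
	if h.commit != nil {
		h.commit(&clone) // 5. fan-out completes before the response returns
	}
	return nil
}

func persist(*Config) error { return nil } // disk write elided in this sketch
```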
* perf(auth): exponential backoff on per-credential refresh failures
Stage 1 Phase C item — finish the per-credential refresh cooldown story.
Per-credential cooldown via Auth.NextRefreshAfter already existed (the
auto-refresh loop calls markRefreshPending which sets refreshPendingBackoff,
and refreshAuth's failure path set NextRefreshAfter = now + 5min on every
failure). What was missing was exponential growth: a credential that had
been broken for hours kept getting hammered every 5 minutes regardless of
how many failures had accumulated.
This commit adds:
internal: nextRefreshFailureBackoff(failures int) time.Duration computes
the cooldown for the Nth consecutive failure: base × 2^(N-1), capped at
refreshFailureBackoffMax = 1 hour. Curve:
failures=1 -> 5m (base)
failures=2 -> 10m
failures=3 -> 20m
failures=4 -> 40m
failures>=5 -> 60m (cap)
Manager.refreshFailures map[string]int (in-memory, not persisted) tracks
consecutive failure counts per auth.ID. Mutated under m.mu alongside
m.auths. After a successful refresh the entry is deleted, so a future
failure restarts the curve at 5m rather than continuing the prior
exponential growth — the credential's history is "forgotten" on success.
Not persisted because (a) cross-restart memory of historical refresh
failures isn't useful when the credential file may have been re-imported
or rotated, and (b) the existing Auth schema is unchanged so an upgrade
doesn't require a migration.
Tests:
- TestNextRefreshFailureBackoff_ExponentialGrowth pins the curve.
- TestManager_RefreshAuth_FailureExponentialBackoff drives 4 sequential
failures through a fake executor and asserts NextRefreshAfter grows
on each attempt.
- TestManager_RefreshAuth_SuccessResetsFailureCounter drives 3 failures,
then a success, then a failure — asserts the post-success failure
backoff is the BASE 5m (not the 40m it would have been without reset).
All pass under go test -race.
Behavior preserved:
- First failure backoff is unchanged (5m), so existing single-failure
test expectations still hold.
- Successful-refresh path still sets NextRefreshAfter = zero (or the
refreshIneffectiveBackoff sentinel when shouldRefresh is still true).
- Auth JSON schema is unchanged; the failure counter is in-memory.
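A sketch of the backoff curve as described (the constants match the commit message; the function body is an illustration, not the actual source):

```go
package auth

import "time"

const (
	refreshFailureBackoffBase = 5 * time.Minute
	refreshFailureBackoffMax  = time.Hour
)

// nextRefreshFailureBackoff computes the cooldown for the Nth
// consecutive failure: base × 2^(N-1), capped at one hour.
func nextRefreshFailureBackoff(failures int) time.Duration {
	if failures < 1 {
		failures = 1
	}
	d := refreshFailureBackoffBase << (failures - 1)
	if d > refreshFailureBackoffMax || d <= 0 { // d <= 0 guards shift overflow
		return refreshFailureBackoffMax
	}
	return d
}
```

Plugging in: failures=1 gives 5m, 2 gives 10m, 3 gives 20m, 4 gives 40m, and anything from 5 up saturates at the 60m cap, matching the curve above.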
* feat(managementasset): ReleaseURLProvider interface seam for fork override
Stage 1 Phase C item — introduce a small extension seam so the Stage 2 fork
can swap the management-asset release URL constants without modifying this
file. The seam lands in Stage 1 because it touches the same constants the
fork needs to override; deferring the seam to Stage 2 would force a
Stage-1-vs-Stage-2 conflict in the same file.
Adds ReleaseURLProvider interface with ReleaseURL() / FallbackURL() methods
and a default impl that returns the upstream constants unchanged:
defaultManagementReleaseURL = "https://api.github.com/repos/router-for-me/Cli-Proxy-API-Management-Center/releases/latest"
defaultManagementFallbackURL = "https://cpamc.router-for.me/"
The active provider lives in a sync/atomic.Value wrapped in *providerHolder
so concurrent Stores satisfy atomic.Value's same-concrete-type rule.
SetReleaseURLProvider(nil) restores the upstream defaults; init() seeds with
the default so the seam is transparent until the fork overrides it.
Wires the seam through three call sites that previously read the constants:
- resolveReleaseURL: when no per-config repository override is supplied,
routes through currentReleaseURL() instead of the constant directly.
Per-config overrides (cfg.RemoteManagement.PanelGitHubRepository) still
win over the provider default.
- fetchLatestAsset: empty releaseURL parameter falls back to
currentReleaseURL() instead of the constant.
- ensureFallbackManagementHTML: downloads from currentFallbackURL().
Stage 2 fork code will register a custom provider via init(), pointing the
auto-updater at the fork's release artifact location. No further changes
to this file required.
Tests (all -race clean):
- TestDefaultReleaseURLProvider_MatchesUpstreamConstants pins the contract
that the default impl returns the upstream URLs byte-for-byte.
- TestSetReleaseURLProvider_OverridesAndRestores verifies override + nil
restore semantics.
- TestCurrentReleaseURL_FallsBackOnEmpty pins safety against an
incomplete provider returning empty strings.
- TestResolveReleaseURL_UsesProviderDefault verifies the seam routes
through the provider when no per-config repo is set, and that an
explicit per-config repo still overrides the provider default.
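A sketch of the seam; the URLs are the upstream constants quoted above, and the holder indirection is what satisfies atomic.Value's same-concrete-type rule:

```go
package managementasset

import "sync/atomic"

type ReleaseURLProvider interface {
	ReleaseURL() string
	FallbackURL() string
}

// defaultProvider returns the upstream constants unchanged.
type defaultProvider struct{}

func (defaultProvider) ReleaseURL() string {
	return "https://api.github.com/repos/router-for-me/Cli-Proxy-API-Management-Center/releases/latest"
}
func (defaultProvider) FallbackURL() string { return "https://cpamc.router-for.me/" }

// providerHolder wraps the interface so atomic.Value always stores the
// same concrete type (*providerHolder) regardless of the provider impl.
type providerHolder struct{ p ReleaseURLProvider }

var active atomic.Value

func init() { active.Store(&providerHolder{p: defaultProvider{}}) }

// SetReleaseURLProvider installs a fork override; nil restores defaults.
func SetReleaseURLProvider(p ReleaseURLProvider) {
	if p == nil {
		p = defaultProvider{}
	}
	active.Store(&providerHolder{p: p})
}

func currentReleaseURL() string {
	if u := active.Load().(*providerHolder).p.ReleaseURL(); u != "" {
		return u
	}
	return defaultProvider{}.ReleaseURL() // safety net for an incomplete provider
}
```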
* test(bench): Phase C exit benchmarks + logger Cleanup hook
Captures the Phase C exit benchmark numbers in bench/phase-c/backend.txt
alongside a SUMMARY.md that maps results back to the plan's targets and
documents trade-offs. Phase A baseline at bench/baseline/backend.txt is
preserved unmodified for diff comparison.
Also fixes a benchmark teardown bug exposed by the async-emitter refactor:
BenchmarkFileRequestLogger_LogRequest's b.TempDir would be removed before
the async writer worker drained, producing "unlinkat ... directory not
empty" errors. Adds b.Cleanup(logger.Close) to both logger benchmarks so
the worker drains before TempDir cleanup.
Phase C exit results vs Phase A baseline (AMD Ryzen 9 5950X,
go1.26.0 linux/amd64):
AMP mutex contention (parallel reads): 27.85 ns -> 0.06 ns (-99.8%, target -80%)
AMP single-thread read: 3.744 ns -> 1.725 ns (-54%)
Request logger off-path: 236,258 ns -> 35.41 ns (-99.985%)
Request logger allocs/req: 103 -> 0 (-100%)
Signature cache Set (bounded LRU): 1,222 ns / 419 B -> 592.9 ns / 232 B (-51% / -45%)
Signature cache Get_Hit: 242.5 ns -> 219.5 ns (-9.5%; plan target was within ±5%)
Signature cache Get_Hit_Parallel: 365.3 ns -> 315.3 ns (-13.7%)
All plan targets met. The Disabled-fast-path regression (+12 ns from
swapping plain bool to atomic.Bool) is documented in the #21 commit and
is the right race-fix trade-off.
Behavior preservation: all Phase A pre-written gating tests still pass
under -race (signature_cache_semantics_test, amp_race_test, hot-reload
test, codex SSE chunk-boundary). Pre-existing gin global-state races in
internal/api/handlers/management remain — they exist on clean upstream/dev
and are out of scope for Phase C.
Next step: run /dev-buddy-once Codex CLI gpt-5.5 light review on
refactor/upstream-bound vs upstream/dev. Address findings. Then Phase D.
* fix(api,logging,amp,handlers): address Codex Phase C review BLOCKER + IMPORTANT findings
Codex CLI gpt-5.5 light review on the BE diff (refactor/upstream-bound vs
upstream/dev) flagged 4 BLOCKERs and 4 IMPORTANTs that need addressing
before Phase C exit. This commit fixes all of them.
BLOCKER #1 — Management writes deadlock
internal/api/handlers/management/handler.go
applyConfigChange held h.mu while invoking the commit hook. The hook
calls Server.UpdateClients which calls s.mgmt.SetAuthManager, and the
old SetAuthManager re-took h.mu — Go's sync.Mutex is non-reentrant,
so every mgmt config write hung. Fix: change Handler.authManager from
a plain pointer to atomic.Pointer[coreauth.Manager] so SetAuthManager
is lock-free; release h.mu before invoking the commit hook so the
fan-out can re-enter handler methods without deadlocking. Reads
scattered across the package always observed an unsynchronized
h.authManager — the lock was paranoia, not correctness.
BLOCKER #2 — BaseAPIHandler.Cfg race
sdk/api/handlers/handlers.go
sdk/api/handlers/stream_forwarder.go
sdk/api/handlers/gemini/gemini-cli_handlers.go
sdk/api/handlers/openai/openai_images_handlers.go
Server.cfgPtr was atomic but request handlers still read
BaseAPIHandler.Cfg as a plain pointer that UpdateClients overwrote
on every config hot reload. The plan mandated atomic snapshots for
ALL request readers; this finishes that work. Cfg field replaced
with cfgPtr atomic.Pointer[config.SDKConfig] + Config() accessor.
All 14 internal h.Cfg reads migrated to h.Config().
BLOCKER #3 — Forced error logs after Close
internal/logging/async_emitter.go
internal/logging/async_emitter_test.go (new regression test)
After Close drained the worker, subsequent forced enqueues landed in
the buffered priority channel with no consumer left to drain them.
The "forced error logs never drop" invariant was silently broken.
Fix: closed atomic.Bool set BEFORE close(closeCh). enqueue checks
the flag and returns syncFallback=true for force tasks (sync write)
or drops normal tasks (with counter increment) post-close.
TestAsyncEmitter_ForcedAfterCloseWritesSynchronously pins this.
BLOCKER #4 — AMP MultiSourceSecret.explicitKey race
internal/api/modules/amp/secret.go
UpdateExplicitKey wrote explicitKey under s.mu; Get read it without
s.mu on every request. Fix: explicitKey becomes
atomic.Pointer[string]. Get loads via atomic; UpdateExplicitKey
stores via atomic. Cache reset still uses s.mu since cache writes
are off the hot path.
IMPORTANT #5 — Stale-index pre-resolution in mgmt
internal/api/handlers/management/handler.go
internal/api/handlers/management/config_lists.go
Several handlers resolved an index from one snapshot, then
applyConfigChange cloned a (possibly newer) snapshot in which the
pre-resolved index pointed at a different row. Fix: resolution moved
INSIDE the applyConfigChange closure against the cloned config.
applyConfigChange now detects c.Writer.Written() after fn returns —
if the closure already wrote 404 or 400, the persist + commit + 200
is skipped. Affected handlers: PatchGeminiKey, DeleteGeminiKey
(api-key form), PatchClaudeKey, DeleteClaudeKey (api-key form),
PatchOpenAICompat, PatchVertexCompatKey, DeleteVertexCompatKey
(api-key form), PatchOAuthExcludedModels (removal path),
DeleteOAuthExcludedModels, PatchOAuthModelAlias (removal path),
DeleteOAuthModelAlias, PatchCodexKey, DeleteCodexKey (api-key form).
IMPORTANT #6 — UpdateClients fan-out not serialized
internal/api/server.go
UpdateClients read and wrote oldConfigYaml without serialization, so
a file-watcher reload firing concurrently with a mgmt commit could
compute deltas from the wrong "old" config. Fix: Server.updateMu
serializes the entire UpdateClients body. Mgmt commit hooks no
longer hold h.mu so they can take updateMu without deadlocking.
IMPORTANT #7 — yaml clone shape change
internal/api/handlers/management/handler.go
cloneConfigSnapshot used yaml.v3 round-trip and silently returned an
empty &config.Config{} on marshal/unmarshal errors. yaml.v3 also
canonicalised some nil maps/slices to empty values, changing JSON
response shape after unrelated writes. Fix: switch to encoding/json
round-trip (preserves nil-vs-empty per omitempty tags) and return
error so applyConfigChange fails with a 500 instead of silently
persisting an empty snapshot.
IMPORTANT #8 — errorLogsMaxFiles plain-int race
internal/logging/request_logger.go
Read by cleanupOldErrorLogs on the async writer goroutine; written
by SetErrorLogsMaxFiles on hot reload. Fix: atomic.Int64; Set/load
via atomic.
NIT #9 — logs_dropped_total metric not exposed
Deferred — there's no metrics system in scope on this branch. The
DroppedLogs() accessor is wired and the counter exists; surfacing
it as a Prometheus/etc. metric belongs to a separate task.
Verification:
- go build ./... clean.
- go vet ./... clean.
- All Phase A gating tests + new B3 regression test pass under -race
(signature_cache_semantics, amp_race, amp_staleness, hot-reload,
SSE chunk-boundary, async_emitter forced-after-close).
- Pre-existing failures unchanged (gin SetMode global race in mgmt
parallel tests, TestDefaultRequestLoggerFactory, TestCodexFreeModelsExcludeGPT55,
TestEnsureAccessToken_WarmTokenLoadsCreditsHint,
TestUpdateAntigravityCreditsBalance_LoadCodeAssistUserAgent — all
fail on clean upstream-bound HEAD too).
* fix: address Codex Phase C re-review BLOCKER + IMPORTANT findings (round 2)
Round-2 review of commit e9b8bbe3 surfaced 3 new BLOCKERs introduced by
the round-1 fixes plus 2 IMPORTANTs (1 missed instance, 1 fresh observation).
BLOCKER #1 (round 2) — Mgmt commits could reorder
internal/api/handlers/management/handler.go
Round 1's deadlock fix released h.mu before the commit hook, which
let two concurrent mgmt writes A and B interleave: A swap A unlock,
B swap B commit (UpdateClients(B)), A commit (UpdateClients(A)) —
rolling Server fan-out state back to A while disk holds B.
Fix: hold h.mu through commit. Now safe because both Handler.SetConfig
and Handler.SetAuthManager are lock-free atomic.Pointer stores;
commit's transitive call into SetAuthManager no longer re-locks
h.mu.
BLOCKER #2 (round 2) — JSON clone dropped json:"-" persisted fields
internal/api/handlers/management/handler.go
Round 1 swapped cloneConfigSnapshot's yaml round-trip for json
round-trip to surface marshal errors and avoid yaml.v3 nil-vs-empty
canonicalisation. But config.Config has Host, Port, RemoteManagement,
AuthDir tagged `yaml:"..." json:"-"` so json round-trip dropped
them; SaveConfigPreserveComments would persist a config with
cleared Host/Port/auth-dir/remote-management. Every mgmt edit was
silently zeroing critical config fields.
Fix: revert to yaml round-trip. Keep the error-return signature so
applyConfigChange still surfaces a 500 on marshal/unmarshal failure
(the original IMPORTANT #7 concern), instead of silently producing
an empty config.
BLOCKER #3 (round 2) — Async emitter close TOCTOU
internal/logging/async_emitter.go
Round 1's atomic.Bool was checked-then-acted-on without
synchronization. A forced enqueue could load closed=false, get
preempted, Close runs to completion (worker drains and exits), the
enqueue resumes and sends into priority — buffered with no consumer
left, dropped silently against the never-drop invariant.
Fix: replace atomic.Bool + atomic-load with a sync.Mutex guarding
enqueue's [check + send] and close's [flag transition + closeCh
signal]. Lock held only for the brief send window; cost ~10-30 ns
on the hot path.
IMPORTANT #5 (round 2 missed) — DeleteClaudeKey/Vertex/Codex no-match return 200
internal/api/handlers/management/config_lists.go
DeleteGeminiKey was correctly returning 404 when no api-key match
was found. The other three by-api-key delete handlers fell through
to sanitize + persist + 200 instead.
Fix: mirror DeleteGeminiKey — return 404 from inside applyConfigChange
when matchCount==0; the c.Writer.Written() check skips persist +
commit + 200.
IMPORTANT #6 (round 2 fresh) — Multi-Config() reads not snapshotted
sdk/api/handlers/handlers.go (ExecuteStreamWithAuthManager)
sdk/api/handlers/gemini/gemini-cli_handlers.go (CLIHandler)
sdk/api/handlers/openai/openai_images_handlers.go (ImagesGenerations, ImagesEdits)
Several handlers called h.Config() multiple times. With BaseAPIHandler.cfgPtr
behind atomic.Pointer, two reads in the same handler could observe
different hot-reload snapshots and combine fields from old and new
config.
Fix: capture cfg := h.Config() once at handler entry and pass that
snapshot through subsequent reads. Behavior preserved for any
request that sees a hot reload mid-flight: it consistently sees
pre-reload config for the duration of the handler.
Verification:
- go build ./... clean.
- All Phase A gating tests + B3 regression test pass under -race
(signature_cache_semantics, amp_race, amp_staleness, hot-reload,
SSE chunk-boundary, async_emitter forced-after-close).
- Pre-existing failures unchanged.
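A sketch of the round-2 close fix: a plain mutex makes enqueue's [check + send] and Close's [flag + signal] mutually exclusive, closing the TOCTOU window (names illustrative):

```go
package logging

import "sync"

type emitterSketch struct {
	mu       sync.Mutex
	closed   bool
	priority chan func()
	closeCh  chan struct{}
}

func (e *emitterSketch) enqueueForced(task func()) (syncFallback bool) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.closed {
		return true // worker is gone: caller must write synchronously
	}
	select {
	case e.priority <- task:
		return false
	default:
		return true // queue full: sync write preserves never-drop
	}
}

func (e *emitterSketch) Close() {
	e.mu.Lock()
	if !e.closed {
		e.closed = true
		close(e.closeCh) // worker drains both queues, then exits
	}
	e.mu.Unlock()
}
```

The lock is held only across the check and the non-blocking send, so the stated ~10-30 ns hot-path cost is the mutex handoff, not any I/O.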
* fix: address Codex Phase C round-3 review BLOCKER + IMPORTANT findings
Round-3 Codex review of commit 7ea1346f surfaced 4 NEW findings:
2 BLOCKERs (one architectural gap, one missed pattern in mgmt getters)
plus 2 IMPORTANTs (more delete-no-match-200 patterns, flush early-return).
BLOCKER R3-1 — Mgmt commits skip service-level fan-out
internal/api/server.go (+SetManagementCommitter)
sdk/cliproxy/service.go (override mgmt commit hook in Start)
Round 1 wired the mgmt commit hook to call Server.UpdateClients,
which only fans out to the api.Server's own state. The file watcher
reload path additionally calls coreManager.SetConfig,
SetOAuthModelAlias, applyRetryConfig, applyPprofConfig, selector
rebuild on strategy/affinity changes, and rebindExecutors. Mgmt
PUT/PATCH/DELETE was responding 200 but the change never reached
the executor layer until the file watcher fired (or never, in
embedded-no-watcher paths).
Fix: add Server.SetManagementCommitter that overrides the mgmt
handler's commit hook. In sdk/cliproxy/service.go's Start(), after
the reloadCallback closure is defined, route mgmt commits through
the SAME closure the file watcher uses, so both reload sources
trigger identical fan-out. Embedded api.Server users without a
Service still get the default UpdateClients hook (appropriate
scope for embedded callers with no coreManager).
BLOCKER R3-2 — Mgmt getters mix snapshots across multiple h.cfg() calls
internal/api/handlers/management/config_auth_index.go
geminiKeysWithAuthIndex / claudeKeysWithAuthIndex /
codexKeysWithAuthIndex / vertexCompatKeysWithAuthIndex /
openAICompatibilityWithAuthIndex held h.mu (a vestige from when
h.cfg was a mutable field) and called h.cfg() multiple times in
sequence: len(h.cfg().X), range h.cfg().X, h.cfg().X[i]. With
SetConfig now lock-free, a hot reload between those calls could
return mismatched slices and panic or return mixed data.
Fix: snapshot once (cfg := h.cfg()) at the top of each helper. Drop
the spurious h.mu — it never protected cfg-reads against SetConfig
swaps; it only serialised mgmt writes.
IMPORTANT R3-1 — More delete-no-match-200 patterns
internal/api/handlers/management/config_lists.go
Round-2 fixed the by-api-key form for Claude, Vertex, Codex, but
siblings still persisted/committed and returned 200 when nothing
matched. Affected handlers: by-api-key+base-url variants for
Gemini/Claude/Vertex/Codex, by-name + by-index for OpenAI-compat,
by-index for Gemini/Claude/Vertex/Codex, DeleteAmpUpstreamAPIKeys,
DeleteAmpModelMappings, deleteFromStringList helper (api-keys,
excluded models, etc.).
Fix: each handler tracks len-before / len-after (filter form) or
range-checks the index (idx form) and writes 404 + return when
nothing was actually removed. applyConfigChange's
c.Writer.Written() check skips persist + commit + 200.
IMPORTANT R3-2 — flush() returns before in-flight write completes
internal/logging/async_emitter.go
flush() polled only on channel lengths. Once the worker dequeued a
task, both channels were empty and flush returned, even though
writeLogRequest hadn't returned. Bench teardown could observe
flush==done while the worker was still writing into the temp dir,
racing TempDir cleanup.
Fix: add asyncEmitter.active atomic.Int64. execute() increments at
entry and decrements via defer at exit. flush() now waits for
len(priority)==0 && len(normal)==0 && active.Load()==0 — covers
both queued and in-flight tasks.
Verification:
- go build ./... clean.
- All Phase A gating tests + B3 + Async forced-after-close pass under
-race (cache, amp, server hot-reload, refresh backoff, release URL,
async emitter, mgmt key handlers).
- Pre-existing failures unchanged.
* fix: address Codex Phase C round-4 review BLOCKER + IMPORTANT findings
Round-4 Codex review of commit 9e079689 surfaced 5 findings:
2 BLOCKERs (mgmt commits skipped watcher auth refresh; more multi-cfg()
read patterns in auth_files.go) plus 3 IMPORTANTs (flush dequeue race,
PatchAPIKeys 200-on-bad-index, committer hook ordering + atomic).
BLOCKER R4-1 — Mgmt commits skipped watcher auth refresh
internal/watcher/dispatcher.go (export RefreshAuthState)
sdk/cliproxy/types.go (WatcherWrapper.RefreshAuthState)
sdk/cliproxy/watcher.go (wire refreshAuthState into the wrapper)
sdk/cliproxy/service.go (buildReloadCallback calls watcher.RefreshAuthState)
The watcher's reloadClients does TWO things: reloadCallback(cfg) and
refreshAuthState(force). Round-3 wired only reloadCallback into the
mgmt commit hook, so config-defined API key auths weren't
re-synthesized or dispatched until the file watcher noticed the
saved YAML. A DELETE/PUT of a Gemini/Claude/etc. key returned 200
while core auth state remained stale.
Fix: WatcherWrapper exposes RefreshAuthState. Service's
buildReloadCallback (now the shared closure used by both file
watcher and mgmt committer) calls watcher.RefreshAuthState(true)
after every config swap so auth state stays in lockstep with the
config swap.
BLOCKER R4-2 — More multi-cfg() reads in auth_files.go
internal/api/handlers/management/auth_files.go
Three helpers called h.cfg() multiple times in sequence:
managementCallbackURL (Port + TLS.Enable + Port), listAuthFilesFromDisk
(AuthDir twice — list and per-file join), DeleteAuthFile?all=true
(AuthDir twice — list and per-file join). Hot reload between calls
could mix Port from snapshot N with TLS from N+1, or list one
AuthDir then delete from a different one.
Fix: snapshot cfg := h.cfg() once at handler entry; use
snapshot-derived values for all subsequent reads.
IMPORTANT R4-1 — flush had dequeue-to-execute race
internal/logging/async_emitter.go
Round-3 added active counter incremented inside execute(). But the
worker dequeues from priority/normal channel BEFORE calling
execute. In that window, the channel had emptied (decremented by
receive) and active was still 0. flush() observed both 0 and
returned, even though the task hadn't yet entered execute.
Fix: rename active -> pending. Increment inside enqueue's
lock-protected channel send (so it's incremented before the worker
can possibly receive). Decrement in execute's defer. flush waits
for pending == 0; the lock-protected enqueue->send and the
deferred decrement enclose every code path where a task is in
flight.
IMPORTANT R4-2 — PatchAPIKeys index out-of-range returned 200
internal/api/handlers/management/config_lists.go
patchStringList's index path silently no-op'd when the index was
out of range, persist + commit + 200 — same pattern as the
delete-no-match-200 family fixed in round-3, just on the patch
helper.
Fix: write 404 + return from inside the closure when the index is
out of range. applyConfigChange's c.Writer.Written() check skips
persist/commit/200.
IMPORTANT R4-3 — Committer hook installed after server starts; not atomic
sdk/cliproxy/service.go (hoist reloadCallback + SetManagementCommitter
BEFORE server.Start)
internal/api/handlers/management/handler.go (commitPtr atomic.Pointer)
Two related issues: (a) the HTTP server began serving before
SetManagementCommitter ran, leaving a window where mgmt writes used
the default partial Server.UpdateClients hook. (b) the commit field
was a plain func without synchronization, racing applyConfigChange
reads.
Fix: extract the inline reloadCallback closure into Service.buildReloadCallback;
install it BEFORE the server.Start goroutine fires. Replace
Handler.commit with commitPtr atomic.Pointer[commitFn]; reads via
loadCommit() helper, writes via lock-free atomic store. Safe to
set at any lifecycle point.
Verification:
- go build ./... clean.
- All Phase A gating tests + B3 + Async forced-after-close pass under -race.
* fix: address Codex Phase C round-5 review BLOCKER + IMPORTANT (partial)
Round-5 Codex review of commit 94a9500c surfaced 4 findings (2 BLOCKERs +
2 IMPORTANTs). Partial fix: addresses 3 of 4 fully and 1 partially. Two
remaining helpers in oauth_callback.go and auth_files.go (OAuth flow long
goroutines) are deferred for a separate follow-up.
BLOCKER R5-1 — Watcher RefreshAuthState used stale config
sdk/cliproxy/service.go
Round-4 wired RefreshAuthState into the mgmt commit hook but the
watcher's cached config wasn't updated first. RefreshAuthState
synthesizes from w.config, so mgmt writes refreshed auths from the
OLD API-key snapshot until the file watcher fired.
Fix: call s.watcher.SetConfig(newCfg) BEFORE
s.watcher.RefreshAuthState(true) in buildReloadCallback.
BLOCKER R5-2 (partial) — Multi-cfg snapshot reads in mgmt helpers
internal/api/handlers/management/logs.go
internal/api/handlers/management/vertex_import.go
internal/api/handlers/management/api_tools.go
Three mgmt files had handlers that called h.cfg() multiple times in
sequence (logs.go: nil-check + LoggingToFile + h.logDirectory();
vertex_import.go: nil-check + AuthDir; api_tools.go: per-key proxy
lookup + global proxy fallback). All three now snapshot
cfg := h.cfg() once and pass through. logs.go's logDirectory helper
now takes *config.Config explicitly.
Remaining (deferred): oauth_callback.go and auth_files.go OAuth flow
handlers also exhibit this pattern but the fix requires plumbing cfg
snapshots through long-running goroutines that wait for callback
files. Deferred for a focused follow-up commit; behavior under the
current pattern remains the upstream-default for those flows.
IMPORTANT R5-1 — Async pending counter race after channel send
internal/logging/async_emitter.go
Round-4 incremented pending AFTER the channel send, leaving a
window where a worker could receive and enter execute() while
pending was still 0 — flush() observed pending==0 and returned
even though a write was in flight.
Fix: increment pending BEFORE the select{} send. On full-channel
paths roll back with pending.Add(-1). The increment + send happen
inside closeMu so close() observes pending consistently.
IMPORTANT R5-2 — PatchAmp{ModelMappings,UpstreamAPIKeys} 200 on no-op
internal/api/handlers/management/config_lists.go
Both endpoints accepted {} or empty/all-empty `value` payloads,
persisted unchanged config, and returned 200.
Fix: validate `value` has at least one usable entry before
applyConfigChange; return 400 with a descriptive error otherwise.
Verification:
- go build ./... clean.
- All Phase A gating tests + B3 + Async forced-after-close pass under
-race.
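The flush mechanism as it stands after rounds 4 and 5, sketched with illustrative names; the key ordering is increment-before-send and decrement-after-write:

```go
package logging

import (
	"sync"
	"sync/atomic"
	"time"
)

type flushSketch struct {
	closeMu sync.Mutex
	normal  chan func()
	pending atomic.Int64
}

func (e *flushSketch) enqueue(task func()) bool {
	e.closeMu.Lock()
	defer e.closeMu.Unlock()
	e.pending.Add(1) // increment BEFORE the send: the worker can only
	select {         // observe the task after pending is already > 0
	case e.normal <- task:
		return true
	default:
		e.pending.Add(-1) // roll back on the full-channel drop path
		return false
	}
}

func (e *flushSketch) execute(task func()) {
	defer e.pending.Add(-1) // decrement only after the write finished
	task()
}

func (e *flushSketch) flush() {
	for e.pending.Load() != 0 { // covers queued AND in-flight tasks
		time.Sleep(time.Millisecond)
	}
}
```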
* docs(bench): document Phase C exit punch list
Five Codex CLI gpt-5.5 review rounds were run against the BE diff
refactor/upstream-bound vs upstream/dev. Each round closed every
BLOCKER/IMPORTANT it flagged at that round (rounds 1-4 fully, round 5
3.5/4). The loop didn't fully converge — Codex kept finding deeper
pre-existing race conditions and API contract bugs that surfaced once
the perf refactor exposed them — but each round's count was dropping
(9 → 5 → 4 → 5 → 4) and the original Phase C goals (atomic AMP,
async logging, atomic Server config, clone-modify-persist-swap mgmt,
LRU cache, exponential refresh backoff, ReleaseURLProvider seam) are
all delivered, tested under -race, and meeting plan bench targets.
Per user direction, Phase C exits with this punch list rather than
continuing the review loop indefinitely. The deferred item is OAuth
flow goroutine multi-cfg() reads (low real-world risk: OAuth login
is operator-driven and rarely overlaps with hot-reload). Pattern is
established; fix is mechanical when convenient.
Adds:
bench/phase-c/PUNCH_LIST.md — full deferred-items + future-watch
list, plus a roll-up of every round-1-through-round-5 finding
that IS fixed, plus Codex session IDs for traceability.
bench/phase-c/SUMMARY.md — points at PUNCH_LIST.md and notes
Phase D is next.
* perf(logging): switch async emitter normal queue to drop-oldest
Codex Phase C round-6 review NIT #4: plan §Behavior Contract said
"may drop oldest non-priority entries on queue overflow" but the
implementation dropped newest. Under the existing closeMu critical
section, do a non-blocking receive to discard the oldest then send
the new task. Loop bound is at most 2 iterations; closeMu excludes
other producers so the second send is guaranteed to succeed because
workers only consume the channel.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
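A sketch of the drop-oldest policy (illustrative struct; the real emitter carries log tasks, not bare funcs):

```go
package logging

import (
	"sync"
	"sync/atomic"
)

type dropOldestSketch struct {
	closeMu sync.Mutex
	normal  chan func()
	dropped atomic.Int64
}

// enqueueNormal: under closeMu (which excludes other producers), try to
// send; on overflow, discard the oldest queued task and retry. Workers
// only drain the channel, so the retry succeeds within two iterations.
func (e *dropOldestSketch) enqueueNormal(task func()) {
	e.closeMu.Lock()
	defer e.closeMu.Unlock()
	for {
		select {
		case e.normal <- task:
			return
		default:
		}
		select {
		case <-e.normal: // evict the oldest entry
			e.dropped.Add(1)
		default:
		}
	}
}
```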
* fix(api): flush async request logger on Server.Stop
Codex Phase C round-6 review IMPORTANT #2: Server.Stop shut down the
HTTP server but never drained the async request logger. Plan §Phase C
mandates "Flush/Close on graceful shutdown" so queued normal logs
write to disk and forced-error logs are flushed before return. Type-
assert the requestLogger interface to interface{ Close() } and call
it after s.server.Shutdown succeeds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf(cache): bound outer signature cache group map at MaxGroupCount
Codex Phase C round-6 review IMPORTANT #3: the inner per-group LRU was
bounded at SignatureGroupCapacity (10K) but the outer sync.Map of
groups was unbounded. GetModelGroup falls back to the raw model name
for unknown providers, so a workload with many unique unknown model
names could spike memory until the 10-minute purge ran.
Cap the group count at MaxGroupCount (64) with LRU-by-write-time
eviction. Eviction runs on the cold path only (new-group inserts at
the cap) under groupEvictMu, so concurrent inserts evict at most one
more group than necessary.
The read path stays sync.Map.Load fast-path: lastAccess is updated on
writes only, not reads. Updating lastAccess on every read added per-
call atomic.Int64 contention that pushed Get_Hit_Parallel past the
±5% bench target. The LRU policy degrades to "least recently written
group" — fine because production model surfaces stay well under the
64-group cap and eviction effectively never fires.
Add TestSignatureCache_OuterMapBoundedByMaxGroupCount to pin the cap
and the LRU-by-write eviction order.
Bench (Phase C exit -> round 6):
Get_Hit: 219.5 ns -> 219.2 ns (-0.1%)
Get_Hit_Parallel: 315.3 ns -> 264.9 ns (-16%, faster)
Set: 592.9 ns -> 541.8 ns (-8.6%, faster)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
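A sketch of the cold-path-only cap; this also folds in the later Stage 1 exit hardening (insert and eviction serialized under groupEvictMu, lastWrite seeded at construction). Names are illustrative:

```go
package cache

import (
	"sync"
	"sync/atomic"
)

const maxGroupCount = 64

type groupSketch struct {
	lastWrite atomic.Int64 // unix nanos of the last Set into this group
}

var (
	groups       sync.Map // model group name -> *groupSketch
	groupCount   atomic.Int64
	groupEvictMu sync.Mutex
)

func getOrCreateGroup(name string, now int64) *groupSketch {
	if g, ok := groups.Load(name); ok {
		return g.(*groupSketch) // hot path: one lock-free Load
	}
	groupEvictMu.Lock() // cold path: serialize insert + eviction
	defer groupEvictMu.Unlock()
	if g, ok := groups.Load(name); ok {
		return g.(*groupSketch) // another insert won the race
	}
	if groupCount.Load() >= maxGroupCount {
		evictOldestGroup()
	}
	g := &groupSketch{}
	g.lastWrite.Store(now) // seed so a fresh group is not the LRU target
	groups.Store(name, g)
	groupCount.Add(1)
	return g
}

// evictOldestGroup drops the least-recently-WRITTEN group; reads never
// touch lastWrite, keeping the Get fast path contention-free.
func evictOldestGroup() {
	var oldestKey any
	oldest := int64(1<<63 - 1)
	groups.Range(func(k, v any) bool {
		if w := v.(*groupSketch).lastWrite.Load(); w < oldest {
			oldest, oldestKey = w, k
		}
		return true
	})
	if oldestKey != nil {
		groups.Delete(oldestKey)
		groupCount.Add(-1)
	}
}
```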
* fix(cliproxy): snapshot config in goroutine-callable helpers
Codex Phase C round-6 review BLOCKER #1: buildReloadCallback writes
s.cfg under cfgMu.Lock, but ensureExecutorsForAuthWithMode (376),
registerModelsForAuth (851), resolveConfig*Key (1118-1232), and
oauthExcludedModels (1258) all read s.cfg directly without taking
cfgMu.RLock. Phase C #23 (commit 662672c1) routed mgmt commits
through buildReloadCallback so this race fires per-mgmt-PUT, not
just on the rare file-watcher reload path.
Add Service.configSnapshot() accessor that returns the cfg under
cfgMu.RLock. Have ensureExecutorsForAuthWithMode, registerModelsForAuth,
and oauthExcludedModels snapshot once at function entry and pass cfg
to nested helpers. Convert resolveConfigClaudeKey, resolveConfigGeminiKey,
resolveConfigVertexCompatKey, and resolveConfigCodexKey from methods on
Service to package-level functions taking cfg *config.Config so the
caller's per-call snapshot covers the full resolution.
Add TestService_ConfigSnapshot_RaceFree to pin the invariant under
go test -race.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
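A sketch of the snapshot-at-entry discipline (Config and the method body are stand-ins):

```go
package cliproxy

import "sync"

// Config stands in for config.Config in this sketch.
type Config struct{ AuthDir string }

type serviceSketch struct {
	cfgMu sync.RWMutex
	cfg   *Config
}

// configSnapshot returns the config under cfgMu.RLock so it cannot be
// observed mid-swap by buildReloadCallback's cfgMu.Lock write.
func (s *serviceSketch) configSnapshot() *Config {
	s.cfgMu.RLock()
	defer s.cfgMu.RUnlock()
	return s.cfg
}

// Callers snapshot once at entry and pass cfg down, so every read
// within one operation observes the same config across a hot reload.
func (s *serviceSketch) registerModels() {
	cfg := s.configSnapshot()
	_ = cfg.AuthDir // every nested helper receives this same cfg
}
```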
* test(api): add mgmt-handlers hot-reload race test
Codex Phase C round-6 review NIT #5: the existing
TestServer_ConfigHotReload_NoRaceUnderConcurrentReads only exercised
Server.Config() reads, /management.html serving, and
Server.UpdateClients writes — it did not drive real mgmt handlers
through applyConfigChange + Handler.cfgPtr swap + commit hook.
Add TestServer_MgmtHandlers_HotReloadRace driving PutDebug,
PutAmpUpstreamURL, and PutAmpModelMappings directly against the
Handler concurrently with Server.Config() readers. Each handler enters
applyConfigChange — clones the snapshot, mutates, persists, atomic-
swaps Handler.cfgPtr, fires the commit hook (which does
Server.UpdateClients fan-out and re-binds executors). The test bypasses
the secret-key middleware (the middleware is not under test; the swap
chain is) and seeds a minimal config.yaml so persist succeeds and the
swap fires.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(bench): record Phase C round-6 closure
Mark all 5 round-6 findings (1 BLOCKER, 2 IMPORTANT, 2 NIT) as fully
fixed and verified under -race. The only remaining deferred punch-list
item is the OAuth-flow goroutine multi-cfg snapshot read pattern (low
real-world risk, mechanical fix when convenient). Round 6 session ID
recorded.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(executor): extract antigravity payload-builder; add round-1 review tests
Phase D code quality (plan §"Stage 1 — Phase D: targeted code quality"):
extract the schema-sanitization + model-transform branch from
buildRequest into antigravityPayloadBuilder. The original buildRequest
had 5 nested conditionals plus duplicated claude-vs-gemini logic in
both branches; the builder dedups the dialect choice and the final
transform into applyModelTransformString / applyModelTransformBytes
(byte and string variants share applyModelTransform semantics).
No behavior change: the 4 existing TestAntigravityBuildRequest_*
tests pass under -race. New TestAntigravityPayloadBuilder_AllQuadrants
table-driven test covers {sanitized, plain} x {Claude, Gemini} with
byte-equivalence assertions on the wire body and the requestLog
copy. TestAntigravityPayloadBuilder_RequestLogOff_NoLogBytes pins the
no-log-allocation invariant when cfg.RequestLog is false.
TestAntigravityPayloadBuilder_UseAntigravitySchema covers the dialect
dispatch (claude + gemini-3-pro family vs other gemini).
Codex Phase D round-1 review IMPORTANT #1 addressed by the new tests
gating all 4 quadrants of the extracted branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(stage1-exit): address Codex Stage 1 exit review BE findings
Codex Stage 1 exit full-diff review (refactor/upstream-bound vs
upstream/dev) returned 4 IMPORTANT + 1 NIT on the BE side. All 5 are
now closed.
BE-1 (IMPORTANT) — request_logger ownership clone:
internal/logging/request_logger.go LogRequest now defensively clones
caller-owned []byte / map[string][]string / *interfaces.ErrorMessage
slices before enqueueing into the async emitter. The sync→async
switch in Phase C silently changed the public ownership contract
(FileRequestLogger is re-exported via sdk/logging); cloning preserves
the call-time snapshot so a caller reusing buffers after LogRequest
returns cannot corrupt the eventual log write. Trade-off: ~1 µs
added per call vs the prior 35 ns optimum (still ~230× faster than
the pre-Phase-C synchronous baseline). Sync fallback path keeps the
no-buffer optimum. New TestAsyncEmitter_CallerMutateAfterEnqueue_
DoesNotCorruptLog gates the invariant.
BE-2 (IMPORTANT) — signature cache outer-map cap race:
internal/cache/signature_cache.go getOrCreateGroupCache now
serialises the cold-miss insert + LRU eviction under groupEvictMu so
the cap holds exactly under concurrent first-time inserts. New
groups initialise lastAccess at construction (newGroupCache(cap, now))
so a fresh group is not the LRU eviction target before its first
set/get fires. New TestSignatureCache_OuterMapCap_ParallelInserts
pins the invariant under -race.
BE-3 (IMPORTANT) — RefreshAuthState force flag:
sdk/cliproxy/service.go buildReloadCallback now passes false to
watcher.RefreshAuthState. The previous force=true fired modify
events for every unchanged auth on debug/request-log mgmt writes,
causing spurious applyCoreAuthAddOrUpdate / model registration /
scheduler refresh / hook fan-out. With force=false, the watcher's
diffing logic only emits add/modify/delete events when an auth's
synthesized state actually changed. The file-watcher reload path's
own forceAuthRefresh flag (computed from cfg.ForceModelPrefix /
OAuthModelAlias / retry-config diffs) is preserved unchanged.
BE-4 (IMPORTANT) — mgmt test 200 → 404 contract:
test/amp_management_test.go TestDeleteAmpModelMappings_NonExistent
was still asserting status 200 for delete-non-existent. Updated to
assert StatusNotFound, matching the Phase C round 3-4 contract
change ("Delete-no-match returns 404 explicit not-found, not silent
no-op 200"). Comment expanded.
BE-NIT — bench artifact whitespace:
bench/baseline/backend.txt and bench/phase-c/backend.txt had
trailing whitespace on three lines and a blank line at EOF, both
flagged by `git diff --check`. Trimmed.
Validation:
- go build ./...
- go vet ./...
- go test -race ./internal/cache ./internal/logging
./internal/api/modules/amp ./sdk/cliproxy ./test (all pass)
- bench: signature cache Get_Hit_Parallel 269.5 ns/op (vs Phase C
exit 315.3, -15%); LogRequest ~1 µs/op (regression vs Phase C
exit 35 ns, accepted for clone-correctness)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
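A sketch of the defensive-clone helpers implied by BE-1 (names are illustrative; the real code clones at the LogRequest boundary, before enqueueing into the async emitter):

```go
package logging

// cloneBytes snapshots a caller-owned buffer so post-enqueue reuse
// cannot corrupt the eventual log write.
func cloneBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

// cloneHeaderMap deep-copies a map[string][]string (header shape),
// including each value slice.
func cloneHeaderMap(m map[string][]string) map[string][]string {
	if m == nil {
		return nil
	}
	out := make(map[string][]string, len(m))
	for k, vs := range m {
		cp := make([]string, len(vs))
		copy(cp, vs)
		out[k] = cp
	}
	return out
}
```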
* fix(stage1-exit): address Codex Stage 1 exit round 2 BE findings
Codex Stage 1 exit round 2 review returned 2 IMPORTANT on the BE side.
Both are now closed.
BE-R2-1 (IMPORTANT) — deep-clone ErrorMessage internals:
internal/logging/request_logger.go cloneErrorMessages previously
shallow-copied *interfaces.ErrorMessage; the inner Error and Addon
http.Header remained shared with the caller. The async writer later
calls Error.Error() and reads Addon, so a caller mutating those
after LogRequest could corrupt or race the eventual log. Snapshot
now: copy StatusCode by value, freeze the inner Error to its current
text via errors.New(e.Error.Error()), and deep-clone the Addon
header map. Extended
TestAsyncEmitter_CallerMutateAfterEnqueue_DoesNotCorruptLog to also
pass apiResponseErrors with a mutable
inner error and assert post-enqueue mutation does not leak to the
log file.
BE-R2-2 (IMPORTANT) — mgmt-path force predicate parity:
My round 1 BE-3 fix passed force=false unconditionally to
watcher.RefreshAuthState in service.go buildReloadCallback. That
closed the spurious-modify-event path for unrelated mgmt writes
(PutDebug, PutRequestLog) but dropped the file-watcher path's
forceAuthRefresh semantic for edits that change ForceModelPrefix /
OAuthModelAlias / retry-config — fields that affect model
registration but are NOT part of the synthesized auth state. Mgmt
edits to those fields update s.cfg + coreManager but no auth modify
event fired, so the global model registry could stay stale.
Now buildReloadCallback snapshots oldCfg under cfgMu before the
swap and computes the same forceAuthRefresh predicate as
internal/watcher/config_reload.go (forceAuthRefresh = oldCfg !=
nil && (ForceModelPrefix change || !DeepEqual(OAuthModelAlias) ||
retryConfigChanged)), then passes that bool to RefreshAuthState.
The file-watcher reload path's redundant call still observes
currentAuths==newState afterwards, so its own forceAuthRefresh
computation either re-applies the same force (no-op if already
emitted) or skips. PutDebug / PutRequestLog stay force=false.
Validation:
- go build ./..., go vet ./...
- go test -race ./internal/cache ./internal/logging ./sdk/cliproxy
./test (all pass)
- Extended async_emitter_test now covers ErrorMessage / Addon
mutation guard
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(stage1-exit): close round-3 BE IMPORTANT; document Stage 1 NITs
Closes Codex Stage 1 exit round 3 BE IMPORTANT BE-R3-1 and accepts
the round-3 NITs (per user decision: option 2 — fix IMPORTANTs only,
defer NITs to PUNCH_LIST).
BE-R3-1 (IMPORTANT) — buildReloadCallback now serialised end-to-end:
Added Service.reloadMu sync.Mutex and lock the entire body of the
closure returned from buildReloadCallback. Previously only the
cfgMu.RLock window protected oldCfg capture; the rest of the
callback (server update, s.cfg swap, coreManager updates, executor
rebind, watcher SetConfig + RefreshAuthState) ran lock-free, so a
file-watcher callback overlapping a mgmt commit could:
(a) compute a stale forceAuthRefresh predicate (oldCfg captured
from a config the OTHER callback already overwrote)
(b) overwrite s.cfg out-of-order so a newer callback's snapshot
is followed by an older callback's swap
reloadMu eliminates both.
PUNCH_LIST documentation:
- Records that 6 Phase C rounds + 3 Stage 1 exit rounds all reached
"no BLOCKER, no IMPORTANT" state on the closed surface.
- Documents the 4 round-3 NITs as deferred long-tail work with
fix shapes and effort estimates:
* BE-R3-2: cloneErrorMessages Addon clone needs unit-level test
* FE-R3-2: virtualised path loses inter-card gap
* FE-R3-3: isAbortError should move to src/utils/errors.ts
* FE-R3-4: virtualisation threshold of 50 is unverified
Validation:
- go build ./..., go vet ./...
- go test -race ./internal/cache ./internal/logging ./sdk/cliproxy
./test (all pass)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: remove BE bench artifacts and baseline-backend.sh
Phase A bench scaffolding has served its gating purpose for Phase C
hot-spot work. The committed bench/baseline/backend.txt and
bench/phase-c/* artifacts no longer drive any active workflow:
- bench/baseline/backend{.txt,-meta.json}: pre-Phase-C baseline
for the Phase A→B→C gating compare. Phase C exit benches were
already documented in bench/phase-c/{SUMMARY,PUNCH_LIST}.md.
- bench/phase-c/{SUMMARY.md, PUNCH_LIST.md, backend.txt}: Phase C
exit summary + the rolling Codex review punch-list. The
PUNCH_LIST itself is now historical: 6 Phase C rounds + 3 Stage 1
exit rounds all reached "no BLOCKER, no IMPORTANT" closure on
the surface they covered, and the deferred items captured there
can live in commit history / the next round's review notes.
- scripts/baseline-backend.sh: bench harness that wrote into
bench/baseline/backend.txt. Without the bench tree there is
nothing for it to write.
Removing these does not affect runtime or test behavior — no Go
code references the bench dir, and `go build ./...` stays clean.
The FE side keeps bench/baseline/frontend.json + scripts/baseline-
{frontend.sh,update.cjs}: those are actively used via
`npm run baseline:{capture,update}` after the round-3 freshness-
check fix and continue to gate FE bundle/perf regression checks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* stage2: apply fork-only backend integration
- extract PromptRule into its own file to keep the fork boundary small
- wire the fork release URL provider through the Stage 1 seam
- adapt prompt-rules writes to the atomic config persistence flow
- add fork-boundary governance docs and a local guard hook
* fix(management): return live config snapshot and stabilize auth manager access
- return the current config snapshot from GetConfig instead of a zero-value allocation
- snapshot the auth manager once per multi-step auth-file operation so a hot-reload swap cannot split lookup and update across different manager instances
* docs: centralize assistant guidance
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex pre-release re-review fix-ups. Codex re-review verdict on
feat/codex-rebrand-fixups was: 0 BLOCKERs from this fork (the 2 BLOCKERs Codex flagged are pre-existing upstream test failures, verified to also fail on clean upstream/dev), 0 outstanding MAJORs (the docker-run config concern is the standard compose-mount deployment, not a code issue), and 1 MINOR addressed here (a stale comment). See plan at /home/ubuntu/.claude/plans/we-are-in-a-nested-emerson.md.
🤖 Generated with Claude Code