
chore(fork): rebrand to Z-M-Huang/zhironghuang namespace #3

Merged
Z-M-Huang merged 2 commits into dev from feat/rebrand-fork on May 2, 2026
Conversation

@Z-M-Huang
Owner

Phase 4 of the soft-fork plan. Codex-gated (1 BLOCKER fixed: frontend release workflow retagged to zmh-v*; 2 MINORs fixed: docker-build.ps1 namespacing + fork-notice logging-v2 mention).

See /home/ubuntu/.claude/plans/we-are-in-a-nested-emerson.md.

🤖 Generated with Claude Code

Z-M-Huang and others added 2 commits May 2, 2026 01:31
Backend localization for the soft-fork strategy. All edits stay confined
to a small file set so future `git merge upstream/dev` runs only conflict
in expected places.

Image/registry:
- docker-compose.yml: default image is now zhironghuang/cli-proxy-api:latest
- docker-build.sh: local build tag is now zhironghuang/cli-proxy-api:local
- Dockerfile: COPY static/management.html into the image so a freshly-
  pulled container serves the UI on first request, before the auto-
  updater's first 3-hour tick. Sets MANAGEMENT_STATIC_PATH at the same
  spot to point at the baked location.

Frontend self-updater retargeting:
- internal/config/config.go: DefaultPanelGitHubRepository now points at
  Z-M-Huang/Cli-Proxy-API-Management-Center. Without this, our default
  silently leaks fresh installs back to upstream's UI.
- config.example.yaml: panel-github-repository hardcodes the same fork
  URL — example-derived configs would otherwise override the in-binary
  default and auto-update from upstream.
- internal/managementasset/updater.go: defaultManagementReleaseURL now
  points at the fork's releases/latest. The upstream-operated fallback
  page the updater relied on (https://cpamc.router-for.me/) is removed entirely
  — defaultManagementFallbackURL constant, ensureFallbackManagementHTML
  function, and both call sites are gone. New runtime semantics: if the
  GitHub fetch fails, leave any existing local asset in place; if no
  local asset, /management.html 404s until the next tick succeeds. The
  Dockerfile baking step guarantees fresh containers always have a
  usable local asset.

Workflows / GitHub-side metadata:
- .github/workflows/docker-image.yml: deleted. We push from local with
  buildx; we deliberately don't set DOCKERHUB_USERNAME/DOCKERHUB_TOKEN
  as GitHub secrets, so the workflow would otherwise fail on every tag
  push.
- .github/workflows/release.yaml: deleted. Same tag-trigger reason; we
  don't ship binaries from the fork, only docker images.
- .github/FUNDING.yml: deleted. Don't carry upstream's funding pointers.

READMEs: fork-notice block at the top of README.md / _CN.md / _JA.md.

Phases per /home/ubuntu/.claude/plans/we-are-in-a-nested-emerson.md.
Codex review gated this rebrand diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docker-build.ps1: namespace the local image tag to
  zhironghuang/cli-proxy-api:local so the PowerShell source-build path
  matches docker-build.sh.
- README / README_CN / README_JA: fork-notice now also calls out that
  revived logging support is planned for v0.2.0 (the original feat/logging
  was deferred during the soft-fork bootstrap because upstream removed
  the surrounding plugin/usage surface).

Codex rebrand-review verdict: 1 BLOCKER addressed in the frontend
release workflow, 2 MINORs (this commit + the docker-build.ps1 nit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Z-M-Huang merged commit 0482711 into dev on May 2, 2026
3 checks passed
Z-M-Huang deleted the feat/rebrand-fork branch May 2, 2026 11:22
Z-M-Huang added a commit that referenced this pull request May 4, 2026
* test(bench): add Phase A benchmark harness for logging, cache, amp

Stage 1 Phase A backend baseline. Three new *_bench_test.go files
exercising the hot paths Phase C will refactor:

- internal/cache/signature_cache_bench_test.go — Get hit/miss, set,
  Gemini sentinel miss, parallel hit, mixed read-heavy parallel
- internal/logging/request_logger_bench_test.go — synchronous LogRequest
  on the hot path, plus disabled fast-path for the toggle
- internal/api/modules/amp/amp_bench_test.go — ForceModelMappings read
  (single + parallel), restrict-read parallel; exposes 3-RWMutex
  contention before the Phase C atomic snapshot refactor

scripts/baseline-backend.sh runs all three packages with -benchmem and
writes raw output to bench/baseline/backend.txt + capture metadata to
bench/baseline/backend-meta.json so before/after comparisons just diff
the text file.
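
For orientation, a minimal self-contained sketch of the single-thread vs
parallel benchmark shape these files use (sigCache here is a throwaway stub,
not the real signature cache; the real benchmarks exercise the actual cache,
logger, and AMP hot paths):

```go
package cache

import (
    "sync"
    "testing"
)

var sigCache sync.Map // stand-in stub for the real signature cache

func BenchmarkGetHit(b *testing.B) {
    sigCache.Store("key", "sig")
    b.ReportAllocs()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        if _, ok := sigCache.Load("key"); !ok {
            b.Fatal("expected cache hit")
        }
    }
}

func BenchmarkGetHitParallel(b *testing.B) {
    sigCache.Store("key", "sig")
    b.ReportAllocs()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            sigCache.Load("key")
        }
    })
}
```

The script wraps invocations like `go test -bench=. -benchmem` so
allocation counts land in the baseline alongside ns/op.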

Initial baseline highlights (AMD Ryzen 9 5950X, Go 1.26):
- Logging LogRequest sync: 236258 ns/op, 42KB/op, 103 allocs/op
- AMP ForceModelMappings: 3.7 ns/op single -> 27.85 ns/op parallel
  (7.4x slowdown under contention — primary Phase C target)
- Cache Get_Hit: 242 ns/op single -> 365 ns/op parallel

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(cache): pre-write semantics tests + injectable clock for Phase C

Stage 1 Phase A safety net for the Phase C LRU swap. Adds an
injectable 'clock' package var (defaults to time.Now in production)
so TTL-passage tests don't sleep, and a new semantics test file
covering the four guarantees the LRU swap must preserve:

  1. Sliding TTL on read (refresh-on-access). Verified by writing at
     t=0, reading at t=2h to refresh, then reading at t=4h: hit
     (2h < 3h TTL gap from refresh) instead of miss.
  2. Gemini miss sentinel ("skip_thought_signature_validator")
     returned for empty / miss / expired keys when the model group
     resolves to "gemini". Other groups return empty.
  3. ClearSignatureCache(modelName) clears only the matching group,
     leaves the others intact.
  4. ClearSignatureCache("") full clear of every group.

Production behavior unchanged. The clock var is unexported; only
in-package tests can override it. Phase C swap is gated by these
tests passing.
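
A minimal sketch of the injectable-clock + sliding-TTL-on-read pattern; the
types here are illustrative and only the `clock` package var mirrors this
commit. A test overrides it with something like
`clock = func() time.Time { return start.Add(4 * time.Hour) }` instead of
sleeping:

```go
package cache

import "time"

// clock defaults to time.Now; in-package tests reassign it to simulate
// hours of TTL passage without sleeping.
var clock = time.Now

type entry struct {
    value     string
    updatedAt time.Time
}

type groupCache struct {
    ttl     time.Duration
    entries map[string]entry
}

// get implements the sliding-TTL-on-read guarantee: an unexpired hit
// refreshes the entry's timestamp so the TTL window restarts from the read.
func (g *groupCache) get(key string) (string, bool) {
    e, ok := g.entries[key]
    if !ok || clock().Sub(e.updatedAt) > g.ttl {
        return "", false
    }
    e.updatedAt = clock() // refresh-on-access
    g.entries[key] = e
    return e.value, true
}
```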

Existing tests + race detector still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(amp): pre-write race + captured-mapper-staleness tests

Stage 1 Phase A safety net for the Phase C atomic snapshot refactor of
internal/api/modules/amp/amp.go.

Race tests (under go test -race):
  - ForceModelMappings_ToggleUnderConcurrentReads
  - RestrictToLocalhost_ToggleUnderConcurrentReads
  - ModelMappings_UpdateUnderConcurrentReads
  - LastConfig_OnConfigUpdated_Concurrent (full mix)
  - ParallelObservedBothStates (sanity that the parallel path actually
    flips, so we know the contention path is exercised)

Staleness tests (the captured-pointer correctness invariant):
  - CapturedMapperSeesUpdates — exact mappings
  - CapturedMapperSeesRegexUpdates — regex mappings

The staleness tests pin the closure-capture concern Codex flagged in
the second-pass review: routes.go and fallback_handlers.go capture
m.modelMapper at registration, and the current code holds because
UpdateMappings mutates in place (pointer stays stable). After Phase C,
if the snapshot refactor swaps modelMapper pointers along with the
routing table, captured handler closures go stale. These tests catch
that regression and gate the refactor.

Mapper internals (GetMappings, mapper.regexps) are read directly
rather than via MapModel because MapModel filters via
util.GetProviderName which depends on per-process provider state not
set up here. The staleness invariant is purely about the mapper's
internal table reflecting updates.

go test -race ./internal/api/modules/amp passes cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(executor): pre-write SSE chunk-boundary tests for codex_executor

Stage 1 Phase A: pin the streaming-pipeline correctness invariants the
Phase C streaming/logging refactors must preserve.

Two scenarios:

  1. ChunkBoundary_MidLineSplitFlushes — server sends a single SSE event
     in four mid-line splits with explicit Flush() between them. The
     executor's bufio.Scanner must reassemble the line and emit the same
     downstream chunks as the same event sent in one write. Compares
     fragmented run vs single-write reference run for byte-identical
     downstream output.

  2. ChunkBoundary_MultipleEventsSingleWrite — server writes multiple
     complete SSE events in one upstream write. The executor must emit
     a non-empty chunk batch and at least one chunk must carry the
     completion marker. Catches collapse-events-to-one regressions.

Note: two pre-existing tests in this package
(TestEnsureAccessToken_WarmTokenLoadsCreditsHint,
TestUpdateAntigravityCreditsBalance_LoadCodeAssistUserAgent) fail
against upstream/dev independently of these changes — they make real
requests to daily-cloudcode-pa.googleapis.com and the fixture host
resolution fails in this sandbox. Not introduced by this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(api): pre-write hot-reload race test for Phase C atomic.Pointer

Stage 1 Phase A safety net for the Phase C atomic.Pointer[Config] +
clone-modify-persist-swap refactor of internal/api/Server.

Verified the race exists today: with t.Skip removed, go test -race
./internal/api fires the race detector on s.cfg field reads racing
against s.cfg = newCfg in UpdateClients. The test body uses two
reader probes that mirror real cfg-reading code paths today:

  1. Direct field reads (s.cfg.AuthDir, .Debug, .LoggingToFile),
     mirroring patterns at server.go:408,422,436,450,672,814.
  2. /management.html serving (server.go:671 reads s.cfg directly).

Plus a writer goroutine calling UpdateClients with rotating Config
values to drive the swap. Race trips reproducibly within ~80 ms.

Test currently t.Skip()'d so -race CI stays green during Stage 1
Phase B/C/D. Phase C must:
  - Move config ownership into Server behind atomic.Pointer.
  - Provide an atomic accessor used by all readers (mgmt handlers,
    OAuth callbacks, request handlers, /management.html).
  - Apply clone-modify-persist-swap in mgmt handler write paths
    (PutDebug, PutAmpUpstreamURL, PutAmpModelMappings, etc.).
  - Delete the t.Skip line and validate the test passes -race.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(cache): replace unbounded sync.Map with bounded LRU + sliding TTL

Stage 1 Phase C item — fix internal/cache/signature_cache.go's
unbounded growth.

Per-group cache is now a doubly-linked-list LRU (capacity 10,000) with
sliding TTL preserved on read, gated by the Phase A semantic tests:
  - sliding TTL on read (timestamp refresh + move-to-front)
  - Gemini miss sentinel (empty/miss/expired)
  - ClearSignatureCache(modelName) group-clear
  - ClearSignatureCache("") full clear

Periodic background purge still runs every 10 minutes to reclaim
groups that nobody reads.
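
A sketch of the bounded per-group LRU core using container/list; identifiers
are illustrative, and the real implementation adds locking, the Gemini
sentinel, and the group map around this:

```go
package cache

import (
    "container/list"
    "time"
)

type lruEntry struct {
    key       string
    value     string
    updatedAt time.Time
}

type groupLRU struct {
    capacity int
    ttl      time.Duration
    order    *list.List               // front = most recently used
    index    map[string]*list.Element // key -> node in order
}

func newGroupLRU(capacity int, ttl time.Duration) *groupLRU {
    return &groupLRU{
        capacity: capacity,
        ttl:      ttl,
        order:    list.New(),
        index:    make(map[string]*list.Element),
    }
}

// set inserts or refreshes a key; when the group is full it evicts the
// least-recently-used tail in O(1), the fixed cost behind the Set speedup.
func (g *groupLRU) set(key, value string, now time.Time) {
    if el, ok := g.index[key]; ok {
        e := el.Value.(*lruEntry)
        e.value, e.updatedAt = value, now
        g.order.MoveToFront(el)
        return
    }
    if g.order.Len() >= g.capacity {
        tail := g.order.Back()
        g.order.Remove(tail)
        delete(g.index, tail.Value.(*lruEntry).key)
    }
    g.index[key] = g.order.PushFront(&lruEntry{key: key, value: value, updatedAt: now})
}

// get applies the sliding TTL: a hit refreshes the timestamp and moves the
// node to the front; an expired node reads as a miss.
func (g *groupLRU) get(key string, now time.Time) (string, bool) {
    el, ok := g.index[key]
    if !ok {
        return "", false
    }
    e := el.Value.(*lruEntry)
    if now.Sub(e.updatedAt) > g.ttl {
        return "", false
    }
    e.updatedAt = now
    g.order.MoveToFront(el)
    return e.value, true
}
```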

Bench delta vs Phase A baseline (AMD Ryzen 9 5950X, Go 1.26):
  Get_Hit                     242.5 →  226.5 ns/op  (-6.6%)
  Get_Miss                    18.25 →  18.62 ns/op  (noise)
  Get_GeminiSentinelMiss      23.73 →  24.40 ns/op  (noise)
  Set                         1222  →  569.2 ns/op  (-53%, biggest win)
  Get_Hit_Parallel            365.3 →  325.1 ns/op  (-11%)
  Mixed_ReadHeavy_Parallel    484.2 →  464.2 ns/op  (-4%)

Set is much faster because the previous map-grew-without-bound path
paid a delete cost on each set when groups got large; the LRU's O(1)
tail eviction replaces that.

Allocations: Set 4 → 5 allocs/op (LRU node is now a heap-allocated
struct vs the previous value-type SignatureEntry in the map). +24
bytes/op. Acceptable given the perf gain and the bounded-memory fix.

go test -race ./internal/cache passes. The Phase A
signature_cache_semantics_test.go file (sliding TTL, Gemini sentinel,
group clear, full clear) and the existing happy-path tests all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(logging): async request logger with priority lane for forced errors

Stage 1 Phase C item — take file-I/O off the request path.

Adds internal/logging/async_emitter.go: a single-worker pipeline with
two channels — a normal-priority queue (drop-newest on overflow) and a
priority queue for forced-error logs that NEVER drops. The worker
drains priority first, then normal. Close() signals stop and drains
both queues before the worker exits.
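
A sketch of the two-lane worker loop with illustrative names: the worker
prefers the priority channel, falls back to the normal channel, and drains
both when the close signal arrives:

```go
package logging

type task func()

type asyncEmitter struct {
    normal   chan task // normal-priority queue, may drop on overflow
    priority chan task // forced-error logs, never dropped
    closeCh  chan struct{}
    done     chan struct{}
}

func (e *asyncEmitter) worker() {
    defer close(e.done)
    for {
        // Priority lane first: drain any forced-error task before looking
        // at the normal queue.
        select {
        case t := <-e.priority:
            t()
            continue
        default:
        }
        select {
        case t := <-e.priority:
            t()
        case t := <-e.normal:
            t()
        case <-e.closeCh:
            e.drain()
            return
        }
    }
}

// drain empties both queues before the worker exits on Close.
func (e *asyncEmitter) drain() {
    for {
        select {
        case t := <-e.priority:
            t()
        case t := <-e.normal:
            t()
        default:
            return
        }
    }
}
```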

FileRequestLogger.enabled becomes atomic.Bool (was a plain bool, racy
under hot-reload). Adds DroppedLogs() counter, Flush() for tests, and
Close() for graceful shutdown.

LogRequest / LogRequestWithOptions now enqueue and return immediately;
the actual file write happens off-path. Forced-error logs that find a
full priority queue fall back to a synchronous write so observability
is preserved — they never drop, even in extreme pressure.

Bench delta vs Phase A baseline (AMD Ryzen 9 5950X):
  LogRequest sync:    236,258 ns/op → 34.18 ns/op  (-99.99%, 6,910x)
                       42 KB/op    →   6 B/op
                      103 allocs   →    0 allocs
  Disabled fast-path:    7.027 ns  →  18.21 ns     (atomic.Load
                                                    overhead, +11ns)

The Disabled-path slowdown is the cost of swapping the racy plain-bool
for an atomic.Bool; absolute cost is still ~18ns and the path is a
guard, not a hot loop.

Tests added (all -race clean):
  - NormalPath_WritesFile
  - ForcedPath_WritesEvenWhenDisabled (normal call drops, forced writes)
  - NormalQueueOverflow_IncrementsDropped
  - ForcedFallsBackToSyncWhenPriorityFull (verifies the no-drop guarantee)
  - AtomicEnabledToggle_RaceFree (4 readers + 4 writers under -race)
  - CloseDrainsPendingTasks (64 queued tasks all written after Close)

Phase A's existing tests still pass; the bench harness from Phase A
still runs (it now exercises the async path and shows the 6,910x win).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(amp): replace 3 RWMutexes with atomic.Pointer[routingTable] snapshot

Stage 1 Phase C item — kill mutex contention on the AMP routing read path.

The AmpModule held three separate RWMutexes (proxyMu, restrictMu, configMu)
to guard hot-reloadable state used on every Amp request. Under load each
read paid the RWMutex.RLock atomic CAS + memory barrier on three different
locks. Replaced with a single atomic.Pointer[routingTable] snapshot
containing {enabled, proxy, restrictToLocalhost, secretSource, lastConfig}
plus a sync.Mutex (writeMu) that serializes Register / OnConfigUpdated /
setters. Readers now do a single atomic load and use the captured snapshot;
writers clone-modify-swap.
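
A sketch of the read/write split, with illustrative field names (the real
routingTable also carries proxy, secretSource, and lastConfig):

```go
package amp

import (
    "sync"
    "sync/atomic"
)

type routingTable struct {
    enabled             bool
    restrictToLocalhost bool
    // proxy, secretSource, lastConfig omitted from this sketch
}

type AmpModule struct {
    table   atomic.Pointer[routingTable]
    writeMu sync.Mutex // serializes Register / OnConfigUpdated / setters
}

// Read path: one atomic load, then field reads off the immutable snapshot.
func (m *AmpModule) RestrictToLocalhost() bool {
    t := m.table.Load()
    return t != nil && t.restrictToLocalhost
}

// Write path: clone-modify-swap under writeMu so concurrent writers cannot
// lose each other's updates.
func (m *AmpModule) SetRestrictToLocalhost(v bool) {
    m.writeMu.Lock()
    defer m.writeMu.Unlock()
    next := routingTable{}
    if cur := m.table.Load(); cur != nil {
        next = *cur
    }
    next.restrictToLocalhost = v
    m.table.Store(&next)
}
```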

modelMapper is intentionally NOT in the snapshot. Its pointer is stable
for the module's lifetime (set once in Register), and DefaultModelMapper
guards its mapping table with its own RWMutex via UpdateMappings. This
preserves the captured-pointer pattern in routes.go and
fallback_handlers.go — handlers that captured m.modelMapper at
registration time continue to see hot-reload updates without re-fetching.
The Phase A TestAmpStaleness_CapturedMapperSeesUpdates test pins this
invariant.

Bench delta vs Phase A baseline (AMD Ryzen 9 5950X, sink-protected):
  ForceModelMappings_Read (single-thread): 3.744 ns → 1.715 ns  (-54%)
  ForceModelMappings_Read_Parallel-32:    27.85 ns →  ~1.8 ns   (-94%)
  RestrictRead_Parallel-32:               23.72 ns →  ~1.3 ns   (-94%)

Both parallel paths comfortably beat the plan's -80% mutex contention
target. The single-thread read also halves because the atomic load is
one instruction vs RWMutex.RLock's CAS-with-barrier sequence.

Behavior is preserved:
  - Hot-reload of upstream URL, API key, model mappings, force flag,
    localhost restriction, per-client API key mappings — all still
    observable to in-flight requests.
  - Captured-pointer staleness: handlers see modelMapper updates because
    UpdateMappings still mutates in place.
  - Failed proxy creation in Register still leaves lastConfig +
    restrictToLocalhost visible (matching the original ordering).

Tests:
  - All Phase A race + staleness tests pass under -race
    (TestAmpRace_*, TestAmpStaleness_CapturedMapperSeesUpdates,
     TestAmpStaleness_CapturedMapperSeesRegexUpdates).
  - All upstream functional tests pass (TestAmpModule_*, TestRegister*,
    TestLocalhostOnlyMiddleware_*).
  - Upstream tests that constructed AmpModule via struct literal with
    moved fields (`enabled`, `restrictToLocalhost`, `lastConfig`,
    `secretSource`) updated to use snapshot accessors; assertions
    unchanged.

* perf(api): atomic.Pointer[Config] in Server + clone-modify-persist-swap mgmt writes

Stage 1 Phase C item — kill the hot-reload data race the Phase A test caught.

Server.cfg was a plain *config.Config replaced by UpdateClients via direct
pointer assignment, with concurrent readers (OAuth callbacks, /management.html
serving, request handlers) reading fields off the shared pointer without any
synchronization. Mgmt PUT/PATCH/DELETE handlers also mutated config fields in
place while readers iterated. Both raced under -race; the Phase A
TestServer_ConfigHotReload_NoRaceUnderConcurrentReads test reproduced the
former.

This commit:

internal/api/server.go
  - Replace `cfg *config.Config` field with `cfgPtr atomic.Pointer[config.Config]`.
  - Add `Server.Config() *config.Config` accessor used by all readers.
  - All s.cfg.X reads (OAuth callback handlers, TLS init, /management.html
    serving) go through s.Config(); UpdateClients ends with cfgPtr.Store(cfg).
  - Wire a config-commit hook into the mgmt handler at server construction
    that delegates to s.UpdateClients, so a successful clone-modify-persist
    triggers the full fan-out (log level, request logger toggle, AMP module
    OnConfigUpdated, signature cache reconfig, etc.) before the response
    returns to the caller.

internal/api/handlers/management/handler.go
  - Replace `cfg *config.Config` field with `cfgPtr atomic.Pointer[config.Config]`.
  - Add `cfg() *config.Config` accessor returning the loaded snapshot.
  - Add `applyConfigChange(c, fn)` helper: clones the snapshot under h.mu,
    applies fn to the clone, persists to disk, atomic-stores into cfgPtr,
    invokes the optional commit hook, and writes the response.
  - Add `SetConfigCommitter` setter so Server can wire UpdateClients fan-out.
  - SetConfig becomes a lockless atomic.Store (no longer needs h.mu).
  - cloneConfigSnapshot uses yaml round-trip — heavier than a manual copy
    but resilient to new config.Config fields without forgotten clones, and
    mgmt writes already pay disk I/O so the marginal cost is irrelevant.
  - Drop persist / persistLocked: legacy mutate-in-place is fundamentally
    incompatible with snapshot semantics. All callers migrate to
    applyConfigChange.

internal/api/handlers/management/config_basic.go, config_lists.go,
quota.go, handler_test.go, api_tools_test.go,
config_lists_delete_keys_test.go
  - Helper signatures updateBoolField / updateIntField / updateStringField
    take *config.Config; callbacks mutate the clone instead of h.cfg().
  - Generic helpers putStringList / patchStringList / deleteFromStringList
    take a `func(*config.Config) *[]string` accessor that resolves to a
    slice pointer in the clone, and call applyConfigChange internally.
  - Every Put*/Patch*/Delete* handler that previously did
    `h.cfg.X = v; h.persist(c)` (or the locked variant) is rewritten to
    `h.applyConfigChange(c, func(cfg *config.Config) { cfg.X = v })`.
    Behavior preserved including 404 short-circuits before clone, sanitize
    invariants, and per-key delete semantics.
  - PutConfigYAML still writes the raw YAML to disk and reloads, but the
    reload now goes through cfgPtr.Store + the commit hook.
  - Reads of h.cfg.X across all mgmt files become h.cfg().X.
  - Test files updated where they constructed Handler with `cfg:` field
    literal (now an atomic.Pointer); they use `&Handler{}` + h.SetConfig
    to seed initial state. Test assertions unchanged.

internal/api/server_hotreload_race_test.go
  - Un-skip the Phase A gating test. Reader pool now reads via Server.Config()
    (the new accessor); writer pool calls Server.UpdateClients with fresh
    Config allocations. Passes under -race.

Behavior preserved:
  - All API request/response shapes unchanged.
  - File-watcher reload path unchanged: still loads YAML from disk and calls
    Server.UpdateClients which atomic-swaps + fans out.
  - Mgmt PUT response no longer mutates a shared snapshot; the response
    body waits for the commit hook (Server.UpdateClients) to complete fan-out
    before returning, matching the prior in-place semantics.
  - Logs/auth/AMP/cache reconfig fires on every mgmt write that takes effect
    (previously fired only when the file watcher saw the disk change).

Pre-existing race in internal/api/handlers/management is unchanged: gin's
SetMode/IsDebugging globals race across t.Parallel() tests in upstream code.
This is Codex/Phase C exit material to flag — out of scope for #23.

* perf(auth): exponential backoff on per-credential refresh failures

Stage 1 Phase C item — finish the per-credential refresh cooldown story.

Per-credential cooldown via Auth.NextRefreshAfter already existed (the
auto-refresh loop calls markRefreshPending which sets refreshPendingBackoff,
and refreshAuth's failure path set NextRefreshAfter = now + 5min on every
failure). What was missing was exponential growth: a credential that had
been broken for hours kept getting hammered every 5 minutes regardless of
how many failures had accumulated.

This commit adds:

internal: nextRefreshFailureBackoff(failures int) time.Duration computes
the cooldown for the Nth consecutive failure: base × 2^(N-1), capped at
refreshFailureBackoffMax = 1 hour. Curve:

  failures=1 -> 5m   (base)
  failures=2 -> 10m
  failures=3 -> 20m
  failures=4 -> 40m
  failures>=5 -> 60m (cap)
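
A sketch consistent with the curve above; refreshFailureBackoffMax is named
in this commit, the base constant name is illustrative:

```go
package auth

import "time"

const (
    refreshFailureBackoffBase = 5 * time.Minute // illustrative name for the 5m base
    refreshFailureBackoffMax  = time.Hour
)

// nextRefreshFailureBackoff returns base × 2^(N-1) for the Nth consecutive
// failure, capped at one hour, matching the curve above.
func nextRefreshFailureBackoff(failures int) time.Duration {
    if failures < 1 {
        failures = 1
    }
    d := refreshFailureBackoffBase
    for i := 1; i < failures; i++ {
        d *= 2
        if d >= refreshFailureBackoffMax {
            return refreshFailureBackoffMax
        }
    }
    return d
}
```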

Manager.refreshFailures map[string]int (in-memory, not persisted) tracks
consecutive failure counts per auth.ID. Mutated under m.mu alongside
m.auths. After a successful refresh the entry is deleted, so a future
failure restarts the curve at 5m rather than continuing the prior
exponential growth — the credential's history is "forgotten" on success.
Not persisted because (a) cross-restart memory of historical refresh
failures isn't useful when the credential file may have been re-imported
or rotated, and (b) the existing Auth schema is unchanged so an upgrade
doesn't require a migration.

Tests:
  - TestNextRefreshFailureBackoff_ExponentialGrowth pins the curve.
  - TestManager_RefreshAuth_FailureExponentialBackoff drives 4 sequential
    failures through a fake executor and asserts NextRefreshAfter grows
    on each attempt.
  - TestManager_RefreshAuth_SuccessResetsFailureCounter drives 3 failures,
    then a success, then a failure — asserts the post-success failure
    backoff is the BASE 5m (not the 40m it would have been without reset).

All pass under go test -race.

Behavior preserved:
  - First failure backoff is unchanged (5m), so existing single-failure
    test expectations still hold.
  - Successful-refresh path still sets NextRefreshAfter = zero (or the
    refreshIneffectiveBackoff sentinel when shouldRefresh is still true).
  - Auth JSON schema is unchanged; the failure counter is in-memory.

* feat(managementasset): ReleaseURLProvider interface seam for fork override

Stage 1 Phase C item — introduce a small extension seam so the Stage 2 fork
can swap the management-asset release URL constants without modifying this
file. The seam lands in Stage 1 because it touches the same constants the
fork needs to override; deferring the seam to Stage 2 would force a
Stage-1-vs-Stage-2 conflict in the same file.

Adds ReleaseURLProvider interface with ReleaseURL() / FallbackURL() methods
and a default impl that returns the upstream constants unchanged:

  defaultManagementReleaseURL  = "https://api.github.com/repos/router-for-me/Cli-Proxy-API-Management-Center/releases/latest"
  defaultManagementFallbackURL = "https://cpamc.router-for.me/"

The active provider lives in a sync/atomic.Value wrapped in *providerHolder
so concurrent Stores satisfy atomic.Value's same-concrete-type rule.
SetReleaseURLProvider(nil) restores the upstream defaults; init() seeds with
the default so the seam is transparent until the fork overrides it.
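
A sketch of the seam; the interface shape and the two default URLs come from
this commit, while the holder and accessor bodies are illustrative:

```go
package managementasset

import "sync/atomic"

const (
    defaultManagementReleaseURL  = "https://api.github.com/repos/router-for-me/Cli-Proxy-API-Management-Center/releases/latest"
    defaultManagementFallbackURL = "https://cpamc.router-for.me/"
)

type ReleaseURLProvider interface {
    ReleaseURL() string
    FallbackURL() string
}

type defaultProvider struct{}

func (defaultProvider) ReleaseURL() string  { return defaultManagementReleaseURL }
func (defaultProvider) FallbackURL() string { return defaultManagementFallbackURL }

// providerHolder keeps the stored concrete type constant so atomic.Value's
// same-concrete-type rule holds even when callers register different
// provider implementations.
type providerHolder struct{ p ReleaseURLProvider }

var activeProvider atomic.Value // holds *providerHolder

func init() { activeProvider.Store(&providerHolder{p: defaultProvider{}}) }

// SetReleaseURLProvider overrides the URLs; nil restores the defaults.
func SetReleaseURLProvider(p ReleaseURLProvider) {
    if p == nil {
        p = defaultProvider{}
    }
    activeProvider.Store(&providerHolder{p: p})
}

func currentReleaseURL() string {
    h := activeProvider.Load().(*providerHolder)
    if url := h.p.ReleaseURL(); url != "" {
        return url
    }
    return defaultManagementReleaseURL // safety net for incomplete providers
}
```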

Wires the seam through three call sites that previously read the constants:

  - resolveReleaseURL: when no per-config repository override is supplied,
    routes through currentReleaseURL() instead of the constant directly.
    Per-config overrides (cfg.RemoteManagement.PanelGitHubRepository) still
    win over the provider default.
  - fetchLatestAsset: empty releaseURL parameter falls back to
    currentReleaseURL() instead of the constant.
  - ensureFallbackManagementHTML: downloads from currentFallbackURL().

Stage 2 fork code will register a custom provider via init(), pointing the
auto-updater at the fork's release artifact location. No further changes
to this file required.

Tests (all -race clean):
  - TestDefaultReleaseURLProvider_MatchesUpstreamConstants pins the contract
    that the default impl returns the upstream URLs byte-for-byte.
  - TestSetReleaseURLProvider_OverridesAndRestores verifies override + nil
    restore semantics.
  - TestCurrentReleaseURL_FallsBackOnEmpty pins safety against an
    incomplete provider returning empty strings.
  - TestResolveReleaseURL_UsesProviderDefault verifies the seam routes
    through the provider when no per-config repo is set, and that an
    explicit per-config repo still overrides the provider default.

* test(bench): Phase C exit benchmarks + logger Cleanup hook

Captures the Phase C exit benchmark numbers in bench/phase-c/backend.txt
alongside a SUMMARY.md that maps results back to the plan's targets and
documents trade-offs. Phase A baseline at bench/baseline/backend.txt is
preserved unmodified for diff comparison.

Also fixes a benchmark teardown bug exposed by the async-emitter refactor:
BenchmarkFileRequestLogger_LogRequest's b.TempDir would be removed before
the async writer worker drained, producing "unlinkat ... directory not
empty" errors. Adds b.Cleanup(logger.Close) to both logger benchmarks so
the worker drains before TempDir cleanup.

Phase C exit results vs Phase A baseline (AMD Ryzen 9 5950X,
go1.26.0 linux/amd64):

  AMP mutex contention (parallel reads):  27.85 ns -> 0.06 ns  (-99.8%, target -80%)
  AMP single-thread read:                 3.744 ns -> 1.725 ns (-54%)
  Request logger off-path:                236,258 ns -> 35.41 ns (-99.985%)
  Request logger allocs/req:              103 -> 0 (-100%)
  Signature cache Set (bounded LRU):      1,222 ns / 419 B -> 592.9 ns / 232 B (-51% / -45%)
  Signature cache Get_Hit:                242.5 ns -> 219.5 ns (-9.5%, within 5%)
  Signature cache Get_Hit_Parallel:       365.3 ns -> 315.3 ns (-13.7%)

All plan targets met. The Disabled-fast-path regression (+12 ns from
swapping plain bool to atomic.Bool) is documented in the #21 commit and
is the right race-fix trade-off.

Behavior preservation: all Phase A pre-written gating tests still pass
under -race (signature_cache_semantics_test, amp_race_test, hot-reload
test, codex SSE chunk-boundary). Pre-existing gin global-state races in
internal/api/handlers/management remain — they exist on clean upstream/dev
and are out of scope for Phase C.

Next step: run /dev-buddy-once Codex CLI gpt-5.5 light review on
refactor/upstream-bound vs upstream/dev. Address findings. Then Phase D.

* fix(api,logging,amp,handlers): address Codex Phase C review BLOCKER + IMPORTANT findings

Codex CLI gpt-5.5 light review on the BE diff (refactor/upstream-bound vs
upstream/dev) flagged 4 BLOCKERs and 4 IMPORTANTs that need addressing
before Phase C exit. This commit fixes all of them.

BLOCKER #1 — Management writes deadlock
  internal/api/handlers/management/handler.go
    applyConfigChange held h.mu while invoking the commit hook. The hook
    calls Server.UpdateClients which calls s.mgmt.SetAuthManager, and the
    old SetAuthManager re-took h.mu — Go's sync.Mutex is non-reentrant,
    so every mgmt config write hung. Fix: change Handler.authManager from
    a plain pointer to atomic.Pointer[coreauth.Manager] so SetAuthManager
    is lock-free; release h.mu before invoking the commit hook so the
    fan-out can re-enter handler methods without deadlocking. Reads
    scattered across the package always observed an unsynchronized
    h.authManager — the lock was paranoia, not correctness.

BLOCKER #2 — BaseAPIHandler.Cfg race
  sdk/api/handlers/handlers.go
  sdk/api/handlers/stream_forwarder.go
  sdk/api/handlers/gemini/gemini-cli_handlers.go
  sdk/api/handlers/openai/openai_images_handlers.go
    Server.cfgPtr was atomic but request handlers still read
    BaseAPIHandler.Cfg as a plain pointer that UpdateClients overwrote
    on every config hot reload. The plan mandated atomic snapshots for
    ALL request readers; this finishes that work. Cfg field replaced
    with cfgPtr atomic.Pointer[config.SDKConfig] + Config() accessor.
    All 14 internal h.Cfg reads migrated to h.Config().

BLOCKER #3 — Forced error logs after Close
  internal/logging/async_emitter.go
  internal/logging/async_emitter_test.go (new regression test)
    After Close drained the worker, subsequent forced enqueues landed in
    the buffered priority channel with no consumer left to drain them.
    The "forced error logs never drop" invariant was silently broken.
    Fix: closed atomic.Bool set BEFORE close(closeCh). enqueue checks
    the flag and returns syncFallback=true for force tasks (sync write)
    or drops normal tasks (with counter increment) post-close.
    TestAsyncEmitter_ForcedAfterCloseWritesSynchronously pins this.

BLOCKER #4 — AMP MultiSourceSecret.explicitKey race
  internal/api/modules/amp/secret.go
    UpdateExplicitKey wrote explicitKey under s.mu; Get read it without
    s.mu on every request. Fix: explicitKey becomes
    atomic.Pointer[string]. Get loads via atomic; UpdateExplicitKey
    stores via atomic. Cache reset still uses s.mu since cache writes
    are off the hot path.

IMPORTANT #5 — Stale-index pre-resolution in mgmt
  internal/api/handlers/management/handler.go
  internal/api/handlers/management/config_lists.go
    Several handlers resolved an index from one snapshot, then
    applyConfigChange cloned a (possibly newer) snapshot in which the
    pre-resolved index pointed at a different row. Fix: resolution moved
    INSIDE the applyConfigChange closure against the cloned config.
    applyConfigChange now detects c.Writer.Written() after fn returns —
    if the closure already wrote 404 or 400, the persist + commit + 200
    is skipped. Affected handlers: PatchGeminiKey, DeleteGeminiKey
    (api-key form), PatchClaudeKey, DeleteClaudeKey (api-key form),
    PatchOpenAICompat, PatchVertexCompatKey, DeleteVertexCompatKey
    (api-key form), PatchOAuthExcludedModels (removal path),
    DeleteOAuthExcludedModels, PatchOAuthModelAlias (removal path),
    DeleteOAuthModelAlias, PatchCodexKey, DeleteCodexKey (api-key form).

IMPORTANT #6 — UpdateClients fan-out not serialized
  internal/api/server.go
    UpdateClients read and wrote oldConfigYaml without serialization, so
    a file-watcher reload firing concurrently with a mgmt commit could
    compute deltas from the wrong "old" config. Fix: Server.updateMu
    serializes the entire UpdateClients body. Mgmt commit hooks no
    longer hold h.mu so they can take updateMu without deadlocking.

IMPORTANT #7 — yaml clone shape change
  internal/api/handlers/management/handler.go
    cloneConfigSnapshot used yaml.v3 round-trip and silently returned an
    empty &config.Config{} on marshal/unmarshal errors. yaml.v3 also
    canonicalised some nil maps/slices to empty values, changing JSON
    response shape after unrelated writes. Fix: switch to encoding/json
    round-trip (preserves nil-vs-empty per omitempty tags) and return
    error so applyConfigChange fails with a 500 instead of silently
    persisting an empty snapshot.

IMPORTANT #8 — errorLogsMaxFiles plain-int race
  internal/logging/request_logger.go
    Read by cleanupOldErrorLogs on the async writer goroutine; written
    by SetErrorLogsMaxFiles on hot reload. Fix: atomic.Int64; Set/load
    via atomic.

NIT #9 — logs_dropped_total metric not exposed
  Deferred — there's no metrics system in scope on this branch. The
  DroppedLogs() accessor is wired and the counter exists; surfacing
  it as a Prometheus/etc. metric belongs to a separate task.

Verification:
  - go build ./... clean.
  - go vet ./... clean.
  - All Phase A gating tests + new B3 regression test pass under -race
    (signature_cache_semantics, amp_race, amp_staleness, hot-reload,
    SSE chunk-boundary, async_emitter forced-after-close).
  - Pre-existing failures unchanged (gin SetMode global race in mgmt
    parallel tests, TestDefaultRequestLoggerFactory, TestCodexFreeModelsExcludeGPT55,
    TestEnsureAccessToken_WarmTokenLoadsCreditsHint,
    TestUpdateAntigravityCreditsBalance_LoadCodeAssistUserAgent — all
    fail on clean upstream-bound HEAD too).

* fix: address Codex Phase C re-review BLOCKER + IMPORTANT findings (round 2)

Round-2 review of commit e9b8bbe3 surfaced 3 new BLOCKERs introduced by
the round-1 fixes plus 2 IMPORTANTs (1 missed instance, 1 fresh observation).

BLOCKER #1 (round 2) — Mgmt commits could reorder
  internal/api/handlers/management/handler.go
    Round 1's deadlock fix released h.mu before the commit hook, which
    let two concurrent mgmt writes A and B interleave: A swaps and unlocks,
    B swaps and commits (UpdateClients(B)), then A commits (UpdateClients(A))
    — rolling Server fan-out state back to A while disk holds B.
  Fix: hold h.mu through commit. Now safe because both Handler.SetConfig
    and Handler.SetAuthManager are lock-free atomic.Pointer stores;
    commit's transitive call into SetAuthManager no longer re-locks
    h.mu.

BLOCKER #2 (round 2) — JSON clone dropped json:"-" persisted fields
  internal/api/handlers/management/handler.go
    Round 1 swapped cloneConfigSnapshot's yaml round-trip for json
    round-trip to surface marshal errors and avoid yaml.v3 nil-vs-empty
    canonicalisation. But config.Config has Host, Port, RemoteManagement,
    AuthDir tagged `yaml:"..." json:"-"` so json round-trip dropped
    them; SaveConfigPreserveComments would persist a config with
    cleared Host/Port/auth-dir/remote-management. Every mgmt edit was
    silently zeroing critical config fields.
  Fix: revert to yaml round-trip. Keep the error-return signature so
    applyConfigChange still surfaces a 500 on marshal/unmarshal failure
    (the original IMPORTANT #7 concern), instead of silently producing
    an empty config.

BLOCKER #3 (round 2) — Async emitter close TOCTOU
  internal/logging/async_emitter.go
    Round 1's atomic.Bool was checked-then-acted-on without
    synchronization. A forced enqueue could load closed=false, get
    preempted, Close runs to completion (worker drains and exits), the
    enqueue resumes and sends into priority — buffered with no consumer
    left, dropped silently against the never-drop invariant.
  Fix: replace atomic.Bool + atomic-load with a sync.Mutex guarding
    enqueue's [check + send] and close's [flag transition + closeCh
    signal]. Lock held only for the brief send window; cost ~10-30 ns
    on the hot path.

IMPORTANT #5 (round 2 missed) — DeleteClaudeKey/Vertex/Codex no-match return 200
  internal/api/handlers/management/config_lists.go
    DeleteGeminiKey was correctly returning 404 when no api-key match
    was found. The other three by-api-key delete handlers fell through
    to sanitize + persist + 200 instead.
  Fix: mirror DeleteGeminiKey — return 404 from inside applyConfigChange
    when matchCount==0; the c.Writer.Written() check skips persist +
    commit + 200.

IMPORTANT #6 (round 2 fresh) — Multi-Config() reads not snapshotted
  sdk/api/handlers/handlers.go (ExecuteStreamWithAuthManager)
  sdk/api/handlers/gemini/gemini-cli_handlers.go (CLIHandler)
  sdk/api/handlers/openai/openai_images_handlers.go (ImagesGenerations, ImagesEdits)
    Several handlers called h.Config() multiple times. With BaseAPIHandler.cfgPtr
    behind atomic.Pointer, two reads in the same handler could observe
    different hot-reload snapshots and combine fields from old and new
    config.
  Fix: capture cfg := h.Config() once at handler entry and pass that
    snapshot through subsequent reads. Behavior preserved for any
    request that sees a hot reload mid-flight: it consistently sees
    pre-reload config for the duration of the handler.

Verification:
  - go build ./... clean.
  - All Phase A gating tests + B3 regression test pass under -race
    (signature_cache_semantics, amp_race, amp_staleness, hot-reload,
    SSE chunk-boundary, async_emitter forced-after-close).
  - Pre-existing failures unchanged.

* fix: address Codex Phase C round-3 review BLOCKER + IMPORTANT findings

Round-3 Codex review of commit 7ea1346f surfaced 4 NEW findings:
2 BLOCKERs (one architectural gap, one missed pattern in mgmt getters)
plus 2 IMPORTANTs (more delete-no-match-200 patterns, flush early-return).

BLOCKER R3-1 — Mgmt commits skip service-level fan-out
  internal/api/server.go (+SetManagementCommitter)
  sdk/cliproxy/service.go (override mgmt commit hook in Start)
    Round 1 wired the mgmt commit hook to call Server.UpdateClients,
    which only fans out to the api.Server's own state. The file watcher
    reload path additionally calls coreManager.SetConfig,
    SetOAuthModelAlias, applyRetryConfig, applyPprofConfig, selector
    rebuild on strategy/affinity changes, and rebindExecutors. Mgmt
    PUT/PATCH/DELETE was responding 200 but the change never reached
    the executor layer until the file watcher fired (or never, in
    embedded-no-watcher paths).
  Fix: add Server.SetManagementCommitter that overrides the mgmt
    handler's commit hook. In sdk/cliproxy/service.go's Start(), after
    the reloadCallback closure is defined, route mgmt commits through
    the SAME closure the file watcher uses, so both reload sources
    trigger identical fan-out. Embedded api.Server users without a
    Service still get the default UpdateClients hook (appropriate
    scope for embedded callers with no coreManager).

BLOCKER R3-2 — Mgmt getters mix snapshots across multiple h.cfg() calls
  internal/api/handlers/management/config_auth_index.go
    geminiKeysWithAuthIndex / claudeKeysWithAuthIndex /
    codexKeysWithAuthIndex / vertexCompatKeysWithAuthIndex /
    openAICompatibilityWithAuthIndex held h.mu (a vestige from when
    h.cfg was a mutable field) and called h.cfg() multiple times in
    sequence: len(h.cfg().X), range h.cfg().X, h.cfg().X[i]. With
    SetConfig now lock-free, a hot reload between those calls could
    return mismatched slices and panic or return mixed data.
  Fix: snapshot once (cfg := h.cfg()) at the top of each helper. Drop
    the spurious h.mu — it never protected cfg-reads against SetConfig
    swaps; it only serialised mgmt writes.

IMPORTANT R3-1 — More delete-no-match-200 patterns
  internal/api/handlers/management/config_lists.go
    Round-2 fixed the by-api-key form for Claude, Vertex, Codex, but
    siblings still persisted/committed and returned 200 when nothing
    matched. Affected handlers: by-api-key+base-url variants for
    Gemini/Claude/Vertex/Codex, by-name + by-index for OpenAI-compat,
    by-index for Gemini/Claude/Vertex/Codex, DeleteAmpUpstreamAPIKeys,
    DeleteAmpModelMappings, deleteFromStringList helper (api-keys,
    excluded models, etc.).
  Fix: each handler tracks len-before / len-after (filter form) or
    range-checks the index (idx form) and writes 404 + return when
    nothing was actually removed. applyConfigChange's
    c.Writer.Written() check skips persist + commit + 200.

IMPORTANT R3-2 — flush() returns before in-flight write completes
  internal/logging/async_emitter.go
    flush() polled only on channel lengths. Once the worker dequeued a
    task, both channels were empty and flush returned, even though
    writeLogRequest hadn't returned. Bench teardown could observe
    flush==done while the worker was still writing into the temp dir,
    racing TempDir cleanup.
  Fix: add asyncEmitter.active atomic.Int64. execute() increments at
    entry and decrements via defer at exit. flush() now waits for
    len(priority)==0 && len(normal)==0 && active.Load()==0 — covers
    both queued and in-flight tasks.

Verification:
  - go build ./... clean.
  - All Phase A gating tests + B3 + Async forced-after-close pass under
    -race (cache, amp, server hot-reload, refresh backoff, release URL,
    async emitter, mgmt key handlers).
  - Pre-existing failures unchanged.

* fix: address Codex Phase C round-4 review BLOCKER + IMPORTANT findings

Round-4 Codex review of commit 9e079689 surfaced 5 findings:
2 BLOCKERs (mgmt commits skipped watcher auth refresh; more multi-cfg()
read patterns in auth_files.go) plus 3 IMPORTANTs (flush dequeue race,
PatchAPIKeys 200-on-bad-index, committer hook ordering + atomic).

BLOCKER R4-1 — Mgmt commits skipped watcher auth refresh
  internal/watcher/dispatcher.go (export RefreshAuthState)
  sdk/cliproxy/types.go (WatcherWrapper.RefreshAuthState)
  sdk/cliproxy/watcher.go (wire refreshAuthState into the wrapper)
  sdk/cliproxy/service.go (buildReloadCallback calls watcher.RefreshAuthState)
    The watcher's reloadClients does TWO things: reloadCallback(cfg) and
    refreshAuthState(force). Round-3 wired only reloadCallback into the
    mgmt commit hook, so config-defined API key auths weren't
    re-synthesized or dispatched until the file watcher noticed the
    saved YAML. A DELETE/PUT of a Gemini/Claude/etc. key returned 200
    while core auth state remained stale.
  Fix: WatcherWrapper exposes RefreshAuthState. Service's
    buildReloadCallback (now the shared closure used by both file
    watcher and mgmt committer) calls watcher.RefreshAuthState(true)
    after every config swap so auth state stays in lockstep with the
    config swap.

BLOCKER R4-2 — More multi-cfg() reads in auth_files.go
  internal/api/handlers/management/auth_files.go
    Three helpers called h.cfg() multiple times in sequence:
    managementCallbackURL (Port + TLS.Enable + Port), listAuthFilesFromDisk
    (AuthDir twice — list and per-file join), DeleteAuthFile?all=true
    (AuthDir twice — list and per-file join). Hot reload between calls
    could mix Port from snapshot N with TLS from N+1, or list one
    AuthDir then delete from a different one.
  Fix: snapshot cfg := h.cfg() once at handler entry; use
    snapshot-derived values for all subsequent reads.

IMPORTANT R4-1 — flush had dequeue-to-execute race
  internal/logging/async_emitter.go
    Round-3 added active counter incremented inside execute(). But the
    worker dequeues from priority/normal channel BEFORE calling
    execute. In that window, the channel had emptied (decremented by
    receive) and active was still 0. flush() observed both 0 and
    returned, even though the task hadn't yet entered execute.
  Fix: rename active -> pending. Increment inside enqueue's
    lock-protected channel send (so it's incremented before the worker
    can possibly receive). Decrement in execute's defer. flush waits
    for pending == 0; the lock-protected enqueue->send and the
    deferred decrement enclose every code path where a task is in
    flight.

IMPORTANT R4-2 — PatchAPIKeys index out-of-range returned 200
  internal/api/handlers/management/config_lists.go
    patchStringList's index path silently no-op'd when the index was
    out of range, persist + commit + 200 — same pattern as the
    delete-no-match-200 family fixed in round-3, just on the patch
    helper.
  Fix: write 404 + return from inside the closure when the index is
    out of range. applyConfigChange's c.Writer.Written() check skips
    persist/commit/200.

IMPORTANT R4-3 — Committer hook installed after server starts; not atomic
  sdk/cliproxy/service.go (hoist reloadCallback + SetManagementCommitter
    BEFORE server.Start)
  internal/api/handlers/management/handler.go (commitPtr atomic.Pointer)
    Two related issues: (a) the HTTP server began serving before
    SetManagementCommitter ran, leaving a window where mgmt writes used
    the default partial Server.UpdateClients hook. (b) the commit field
    was a plain func without synchronization, racing applyConfigChange
    reads.
  Fix: extract the inline reloadCallback closure into Service.buildReloadCallback;
    install it BEFORE the server.Start goroutine fires. Replace
    Handler.commit with commitPtr atomic.Pointer[commitFn]; reads via
    loadCommit() helper, writes via lock-free atomic store. Safe to
    set at any lifecycle point.

Verification:
  - go build ./... clean.
  - All Phase A gating tests + B3 + Async forced-after-close pass under -race.

* fix: address Codex Phase C round-5 review BLOCKER + IMPORTANT (partial)

Round-5 Codex review of commit 94a9500c surfaced 4 findings (2 BLOCKERs +
2 IMPORTANTs). Partial fix: addresses 3 of 4 fully and 1 partially. Two
remaining helpers in oauth_callback.go and auth_files.go (OAuth flow long
goroutines) are deferred for a separate follow-up.

BLOCKER R5-1 — Watcher RefreshAuthState used stale config
  sdk/cliproxy/service.go
    Round-4 wired RefreshAuthState into the mgmt commit hook but the
    watcher's cached config wasn't updated first. RefreshAuthState
    synthesizes from w.config, so mgmt writes refreshed auths from the
    OLD API-key snapshot until the file watcher fired.
  Fix: call s.watcher.SetConfig(newCfg) BEFORE
    s.watcher.RefreshAuthState(true) in buildReloadCallback.

BLOCKER R5-2 (partial) — Multi-cfg snapshot reads in mgmt helpers
  internal/api/handlers/management/logs.go
  internal/api/handlers/management/vertex_import.go
  internal/api/handlers/management/api_tools.go
    Three mgmt files had handlers that called h.cfg() multiple times in
    sequence (logs.go: nil-check + LoggingToFile + h.logDirectory();
    vertex_import.go: nil-check + AuthDir; api_tools.go: per-key proxy
    lookup + global proxy fallback). All three now snapshot
    cfg := h.cfg() once and pass through. logs.go's logDirectory helper
    now takes *config.Config explicitly.
  Remaining (deferred): oauth_callback.go and auth_files.go OAuth flow
    handlers also exhibit this pattern but the fix requires plumbing cfg
    snapshots through long-running goroutines that wait for callback
    files. Deferred for a focused follow-up commit; behavior under the
    current pattern remains the upstream-default for those flows.

IMPORTANT R5-1 — Async pending counter race after channel send
  internal/logging/async_emitter.go
    Round-4 incremented pending AFTER the channel send, leaving a
    window where a worker could receive and enter execute() while
    pending was still 0 — flush() observed pending==0 and returned
    even though a write was in flight.
  Fix: increment pending BEFORE the select{} send. On full-channel
    paths roll back with pending.Add(-1). The increment + send happen
    inside closeMu so close() observes pending consistently.

IMPORTANT R5-2 — PatchAmp{ModelMappings,UpstreamAPIKeys} 200 on no-op
  internal/api/handlers/management/config_lists.go
    Both endpoints accepted {} or empty/all-empty `value` payloads,
    persisted unchanged config, and returned 200.
  Fix: validate `value` has at least one usable entry before
    applyConfigChange; return 400 with a descriptive error otherwise.

Verification:
  - go build ./... clean.
  - All Phase A gating tests + B3 + Async forced-after-close pass under
    -race.

* docs(bench): document Phase C exit punch list

Five Codex CLI gpt-5.5 review rounds were run against the BE diff
refactor/upstream-bound vs upstream/dev. Each round closed every
BLOCKER/IMPORTANT it flagged at that round (rounds 1-4 fully, round 5
3.5/4). The loop didn't fully converge — Codex kept finding deeper
pre-existing race conditions and API contract bugs that surfaced once
the perf refactor exposed them — but the per-round finding count trended
down (9 → 5 → 4 → 5 → 4) and the original Phase C goals (atomic AMP,
async logging, atomic Server config, clone-modify-persist-swap mgmt,
LRU cache, exponential refresh backoff, ReleaseURLProvider seam) are
all delivered, tested under -race, and meeting plan bench targets.

Per user direction, Phase C exits with this punch list rather than
continuing the review loop indefinitely. The deferred item is OAuth
flow goroutine multi-cfg() reads (low real-world risk: OAuth login
is operator-driven and rarely overlaps with hot-reload). Pattern is
established; fix is mechanical when convenient.

Adds:
  bench/phase-c/PUNCH_LIST.md — full deferred-items + future-watch
    list, plus a roll-up of every round-1-through-round-5 finding
    that IS fixed, plus Codex session IDs for traceability.
  bench/phase-c/SUMMARY.md — points at PUNCH_LIST.md and notes
    Phase D is next.

* perf(logging): switch async emitter normal queue to drop-oldest

Codex Phase C round-6 review NIT #4: plan §Behavior Contract said
"may drop oldest non-priority entries on queue overflow" but the
implementation dropped newest. Under the existing closeMu critical
section, do a non-blocking receive to discard the oldest then send
the new task. Loop bound is at most 2 iterations; closeMu excludes
other producers so the second send is guaranteed to succeed because
workers only consume the channel.
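
A sketch of the drop-oldest overflow path under closeMu, with illustrative
names; the real enqueue also handles the priority lane and the closed flag:

```go
package logging

import (
    "sync"
    "sync/atomic"
)

type task func()

type asyncEmitter struct {
    closeMu sync.Mutex
    normal  chan task
    dropped atomic.Int64
}

func (e *asyncEmitter) enqueueNormal(t task) {
    e.closeMu.Lock()
    defer e.closeMu.Unlock()
    for {
        select {
        case e.normal <- t: // fast path: room available
            return
        default:
        }
        // Queue full: discard the oldest queued entry, count the drop, then
        // retry the send. closeMu excludes other producers and the worker
        // only consumes, so the retry succeeds on the next pass.
        select {
        case <-e.normal:
            e.dropped.Add(1)
        default:
        }
    }
}
```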

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(api): flush async request logger on Server.Stop

Codex Phase C round-6 review IMPORTANT #2: Server.Stop shut down the
HTTP server but never drained the async request logger. Plan §Phase C
mandates "Flush/Close on graceful shutdown" so queued normal logs
write to disk and forced-error logs are flushed before return. Type-
assert the requestLogger interface to interface{ Close() } and call
it after s.server.Shutdown succeeds.
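
A sketch of the shutdown hook, with illustrative Server fields; only the
`interface{ Close() }` assertion mirrors the commit:

```go
package api

import (
    "context"
    "net/http"
)

type Server struct {
    server        *http.Server
    requestLogger any // the logging interface the handlers use
}

func (s *Server) Stop(ctx context.Context) error {
    if err := s.server.Shutdown(ctx); err != nil {
        return err
    }
    // Drain the async request logger on graceful shutdown, but only if the
    // concrete implementation exposes Close (the FileRequestLogger does).
    if closer, ok := s.requestLogger.(interface{ Close() }); ok {
        closer.Close()
    }
    return nil
}
```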

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf(cache): bound outer signature cache group map at MaxGroupCount

Codex Phase C round-6 review IMPORTANT #3: the inner per-group LRU was
bounded at SignatureGroupCapacity (10K) but the outer sync.Map of
groups was unbounded. GetModelGroup falls back to the raw model name
for unknown providers, so a workload with many unique unknown model
names could spike memory until the 10-minute purge ran.

Cap the group count at MaxGroupCount (64) with LRU-by-write-time
eviction. Eviction runs on the cold path only (new-group inserts at
the cap) under groupEvictMu so concurrent inserts evict at most one
extra group than necessary.
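
A sketch of the cold-path group-cap logic; MaxGroupCount and groupEvictMu
are named in this commit, the rest is illustrative and the per-group LRU
value is elided:

```go
package cache

import (
    "sync"
    "time"
)

const MaxGroupCount = 64

type group struct {
    lastAccess time.Time // updated on writes only, not on reads
    // per-group LRU omitted from this sketch
}

var (
    groups       sync.Map // group name -> *group
    groupEvictMu sync.Mutex
    groupCount   int
)

func getOrCreateGroup(name string, now time.Time) *group {
    if g, ok := groups.Load(name); ok { // hot read path stays lock-free
        return g.(*group)
    }
    groupEvictMu.Lock()
    defer groupEvictMu.Unlock()
    if g, ok := groups.Load(name); ok { // re-check under the lock
        return g.(*group)
    }
    if groupCount >= MaxGroupCount {
        evictOldestGroup() // drop the least-recently-written group
    } else {
        groupCount++
    }
    g := &group{lastAccess: now}
    groups.Store(name, g)
    return g
}

func evictOldestGroup() {
    var oldestKey any
    var oldest time.Time
    groups.Range(func(k, v any) bool {
        if g := v.(*group); oldestKey == nil || g.lastAccess.Before(oldest) {
            oldestKey, oldest = k, g.lastAccess
        }
        return true
    })
    if oldestKey != nil {
        groups.Delete(oldestKey)
    }
}
```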

The read path stays sync.Map.Load fast-path: lastAccess is updated on
writes only, not reads. Updating lastAccess on every read added per-
call atomic.Int64 contention that pushed Get_Hit_Parallel past the
±5% bench target. The LRU policy degrades to "least recently written
group" — fine because production model surfaces stay well under the
64-group cap and eviction effectively never fires.

Add TestSignatureCache_OuterMapBoundedByMaxGroupCount to pin the cap
and the LRU-by-write eviction order.

Bench (Phase C exit -> round 6):
  Get_Hit:           219.5 ns -> 219.2 ns (-0.1%)
  Get_Hit_Parallel:  315.3 ns -> 264.9 ns (-16%, faster)
  Set:               592.9 ns -> 541.8 ns (-8.6%, faster)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cliproxy): snapshot config in goroutine-callable helpers

Codex Phase C round-6 review BLOCKER #1: buildReloadCallback writes
s.cfg under cfgMu.Lock, but ensureExecutorsForAuthWithMode (376),
registerModelsForAuth (851), resolveConfig*Key (1118-1232), and
oauthExcludedModels (1258) all read s.cfg directly without taking
cfgMu.RLock. Phase C #23 (commit 662672c1) routed mgmt commits
through buildReloadCallback so this race fires per-mgmt-PUT, not
just on the rare file-watcher reload path.

Add Service.configSnapshot() accessor that returns the cfg under
cfgMu.RLock. Have ensureExecutorsForAuthWithMode, registerModelsForAuth,
and oauthExcludedModels snapshot once at function entry and pass cfg
to nested helpers. Convert resolveConfigClaudeKey, resolveConfigGeminiKey,
resolveConfigVertexCompatKey, and resolveConfigCodexKey from methods on
Service to package-level functions taking cfg *config.Config so the
caller's per-call snapshot covers the full resolution.
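
A sketch of the snapshot-at-entry pattern; configSnapshot, cfgMu, and the
helper name come from this commit, the types are illustrative:

```go
package cliproxy

import "sync"

type Config struct{ /* ... */ }

type Service struct {
    cfgMu sync.RWMutex
    cfg   *Config
}

// configSnapshot returns the current config under the read lock; callers
// take one snapshot at function entry and pass it to nested helpers so a
// concurrent buildReloadCallback swap cannot be observed mid-function.
func (s *Service) configSnapshot() *Config {
    s.cfgMu.RLock()
    defer s.cfgMu.RUnlock()
    return s.cfg
}

func (s *Service) registerModelsForAuth( /* authID string, ... */ ) {
    cfg := s.configSnapshot() // one read, used for the whole call
    _ = cfg
    // ... resolve keys, excluded models, etc. from cfg, not s.cfg ...
}
```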

Add TestService_ConfigSnapshot_RaceFree to pin the invariant under
go test -race.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(api): add mgmt-handlers hot-reload race test

Codex Phase C round-6 review NIT #5: the existing
TestServer_ConfigHotReload_NoRaceUnderConcurrentReads only exercised
Server.Config() reads, /management.html serving, and
Server.UpdateClients writes — it did not drive real mgmt handlers
through applyConfigChange + Handler.cfgPtr swap + commit hook.

Add TestServer_MgmtHandlers_HotReloadRace driving PutDebug,
PutAmpUpstreamURL, and PutAmpModelMappings directly against the
Handler concurrently with Server.Config() readers. Each handler enters
applyConfigChange — clones the snapshot, mutates, persists, atomic-
swaps Handler.cfgPtr, fires the commit hook (which does
Server.UpdateClients fan-out and re-binds executors). The test bypasses
the secret-key middleware (the middleware is not under test; the swap
chain is) and seeds a minimal config.yaml so persist succeeds and the
swap fires.
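
For reference, a minimal sketch of the applyConfigChange chain the test
drives, with illustrative types, a struct-copy clone standing in for the
real yaml round-trip clone, and a plain commit func standing in for the
atomic commit pointer:

```go
package management

import (
    "sync"
    "sync/atomic"

    "github.com/gin-gonic/gin"
)

type Config struct{ Debug bool /* ... */ }

type Handler struct {
    mu     sync.Mutex
    cfgPtr atomic.Pointer[Config]
    commit func(*Config) // wired by the Server / Service layer
}

func (h *Handler) applyConfigChange(c *gin.Context, fn func(cfg *Config)) {
    h.mu.Lock() // held through commit so concurrent writes cannot reorder
    defer h.mu.Unlock()
    clone := *h.cfgPtr.Load() // clone the current snapshot
    fn(&clone)                // mutate the clone only
    if c.Writer.Written() {   // closure already wrote a 404/400: stop here
        return
    }
    // persisting the clone to disk would happen here (omitted in this sketch)
    h.cfgPtr.Store(&clone) // atomic swap: readers see old or new, never a mix
    if h.commit != nil {
        h.commit(&clone) // fan-out (UpdateClients / shared reload callback)
    }
    c.Status(200)
}
```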

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(bench): record Phase C round-6 closure

Mark all 5 round-6 findings (1 BLOCKER, 2 IMPORTANT, 2 NIT) as fully
fixed and verified under -race. The only remaining deferred punch-list
item is the OAuth-flow goroutine multi-cfg snapshot read pattern (low
real-world risk, mechanical fix when convenient). Round 6 session ID
recorded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(executor): extract antigravity payload-builder; add round-1 review tests

Phase D code quality (plan §"Stage 1 — Phase D: targeted code quality"):
extract the schema-sanitization + model-transform branch from
buildRequest into antigravityPayloadBuilder. The original buildRequest
had 5 nested conditionals plus duplicated claude-vs-gemini logic in
both branches; the builder dedups the dialect choice and the final
transform into applyModelTransformString / applyModelTransformBytes
(byte and string variants share applyModelTransform semantics).

No behavior change: the 4 existing TestAntigravityBuildRequest_*
tests pass under -race. New TestAntigravityPayloadBuilder_AllQuadrants
table-driven test covers {sanitized, plain} x {Claude, Gemini} with
byte-equivalence assertions on the wire body and the requestLog
copy. TestAntigravityPayloadBuilder_RequestLogOff_NoLogBytes pins the
no-log-allocation invariant when cfg.RequestLog is false.
TestAntigravityPayloadBuilder_UseAntigravitySchema covers the dialect
dispatch (claude + gemini-3-pro family vs other gemini).

Codex Phase D round-1 review IMPORTANT #1 addressed by the new tests
gating all 4 quadrants of the extracted branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(stage1-exit): address Codex Stage 1 exit review BE findings

Codex Stage 1 exit full-diff review (refactor/upstream-bound vs
upstream/dev) returned 4 IMPORTANT + 1 NIT on the BE side. All 5 are
now closed.

BE-1 (IMPORTANT) — request_logger ownership clone:
  internal/logging/request_logger.go LogRequest now defensively clones
  caller-owned []byte / map[string][]string / *interfaces.ErrorMessage
  slices before enqueueing into the async emitter. The sync→async
  switch in Phase C silently changed the public ownership contract
  (FileRequestLogger is re-exported via sdk/logging); cloning preserves
  the call-time snapshot so a caller reusing buffers after LogRequest
  returns cannot corrupt the eventual log write. Trade-off: ~1 µs
  added per call vs the prior 35 ns optimum (still ~230× faster than
  the pre-Phase-C synchronous baseline). Sync fallback path keeps the
  no-buffer optimum. The new
  TestAsyncEmitter_CallerMutateAfterEnqueue_DoesNotCorruptLog test gates
  the invariant.

BE-2 (IMPORTANT) — signature cache outer-map cap race:
  internal/cache/signature_cache.go getOrCreateGroupCache now
  serialises the cold-miss insert + LRU eviction under groupEvictMu so
  the cap holds exactly under concurrent first-time inserts. New
  groups initialise lastAccess at construction (newGroupCache(cap, now))
  so a fresh group is not the LRU eviction target before its first
  set/get fires. New TestSignatureCache_OuterMapCap_ParallelInserts
  pins the invariant under -race.
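
  Roughly the shape of the fix, as a sketch with illustrative names; this is
  not the real signature_cache.go code:

```go
package cache

import (
	"sync"
	"time"
)

type groupCache struct {
	lastAccess time.Time
	// per-group entries elided
}

type signatureCache struct {
	groupEvictMu sync.Mutex
	groups       map[string]*groupCache
	maxGroups    int
}

// getOrCreateGroup serialises the cold-miss insert and the eviction under
// one lock so the outer-map cap holds exactly under concurrent first-time
// inserts.
func (c *signatureCache) getOrCreateGroup(key string) *groupCache {
	c.groupEvictMu.Lock()
	defer c.groupEvictMu.Unlock()
	if g, ok := c.groups[key]; ok {
		return g
	}
	if len(c.groups) >= c.maxGroups {
		c.evictOldestLocked() // drop the least-recently-used group
	}
	// Initialise lastAccess now so the fresh group is not the next LRU victim.
	g := &groupCache{lastAccess: time.Now()}
	c.groups[key] = g
	return g
}

func (c *signatureCache) evictOldestLocked() {
	var oldestKey string
	var oldest time.Time
	for k, g := range c.groups {
		if oldestKey == "" || g.lastAccess.Before(oldest) {
			oldestKey, oldest = k, g.lastAccess
		}
	}
	delete(c.groups, oldestKey)
}
```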

BE-3 (IMPORTANT) — RefreshAuthState force flag:
  sdk/cliproxy/service.go buildReloadCallback now passes false to
  watcher.RefreshAuthState. The previous force=true fired modify
  events for every unchanged auth on debug/request-log mgmt writes,
  causing spurious applyCoreAuthAddOrUpdate / model registration /
  scheduler refresh / hook fan-out. With force=false, the watcher's
  diffing logic only emits add/modify/delete events when an auth's
  synthesized state actually changed. The file-watcher reload path's
  own forceAuthRefresh flag (computed from cfg.ForceModelPrefix /
  OAuthModelAlias / retry-config diffs) is preserved unchanged.

BE-4 (IMPORTANT) — mgmt test 200 → 404 contract:
  test/amp_management_test.go TestDeleteAmpModelMappings_NonExistent
  was still asserting status 200 for delete-non-existent. Updated to
  assert StatusNotFound, matching the Phase C round 3-4 contract
  change ("Delete-no-match returns 404 explicit not-found, not silent
  no-op 200"). Comment expanded.
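
  The updated assertion amounts to the pattern below; the route path and
  handler are assumptions for this sketch, not the real test harness:

```go
package amp_test

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// newHandler stands in for the real management router.
func newHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/amp/model-mappings/", func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodDelete {
			// Delete-no-match: explicit 404, not a silent no-op 200.
			http.Error(w, "mapping not found", http.StatusNotFound)
			return
		}
		w.WriteHeader(http.StatusMethodNotAllowed)
	})
	return mux
}

func TestDeleteModelMappings_NonExistent(t *testing.T) {
	req := httptest.NewRequest(http.MethodDelete, "/amp/model-mappings/does-not-exist", nil)
	rec := httptest.NewRecorder()
	newHandler().ServeHTTP(rec, req)
	if rec.Code != http.StatusNotFound {
		t.Fatalf("want %d for delete of a non-existent mapping, got %d", http.StatusNotFound, rec.Code)
	}
}
```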

BE-NIT — bench artifact whitespace:
  bench/baseline/backend.txt and bench/phase-c/backend.txt had
  trailing whitespace on three lines and a blank line at EOF, both
  flagged by `git diff --check`. Trimmed.

Validation:
  - go build ./...
  - go vet ./...
  - go test -race ./internal/cache ./internal/logging
    ./internal/api/modules/amp ./sdk/cliproxy ./test (all pass)
  - bench: signature cache Get_Hit_Parallel 269.5 ns/op (vs Phase C
    exit 315.3, -15%); LogRequest ~1 µs/op (regression vs Phase C
    exit 35 ns, accepted for clone-correctness)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(stage1-exit): address Codex Stage 1 exit round 2 BE findings

Codex Stage 1 exit round 2 review returned 2 IMPORTANT on the BE side.
Both are now closed.

BE-R2-1 (IMPORTANT) — deep-clone ErrorMessage internals:
  internal/logging/request_logger.go cloneErrorMessages previously
  shallow-copied *interfaces.ErrorMessage; the inner Error and Addon
  http.Header remained shared with the caller. The async writer later
  calls Error.Error() and reads Addon, so a caller mutating those
  after LogRequest could corrupt or race the eventual log. Snapshot
  now: copy StatusCode by value, freeze the inner Error to its current
  text via errors.New(e.Error.Error()), and deep-clone the Addon
  header map. Extended
  TestAsyncEmitter_CallerMutateAfterEnqueue_DoesNotCorruptLog to also pass
  apiResponseErrors with a mutable
  inner error and assert post-enqueue mutation does not leak to the
  log file.
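
  A minimal sketch of the snapshot described above, using a local stand-in
  for interfaces.ErrorMessage:

```go
package logging

import (
	"errors"
	"net/http"
)

// ErrorMessage is an illustrative stand-in for interfaces.ErrorMessage.
type ErrorMessage struct {
	StatusCode int
	Error      error
	Addon      http.Header
}

// cloneErrorMessage freezes everything the async writer will read later:
// the status code by value, the error as its current text, and a deep copy
// of the Addon header, so later caller mutation cannot reach the log write.
func cloneErrorMessage(src *ErrorMessage) *ErrorMessage {
	if src == nil {
		return nil
	}
	dst := &ErrorMessage{StatusCode: src.StatusCode}
	if src.Error != nil {
		dst.Error = errors.New(src.Error.Error()) // snapshot the text now
	}
	if src.Addon != nil {
		dst.Addon = src.Addon.Clone() // http.Header.Clone deep-copies values
	}
	return dst
}
```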

BE-R2-2 (IMPORTANT) — mgmt-path force predicate parity:
  My round 1 BE-3 fix passed force=false unconditionally to
  watcher.RefreshAuthState in service.go buildReloadCallback. That
  closed the spurious-modify-event path for unrelated mgmt writes
  (PutDebug, PutRequestLog) but dropped the file-watcher path's
  forceAuthRefresh semantic for edits that change ForceModelPrefix /
  OAuthModelAlias / retry-config — fields that affect model
  registration but are NOT part of the synthesized auth state. Mgmt
  edits to those fields update s.cfg + coreManager but no auth modify
  event fired, so the global model registry could stay stale.

  Now buildReloadCallback snapshots oldCfg under cfgMu before the
  swap and computes the same forceAuthRefresh predicate as
  internal/watcher/config_reload.go (forceAuthRefresh = oldCfg !=
  nil && (ForceModelPrefix change || !DeepEqual(OAuthModelAlias) ||
  retryConfigChanged)), then passes that bool to RefreshAuthState.
  The file-watcher reload path's redundant call still observes
  currentAuths==newState afterwards, so its own forceAuthRefresh
  computation either re-applies the same force (no-op if already
  emitted) or skips. PutDebug / PutRequestLog stay force=false.
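
  Spelled out as code, the predicate looks roughly like this; the config
  shape and the retry field are assumptions, and retryConfigChanged is
  reduced to a single field compare for the sketch:

```go
package cliproxy

import "reflect"

// reloadConfig carries only the fields the predicate inspects; the real
// config type has many more, and the retry comparison is richer.
type reloadConfig struct {
	ForceModelPrefix bool
	OAuthModelAlias  map[string]string
	RequestRetry     int
}

// needsForcedAuthRefresh mirrors the file-watcher predicate: force modify
// events only for edits that change model registration inputs without
// changing the synthesized auth state itself.
func needsForcedAuthRefresh(oldCfg, newCfg *reloadConfig) bool {
	if oldCfg == nil {
		return false
	}
	return oldCfg.ForceModelPrefix != newCfg.ForceModelPrefix ||
		!reflect.DeepEqual(oldCfg.OAuthModelAlias, newCfg.OAuthModelAlias) ||
		oldCfg.RequestRetry != newCfg.RequestRetry
}
```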

Validation:
  - go build ./..., go vet ./...
  - go test -race ./internal/cache ./internal/logging ./sdk/cliproxy
    ./test (all pass)
  - Extended async_emitter_test now covers ErrorMessage / Addon
    mutation guard

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(stage1-exit): close round-3 BE IMPORTANT; document Stage 1 NITs

Closes Codex Stage 1 exit round 3 BE IMPORTANT BE-R3-1 and accepts
the round-3 NITs (per user decision: option 2 — fix IMPORTANTs only,
defer NITs to PUNCH_LIST).

BE-R3-1 (IMPORTANT) — buildReloadCallback now serialised end-to-end:
  Added Service.reloadMu sync.Mutex and lock the entire body of the
  closure returned from buildReloadCallback. Previously only the
  cfgMu.RLock window protected oldCfg capture; the rest of the
  callback (server update, s.cfg swap, coreManager updates, executor
  rebind, watcher SetConfig + RefreshAuthState) ran lock-free, so a
  file-watcher callback overlapping a mgmt commit could:
    (a) compute a stale forceAuthRefresh predicate (oldCfg captured
        from a config the OTHER callback already overwrote)
    (b) overwrite s.cfg out-of-order so a newer callback's snapshot
        is followed by an older callback's swap
  reloadMu eliminates both.
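
  A minimal sketch of the serialisation shape, with placeholder types
  (reloadService, cfgSnapshot) standing in for the real Service and config:

```go
package cliproxy

import "sync"

// cfgSnapshot is a placeholder for the real config type.
type cfgSnapshot struct{ /* fields elided */ }

// reloadService carries only the fields this sketch needs.
type reloadService struct {
	reloadMu sync.Mutex
	cfgMu    sync.RWMutex
	cfg      *cfgSnapshot
}

// buildReloadCallback returns a callback whose entire body runs under
// reloadMu, so an overlapping file-watcher reload and management commit
// cannot interleave the oldCfg snapshot, the cfg swap, and the downstream
// updates.
func (s *reloadService) buildReloadCallback(applyDownstream func(oldCfg, newCfg *cfgSnapshot)) func(*cfgSnapshot) {
	return func(newCfg *cfgSnapshot) {
		s.reloadMu.Lock()
		defer s.reloadMu.Unlock()

		s.cfgMu.RLock()
		oldCfg := s.cfg // snapshot used for the forceAuthRefresh predicate
		s.cfgMu.RUnlock()

		s.cfgMu.Lock()
		s.cfg = newCfg // the swap stays inside the same serialised section
		s.cfgMu.Unlock()

		// Server update, coreManager updates, executor rebind, and watcher
		// SetConfig + RefreshAuthState are represented by this call and
		// also stay inside reloadMu.
		applyDownstream(oldCfg, newCfg)
	}
}
```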

PUNCH_LIST documentation:
  - Records that 6 Phase C rounds + 3 Stage 1 exit rounds all reached
    "no BLOCKER, no IMPORTANT" state on the closed surface.
  - Documents the 4 round-3 NITs as deferred long-tail work with
    fix shapes and effort estimates:
    * BE-R3-2: cloneErrorMessages Addon clone needs unit-level test
    * FE-R3-2: virtualised path loses inter-card gap
    * FE-R3-3: isAbortError should move to src/utils/errors.ts
    * FE-R3-4: virtualisation threshold of 50 is unverified

Validation:
  - go build ./..., go vet ./...
  - go test -race ./internal/cache ./internal/logging ./sdk/cliproxy
    ./test (all pass)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: remove BE bench artifacts and baseline-backend.sh

Phase A bench scaffolding has served its gating purpose for Phase C
hot-spot work. The committed bench/baseline/backend.txt and
bench/phase-c/* artifacts no longer drive any active workflow:

  - bench/baseline/backend{.txt,-meta.json}: pre-Phase-C baseline
    for the Phase A→B→C gating compare. Phase C exit benches were
    already documented in bench/phase-c/{SUMMARY,PUNCH_LIST}.md.
  - bench/phase-c/{SUMMARY.md, PUNCH_LIST.md, backend.txt}: Phase C
    exit summary + the rolling Codex review punch-list. The
    PUNCH_LIST itself is now historical: 6 Phase C rounds + 3 Stage 1
    exit rounds all reached "no BLOCKER, no IMPORTANT" closure on
    the surface they covered, and the deferred items captured there
    can live in commit history / the next round's review notes.
  - scripts/baseline-backend.sh: bench harness that wrote into
    bench/baseline/backend.txt. Without the bench tree there is
    nothing for it to write.

Removing these does not affect runtime or test behavior — no Go
code references the bench dir, and `go build ./...` stays clean.
The FE side keeps bench/baseline/frontend.json + scripts/baseline-
{frontend.sh,update.cjs}: those are actively used via
`npm run baseline:{capture,update}` after the round-3 freshness-
check fix and continue to gate FE bundle/perf regression checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* stage2: apply fork-only backend integration

- extract PromptRule into its own file to keep the fork boundary small
- wire the fork release URL provider through the Stage 1 seam
- adapt prompt-rules writes to the atomic config persistence flow
- add fork-boundary governance docs and a local guard hook

* fix(management): return live config snapshot and stabilize auth manager access

- return the current config snapshot from GetConfig instead of a zero-value allocation
- snapshot the auth manager once per multi-step auth-file operation so a hot-reload swap cannot split lookup and update across different manager instances
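
A minimal sketch of the snapshot-once pattern in the second bullet; the
authManager type, its method set, and updateAuthFile are illustrative
assumptions, not the real management handler code:

```go
package management

import (
	"errors"
	"sync/atomic"
)

// authManager is an illustrative stand-in for the real auth manager.
type authManager struct{ files map[string]string }

func (m *authManager) lookup(id string) (string, bool) {
	v, ok := m.files[id]
	return v, ok
}

func (m *authManager) update(id, value string) error {
	m.files[id] = value
	return nil
}

var errAuthNotFound = errors.New("auth file not found")

// handler holds the hot-swappable manager pointer; a config hot-reload
// replaces the pointer with a fresh manager instance.
type handler struct {
	authManager atomic.Pointer[authManager]
}

// updateAuthFile loads the pointer exactly once, so a hot-reload swap in
// the middle of the request cannot split the lookup and the update across
// two different manager instances.
func (h *handler) updateAuthFile(id, value string) error {
	mgr := h.authManager.Load() // single snapshot for the whole operation
	if _, ok := mgr.lookup(id); !ok {
		return errAuthNotFound
	}
	return mgr.update(id, value)
}
```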

* docs: centralize assistant guidance

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>