feat: background git fetch for connected repos (auto-refresh v1)#213
Conversation
Merge sequence — DO NOT MERGE BEFORE #212Blocked on #212 ( WhyThis PR's
The orphan-GC logic in Conflict status
Plan after #212 merges
|
|
Briefly scanned Currently Also still waiting on the rebase for #211 — feel free to do them in either order or both at once. git fetch origin
git rebase origin/main
git push --force-with-lease |
Adds per-project "Auto-fetch on new chat" toggle, a manual refresh
button on each project card, and a global "Fetch all on codbash start"
toggle. Runs `git fetch --all --prune` in the background — never touches
the working tree, never blocks the HTTP server.
Goal: when a session starts, the LLM sees current origin/<branch> so
new branches start from a fresh base and continuations don't pile on
top of stale state.
Triggers:
- Manual: click the refresh button on a project card.
- New chat: when the per-project toggle is on, /trigger + /wait
(timeoutMs: 2000) fire before /api/launch. Session opens after fetch
finishes, or after 2s with stale refs (graceful degradation).
- Service start: bin/cli.js wires the known-roots gate then calls
repoRefreshManager.initOnStartup() on process.nextTick.
Backend:
- src/repo-refresh.js — singleton via createRepoRefreshManager(opts);
state machine idle <-> fetching <-> error, single-flight per gitRoot,
semaphore max 4, 60s timeout with SIGTERM -> 2s grace -> SIGKILL.
- src/repo-refresh-routes.js — 4 routes under /api/repo-refresh/*
(state, trigger, wait, settings). Body cap 1 MiB, /wait timeoutMs
clamped to 10s, asyncHandler sends 500 on uncaught throws.
- src/atomic.js — atomicWriteJson(path, obj, {mode}) — tmp + fsync +
rename with cleanup on rename failure. Settings written 0o600;
existing disk caches in data.js retrofitted (closes deferred MEDIUM
from PR #212 about non-atomic writes).
- Persistence: ~/.codedash/refresh-settings.json. Corrupt file ->
defaults + warning, file left untouched. Debounced 500ms saves.
- Known-roots gate: getKnownGitRoots() = loadProjects() U session
git_roots, 5s TTL cache. Wired into manager via
setKnownGitRootsProvider(); initOnStartup refuses to fetch paths
not in the known set. Defends against settings-file injection.
- Credential redaction: https://user:token@host stripped from any
captured stderr before storing in lastError (which is exposed via
/state to the browser).
Frontend:
- Per-project card: status badge (Fetching / Updated 2 min ago /
Refresh failed: <msg>), refresh button (28x28 visual, 44x44 hit
area via ::after), Auto-fetch toggle. Header carries global
"Fetch all on codbash start" toggle.
- Polling 2s only while a visible repo is fetching; time-ticker 30s
re-renders relative timestamps. Both skip the swap if focus is
inside the slot (no focus theft).
- Optimistic toggle with rollback + 3s-deduplicated toasts. Input
disabled during inflight POST (closes the toggle race risk).
- new-chat hook shows a spinner on the launch button + aria-busy.
Re-entrancy guard via dataset.rrLaunchInflight.
- A11y: native checkbox (no role="switch"), role="group" wrapper,
role="status" + aria-live="polite" on the outer badge slot so
innerHTML swaps don't tear down the live region, aria-describedby
-> visually-hidden span with full error text for SR/keyboard.
prefers-reduced-motion disables the spinner animation.
Tests: 34 new tests across atomic.js, repo-refresh.js, the routes,
and explicit cases for credential redaction, known-roots gating,
and provider injection.
Deferred (documented LOW from review pass 2, separate follow-ups):
- Retrofit pre-existing readBody to enforce body size cap across
16 callers (out of scope for this feature; chore PR).
- Periodic scheduler (5/10/15/30/60 min intervals).
- Page-refresh bulk trigger from the browser.
- "Behind by N commits" indicator.
- Connection-lost banner when poll fails.
- toast queue/stack (current dedupe is sufficient for v1).
af9cb0c to
24d6fea
Compare
vakovalskii
left a comment
There was a problem hiding this comment.
LGTM ✅ Re-reviewed after rebase.
Verified locally:
node -cclean on all 6 touched/new backend files (repo-refresh.js,repo-refresh-routes.js,atomic.js,server.js,data.js,bin/cli.js)- Spot-checked the security posture against yesterday's pre-rebase reading: dependency injection for testability, credential redaction in
lastError(/(https?:\/\/)[^/\s:@]+:[^/\s@]+@/), known-roots gate oninitOnStartup, SIGTERM→2s grace→SIGKILL, atomic settings persistence — all intact
One non-blocking follow-up to flag (not a merge blocker):
When I run node --test test/repo-refresh.test.js locally, tests 1–11 pass but 12–16 are reported cancelled (not failed). Each cancelled test passes in isolation via --test-name-pattern. Looks like one of the earlier subtests leaks a setTimeout or unresolved promise that survives into the next test's window. The two most likely candidates are initOnStartup (test 11) and the debouncedSave timer in updateSettings (test 7) — both schedule timers that may outlive their assertions. Worth a quick pass with afterEach cleanup or a manager-level shutdown() method that cancels pending timers.
A more important observation that surfaced during this dig: the CI workflow .github/workflows/ci.yml only runs node bin/cli.js version/help and a node -c syntax check on 5 files — it does not invoke node --test. So "CI green 6/6" on this PR means no test was actually run. Worth a short follow-up PR to add - name: Tests\n run: node --test test/*.test.js to the workflow so future regressions are caught.
Both of these are housekeeping and don't gate this merge — the production code itself is solid. Thanks @NovakPAai!
* fix(repo-refresh): cancel manager-owned timers in test teardown + run tests in CI Addresses two non-blocking follow-ups from #213 review. Test isolation: createRepoRefreshManager() now tracks every setTimeout it schedules (debounced save, runFetch 60s timeout/SIGKILL grace, waiter timeouts) in a Set and exposes shutdown() to cancel them all. Earlier reliance on timer.unref() was enough to let the process exit, but node:test's resource tracker still observed the pending handle and intermittently reported tests 12-16 as cancelled (per maintainer's report). Tests now register shutdown() via t.after() through a makeMgr(t, opts) helper — no more leak. CI: workflows/ci.yml previously ran only `node bin/cli.js version|help` and `node -c` syntax checks on five files, so "CI green" meant no test ever ran. Adds `node --test test/*.test.js` so future regressions are caught. wsl-windows.test.js skipped on non-win32: the assertion under test relies on backslash path joining which only happens when path defaults to win32. The companion "returns empty on non-Windows" test still validates the cross-platform branch. * fix(repo-refresh): resolve pending fetch promises on shutdown Following up on the CI failure for the first attempt at this fix. macOS runners on node:test reported tests 12–16 as cancelledByParent with the error "Promise resolution is still pending but the event loop has already resolved", even though all timers were now cleared. Root cause: test 11 ("initOnStartup garbage-collects orphan perProject entries") triggers a fetch for /repos/alive via initOnStartup but never resolves the mocked execFile call. shutdown() cancelled the 60s timeoutTimer but left the inflight Promise hanging — stricter node:test hosts (macOS CI) flag this at test teardown, while my local machine tolerated it. Track each runFetch finalize callback in `activeFinalizers` and resolve every outstanding fetch with a synthetic "cancelled (manager shutdown)" error state when shutdown() runs. Tests now pass cleanly on the same runners that failed before. * fix(repo-refresh): drop .unref() on awaited timers + thread shutdown into api tests CI second run still failed tests 12–16 with "Promise resolution is still pending but the event loop has already resolved". Root cause is more fundamental than a missing cleanup hook: The four manager timers (runFetch timeoutTimer/killTimer, waiter `to`, debounced saveTimer) were all .unref()'d. unref'd timers are not counted as keeping the event loop alive — so when a test does `await mgr.waitForRefreshOrTimeout(..., 30)`, the only handle keeping the loop alive is the unref'd 30ms timer. node:test sees the loop drain to idle while the test's returned Promise is still pending and flags it. Now that shutdown() exists, .unref() is redundant for graceful host exit. Drop all four .unref() calls so the timers properly keep the event loop alive while in flight; hosts call shutdown() to release them. Also thread t.after(shutdown) through repo-refresh-api.test.js's makeDeps(t, ...) helper — that file ran before repo-refresh.test.js in the CI test order and was leaving timers across the file boundary. --------- Co-authored-by: jackrescuer-gif <jackrescuer@gmail.com>
Summary
Adds a background `git fetch --all --prune` worker for connected repositories so a new chat session starts on fresh `origin/` refs. Three triggers (manual button, new-chat click, service start), per-repo single-flight + semaphore max 4, 60s timeout with SIGTERM→SIGKILL grace, atomic settings persistence.
The working tree is never touched — the user decides when to merge.
Artefacts
What ships
Backend (`src/repo-refresh.js`, `src/repo-refresh-routes.js`, `src/atomic.js`)
Frontend (`src/frontend/app.js`, `src/frontend/styles.css`)
Tests
34 new tests:
All 34 pass. Pre-existing `test/wsl-windows.test.js` failure on macOS is unrelated (#212).
Reviews
Two parallel review passes (code-reviewer, security-reviewer, UX/UI-reviewer with `vercel-web-design-guidelines` + `ux-designer` skills loaded). All CRITICAL/HIGH/MEDIUM findings resolved in-PR. Selected LOWs deferred with documented reason (see commit body).
Deferred to follow-ups
Test plan