Conversation
- Integrated Rolldown for hot module replacement (HMR) in `arkor dev`, allowing real-time updates to the training interface without page refresh. - Implemented `requestEarlyStop` and `replaceCallbacks` methods in the Trainer API to facilitate graceful stopping of training jobs and dynamic callback updates during execution. - Updated documentation to reflect new features and usage patterns for improved developer guidance. - Adjusted cleanup logic for better resource management during development sessions.
…l shutdown - Introduced `replaceCallbacks` method in the Trainer API to allow dynamic updates of lifecycle callbacks during training runs. - Enhanced signal handling for graceful early stopping, ensuring in-flight checkpoints are preserved during HMR rebuilds. - Added `registerCleanupHook` for streamlined resource management on process exit, improving cleanup logic across development commands. - Updated documentation to reflect new features and usage patterns for better developer guidance.
…internal callback swapping - Removed the public `replaceCallbacks` method from the Trainer interface to prevent exposure of the hot-swapping functionality. - Introduced an internal mechanism for callback swapping using a `Symbol.for`-keyed brand, allowing for dynamic updates during training runs without affecting the public API. - Updated signal handling to ensure seamless integration with the new callback swapping logic, enhancing the hot module replacement (HMR) experience. - Revised documentation to reflect changes in the Trainer API and clarify the internal callback management process.
…lement internal early-stop handling - Removed the public `requestEarlyStop` method from the Trainer interface to prevent exposure of the early-stop functionality. - Introduced an internal mechanism for early stopping using a `Symbol.for`-keyed brand, allowing for graceful stopping after the next checkpoint without affecting the public API. - Updated signal handling to ensure seamless integration with the new early-stop logic, enhancing the hot module replacement (HMR) experience. - Revised documentation to reflect changes in the Trainer API and clarify the internal early-stop management process.
…llback replacement - Updated the Trainer API to remove public exposure of `requestEarlyStop` and `replaceCallbacks` methods, enhancing encapsulation. - Implemented internal mechanisms for early stopping and callback swapping using `Symbol.for`-keyed brands, ensuring seamless integration during training runs. - Revised signal handling to improve the hot module replacement (HMR) experience and maintain clean resource management. - Updated documentation to reflect these changes and clarify the internal management processes for developers.
- Replaced esbuild with Rolldown for building the project, ensuring external dependencies are resolved correctly. - Implemented a cleanup hook system to manage resource disposal on process termination signals. - Enhanced the development experience with hot module replacement (HMR) capabilities, allowing for seamless updates during training runs. - Updated documentation to reflect changes in the development loop and HMR behavior. - Added tests for new signal handling and configuration hashing functionality.
|
Preview deployment for your docs. Learn more about Mintlify Previews.
💡 Tip: Enable Workflows to automatically generate PRs for you. |
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
| import { ensureProjectState } from "../core/projectState"; | ||
| import { readState } from "../core/state"; | ||
| import { readManifestSummary } from "./manifest"; | ||
| import { readManifestSummary, summariseBuiltManifest } from "./manifest"; |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8abc594369
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const hotSwapTargets = activeTrains.notifyCallbackReload(nextHash); | ||
| const restartTargets = | ||
| activeTrains.requestEarlyStopOnMismatch(nextHash); |
There was a problem hiding this comment.
Dispatch train signals only once per rebuild
notifyCallbackReload/requestEarlyStopOnMismatch run inside each /api/dev/events subscriber callback, so one rebuild sends signals once per connected SSE client (and again on reconnect when createHmrCoordinator.subscribe replays lastEvent). In the common case of two open Studio tabs, the same child can receive two SIGTERMs for one config-changing rebuild; the second signal triggers the runner’s forced-exit path (exit(143)) instead of checkpoint-preserving early stop. Move signal dispatch out of per-subscriber delivery (or dedupe per rebuild hash) so each child is signaled at most once per rebuild event.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR upgrades the local Studio dev experience by migrating the trainer bundling pipeline from esbuild to Rolldown, adding an HMR/SSE channel to propagate rebuild events to the Studio SPA, and introducing signal-based “hot-swap callbacks vs early-stop + restart” behavior for in-flight training subprocesses.
Changes:
- Replace esbuild-based trainer bundling with Rolldown (one-shot build + watch/HMR).
- Add
/api/dev/eventsSSE stream + SPA client wiring to refresh manifest and coordinate restarts/hot-swaps. - Introduce internal trainer inspection/callback replacement/early-stop hooks (via
Symbol.forbrands), plus cleanup utilities and updated docs/tests.
Reviewed changes
Copilot reviewed 27 out of 28 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| pnpm-lock.yaml | Removes esbuild and adds rolldown to the lockfile graph. |
| packages/arkor/package.json | Swaps dependency from esbuild to rolldown. |
| packages/arkor/src/core/rolldownConfig.ts | Centralizes Rolldown input options + entry/outDir resolution and Node target derivation. |
| packages/arkor/src/cli/commands/build.ts | Implements arkor build via Rolldown bundling instead of esbuild. |
| packages/arkor/src/studio/hmr.ts | Adds lazy Rolldown watcher coordinator that emits ready/rebuild/error events. |
| packages/arkor/src/studio/hmr.test.ts | Tests HMR coordinator behavior (ready/rebuild/error/replay/dispose). |
| packages/arkor/src/studio/manifest.ts | Adds configHash to manifest summary and splits “summarise built manifest” vs “build + summarise”. |
| packages/arkor/src/studio/trainRegistry.ts | Adds per-child registry + policy for SIGUSR2 hot-swap vs SIGTERM early-stop. |
| packages/arkor/src/studio/trainRegistry.test.ts | Tests TrainRegistry signaling decisions and error tolerance. |
| packages/arkor/src/studio/server.ts | Adds /api/dev/events SSE route and integrates TrainRegistry into rebuild handling. |
| packages/arkor/src/studio/server.test.ts | Adds tests for /api/dev/events token rules, loopback guard, and SSE framing. |
| packages/arkor/src/core/configHash.ts | Adds stable hashing of JobConfig for “callbacks-only vs restart” decisions. |
| packages/arkor/src/core/configHash.test.ts | Tests determinism and sensitivity of hashJobConfig. |
| packages/arkor/src/core/trainerInspection.ts | Introduces internal inspection + callback replacement + early-stop brands via Symbol.for. |
| packages/arkor/src/core/trainer.ts | Implements callback hot-swap support + early-stop latch; attaches internal brands. |
| packages/arkor/src/core/trainer.test.ts | Adds coverage for early-stop and callback hot-swap behavior via internal brands. |
| packages/arkor/src/core/runnerSignals.ts | Adds signal handlers: SIGTERM graceful early-stop + SIGUSR2 callback reload. |
| packages/arkor/src/core/runnerSignals.test.ts | Tests SIGTERM/SIGUSR2 handler behaviors with branded trainers. |
| packages/arkor/src/core/runner.ts | Installs/removes new signal handlers around trainer.start()/trainer.wait(). |
| packages/arkor/src/core/runner.test.ts | Adds a test for SIGTERM early-stop behavior in the runner. |
| packages/arkor/src/cli/cleanupHooks.ts | Adds reusable cleanup-hook registration for exit/signals. |
| packages/arkor/src/cli/commands/dev.ts | Wires HMR coordinator into Studio server and refactors shutdown cleanup via cleanup hooks. |
| packages/arkor/src/cli/commands/start.test.ts | Updates test comment to refer to Rolldown instead of esbuild. |
| packages/studio-app/src/lib/api.ts | Adds typed DevEvent + openDevEvents() for SSE notifications. |
| packages/studio-app/src/components/RunTraining.tsx | SPA listens to dev SSE events, refreshes manifest, and coordinates restart/hot-swap UI state. |
| docs/concepts/studio.mdx | Documents HMR workflow, including hash-based hot-swap vs restart behavior (EN). |
| docs/ja/concepts/studio.mdx | Documents HMR workflow (JA). |
| AGENTS.md | Updates repository policy/docs for new SSE allow-list, HMR internals, and Rolldown build pipeline. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const [log, setLog] = useState(""); | ||
| const [manifest, setManifest] = useState<ManifestResult | null>(null); | ||
| const [hmrStatus, setHmrStatus] = useState< | ||
| "idle" | "rebuilding" | "early-stopping" | "restarting" | "hot-swapped" |
| import { ensureProjectState } from "../core/projectState"; | ||
| import { readState } from "../core/state"; | ||
| import { readManifestSummary } from "./manifest"; | ||
| import { readManifestSummary, summariseBuiltManifest } from "./manifest"; |
| function scheduleHmrCleanup(hmr: { dispose: () => Promise<void> }): void { | ||
| // Registered before the studio-token cleanup so it runs first on | ||
| // shutdown — Node fires signal handlers in registration order, and we | ||
| // want the watcher to release file handles before the outermost | ||
| // process.exit. | ||
| registerCleanupHook({ cleanup: () => hmr.dispose() }); | ||
| } |
| requestEarlyStopOnMismatch( | ||
| nextConfigHash: string | null, | ||
| ): RestartTarget[] { | ||
| const targets: RestartTarget[] = []; | ||
| for (const [pid, entry] of this.entries) { | ||
| if ( | ||
| nextConfigHash === null || | ||
| entry.configHash === null || | ||
| entry.configHash !== nextConfigHash | ||
| ) { | ||
| try { | ||
| entry.child.kill("SIGTERM"); | ||
| } catch { | ||
| // child already exited; close handler will clean up. | ||
| } | ||
| targets.push({ pid, trainFile: entry.trainFile }); | ||
| } |
| await callbacks.onCheckpoint?.(ctx); | ||
| // Early-stop latch: a checkpoint just landed, so the in-flight work | ||
| // is durable. Cancel the cloud job and end `wait()` cleanly. | ||
| if (earlyStopRequested && earlyStopDeferred) { | ||
| await trainer.cancel(); | ||
| if (earlyStopDeferred.timer) clearTimeout(earlyStopDeferred.timer); | ||
| earlyStopDeferred.resolve(); | ||
| earlyStopDeferred = null; | ||
| return { terminal: true, artifacts: terminalResult?.artifacts ?? [] }; | ||
| } |
| function startWatcher(): void { | ||
| if (watcher || disposed) return; | ||
| if (!existsSync(resolved.entry)) { | ||
| broadcast({ | ||
| type: "error", | ||
| message: `Build entry not found: ${resolved.entry}. Create ${BUILD_DEFAULTS.entry} or pass an explicit entry argument.`, | ||
| }); | ||
| return; | ||
| } |
This pull request migrates the Arkor build and development workflow from esbuild to Rolldown, introduces a robust hot module replacement (HMR) system for the Studio dev server, and refines the cleanup and shutdown logic for long-lived resources. Documentation has been updated to explain the new HMR behavior in both English and Japanese, and dependency management has been adjusted accordingly. These changes streamline the developer experience, enable live code updates during training runs, and improve resource handling during shutdown.
Build system migration and HMR integration:
arkor build, ensuring bundles target the current Node version and keeping external dependencies unbundled. (packages/arkor/package.json,packages/arkor/src/cli/commands/build.ts,packages/arkor/src/core/rolldownConfig) [1] [2] [3]src/arkor/in dev mode, enabling HMR for trainer code. The watcher notifies the SPA via SSE (/api/dev/events), and the dev server can hot-swap callbacks or gracefully early-stop training runs based on config changes. (packages/arkor/src/cli/commands/dev.ts,packages/arkor/src/studio/hmr.ts,packages/arkor/src/studio/trainRegistry.ts) [1] [2] [3] [4] [5] [6]Cleanup and shutdown improvements:
registerCleanupHookutility to handle resource cleanup on process exit and signals, ensuring proper teardown order (e.g., HMR watcher before token file removal). (packages/arkor/src/cli/cleanupHooks.ts,packages/arkor/src/cli/commands/dev.ts) [1] [2]Documentation updates:
docs/concepts/studio.mdx,docs/ja/concepts/studio.mdx,AGENTS.md) [1] [2] [3]AGENTS.md)Dependency and test adjustments:
packages/arkor/package.json,packages/arkor/src/cli/commands/start.test.ts) [1] [2]Security and internal API notes:
Symbol.forbrands, not exposed on the public SDK surface. (AGENTS.md) [1] [2]