Skip to content

fix: explicitly exit process on successful command completion#8032

Draft
tobq wants to merge 3 commits into
netlify:mainfrom
tobq:fix/process-exit-on-success
Draft

fix: explicitly exit process on successful command completion#8032
tobq wants to merge 3 commits into
netlify:mainfrom
tobq:fix/process-exit-on-success

Conversation

@tobq

@tobq tobq commented Mar 13, 2026

Copy link
Copy Markdown

Summary

Fixes #8031

On success, onEnd() relied on the Node.js event loop draining naturally to exit the process. This fails when stdio pipes are closed by the parent process (e.g. CI runners, editor integrations, or tools that spawn the CLI with piped stdio and then close the connection).

On Windows specifically, the orphaned process never exits and continuously leaks OS handles, eventually accumulating 400,000+ handles per process. This exhausts the Windows kernel nonpaged pool (~3.4 GB vs normal ~500 MB), causing full system freezes including mouse/keyboard input.

Root cause

onEnd() calls exit(1) on error but does nothing on success — it just returns, expecting the event loop to drain. However:

  1. Node.js 19+ default HTTP agents use keepAlive: true, keeping sockets in the pool
  2. When parent processes close stdio pipes (normal behavior for CI, editors, etc.), the CLI process becomes orphaned on Windows
  3. The orphaned process can't exit because it has active handles (keepAlive sockets, broken pipe references)
  4. Handles accumulate indefinitely until the system runs out of kernel resources

Changes

  • src/commands/base-command.ts: Add exit() call at the end of onEnd() for the success path (the error path already had exit(1))
  • bin/run.js: await the onEnd() call in both success and error paths so telemetry tracking completes before exit

Reproduction

The issue can be reproduced by spawning the CLI with piped stdio, then closing the pipes:

const child = spawn('node', ['bin/run.js', 'api', 'listSiteDeploys', '--data', '...'], { stdio: 'pipe' });
child.stdout.on('data', () => {});
// After API response, close pipes (simulating parent abandonment)
setTimeout(() => {
  child.stdout.destroy();
  child.stderr.destroy();
  child.stdin.destroy();
  child.unref();
  // Without fix: process stays alive indefinitely, leaking handles
  // With fix: process exits cleanly via process.exit(0)
}, 1500);

Before fix: Process stays alive indefinitely after pipe closure, accumulating handles
After fix: Process exits cleanly via process.exit(0) before pipes are even abandoned

On success, onEnd() relied on the Node.js event loop draining naturally
to exit the process. This fails when stdio pipes are closed by the parent
(e.g. CI runners, editor integrations, or tools that spawn the CLI with
piped output and then close the connection). On Windows in particular,
the orphaned process never exits and continuously leaks OS handles,
eventually exhausting the kernel nonpaged pool and freezing the system.

The fix:
- Add exit() call in onEnd() for the success path (error path already
  had exit(1))
- Await onEnd() in bin/run.js so telemetry completes before exit

Fixes netlify#8031
@tobq tobq requested a review from a team as a code owner March 13, 2026 02:45
@coderabbitai

coderabbitai Bot commented Mar 13, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6422bad2-6621-4bcc-a856-879fd86dacce

📥 Commits

Reviewing files that changed from the base of the PR and between d87ca13 and f5bd1bb.

📒 Files selected for processing (2)
  • bin/run.js
  • src/commands/base-command.ts
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • netlify/blueprints (manual)
🚧 Files skipped from review as they are similar to previous changes (2)
  • bin/run.js
  • src/commands/base-command.ts

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes
    • Improved shutdown reliability by awaiting end-of-run async work in both success and error scenarios, so cleanup/telemetry tasks complete before the process finishes.
    • Ensures consistent termination after final telemetry handling by explicitly exiting after telemetry completes, reducing hangs and inconsistent exit behavior.

Walkthrough

The PR changes process shutdown to wait for onEnd handlers and to always terminate. In bin/run.js, two calls to program.onEnd() are awaited (await program.onEnd() and await program.onEnd(error)) so control waits for their resolution. In src/commands/base-command.ts, BaseCommand.onEnd() now calls exit() unconditionally after telemetry handling, ensuring the process exits on both success and error paths.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: explicitly exiting after successful command completion.
Description check ✅ Passed The description matches the changeset and explains the exit-on-success fix and awaited cleanup work.
Linked Issues check ✅ Passed The changes satisfy #8031 by exiting on success and awaiting onEnd so telemetry finishes before termination.
Out of Scope Changes check ✅ Passed No unrelated or out-of-scope code changes are evident in the described diff.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install failed. For unrecoverable errors, disable the tool in CodeRabbit configuration.

Warning

Review ran into problems

🔥 Problems

Linked repositories: Couldn't analyze netlify/blueprints - clone failed: Clone operation failed: Cloning into '/home/jailuser/git'...
warning: templates not found in /usr/share/git-core/templates
fatal: unable to access 'https://github.com/netlify/blueprints.git/': Failed to connect to github.com port 443 after 135129 ms: Couldn't connect to server


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/commands/base-command.ts (1)

436-441: ⚠️ Potential issue | 🟡 Minor

Use else branch to prevent exit code fallthrough when process.exit is mocked.

The onEnd method at line 414 can fall through to the unconditional exit() at line 441 when process.exit is mocked to return (as in tests). When an error occurs and exit(1) is called at line 438, a mocked exit returns control flow, allowing line 441 to execute exit() with code 0, masking the error status.

🔧 Proposed fix
     if (error_ !== undefined) {
       logError(error_ instanceof Error ? error_ : format(error_))
       exit(1)
     }
-
-    exit()
+    else {
+      exit()
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/base-command.ts` around lines 436 - 441, The onEnd method
currently calls exit(1) when error_ is set but then unconditionally calls exit()
afterwards, which can mask failure if process.exit is mocked; update the control
flow in onEnd (referencing error_, logError(...) and exit(...)) so that the
unconditional exit() is only called in the else branch when error_ is
undefined—i.e., after logging the error and calling exit(1) do not fall through
to the success exit path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/commands/base-command.ts`:
- Around line 436-441: The onEnd method currently calls exit(1) when error_ is
set but then unconditionally calls exit() afterwards, which can mask failure if
process.exit is mocked; update the control flow in onEnd (referencing error_,
logError(...) and exit(...)) so that the unconditional exit() is only called in
the else branch when error_ is undefined—i.e., after logging the error and
calling exit(1) do not fall through to the success exit path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4af0590c-9fcd-4fc7-838c-a5728cf05d00

📥 Commits

Reviewing files that changed from the base of the PR and between 8dae2c5 and bcb9cf9.

📒 Files selected for processing (2)
  • bin/run.js
  • src/commands/base-command.ts

@aitchiss

Copy link
Copy Markdown

@tobq sorry about the delay in response here - it looks like you have some failing integration tests - I'm going to pop this in draft for now - but please re-open the PR when you've had a chance to take a look & fix 🙏

@aitchiss aitchiss marked this pull request as draft June 26, 2026 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

netlify api listSiteDeploys leaks hundreds of thousands of OS handles on Windows, eventually freezing the system

3 participants