Skip to content

fix: reuse existing ssh agent instead of starting a new one#55

Merged
MrSquaare merged 13 commits into
MrSquaare:mainfrom
andiradulescu:fix/reuse-existing-ssh-agent
Mar 7, 2026
Merged

fix: reuse existing ssh agent instead of starting a new one#55
MrSquaare merged 13 commits into
MrSquaare:mainfrom
andiradulescu:fix/reuse-existing-ssh-agent

Conversation

@andiradulescu
Copy link
Copy Markdown
Contributor

@andiradulescu andiradulescu commented Feb 24, 2026

Pull Request

Related issue(s)

Resolves #54

Description

Reuse an existing SSH agent when SSH_AUTH_SOCK and SSH_AGENT_PID are already set and the agent is responsive, instead of unconditionally starting a new one.

Summary by CodeRabbit

  • Bug Fixes

    • Reuses an already-running SSH agent to avoid redundant startups.
    • Skips stopping SSH agents the tool didn't start and emits clearer warnings on shutdown failures.
  • Tests

    • Adds a comprehensive SSH test suite and richer test helpers to validate cloning, multiple keys, stale-agent reuse, and failure cases.
  • Chores

    • Version bumped and minor repo housekeeping (ignore entries, CI workflow adjustments).

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 24, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f68cf746-2eec-433c-80d0-7f3ad6f91096

📥 Commits

Reviewing files that changed from the base of the PR and between a648e93 and 9744fda.

📒 Files selected for processing (16)
  • .github/workflows/pull-request.yml
  • .github/workflows/tests.yml
  • .gitignore
  • CONTRIBUTING.md
  • README.md
  • lib/cleanup.js
  • package.json
  • test/Dockerfile.git-server
  • test/git-server.sh
  • test/helpers.sh
  • test/missing-private-key-test.sh
  • test/multiple-reuse-stale-test.sh
  • test/multiple-test.sh
  • test/no-action-test.sh
  • test/single-test.sh
  • test/test.sh

Walkthrough

Adds reuse-check for an existing SSH agent: when SSH_AUTH_SOCK/SSH_AGENT_PID exist, runs ssh-add -l and, if agent is responsive, saves started-agent="false" and returns existing socket/PID; otherwise starts a new agent and saves started-agent="true". Cleanup skips stopping when state is "false".

Changes

Cohort / File(s) Summary
SSH agent core
src/index.ts, lib/index.js
Add early-return: when SSH_AUTH_SOCK and SSH_AGENT_PID exist, run ssh-add -l; if exit code ≠ 2 treat agent as alive, save started-agent="false" and reuse existing agent; if exit code == 2 or env missing, start a new ssh-agent and save started-agent="true". saveState(name,value) helper added and exported in lib/index.js.
SSH agent cleanup
src/cleanup.ts
Read started-agent from Actions state and skip stopping agent when "false". Preserve try/catch around stop logic; on non-Error caught values still emit a warning and log errors appropriately.
Tests & test harness
test/*, test/helpers.sh, test/*.sh, test/Dockerfile.git-server, test/git-server.sh
Add a comprehensive test harness: multiple new test scripts (single/multiple/no-action/missing-private-key/multiple-reuse-stale), revamped test/helpers.sh with many assertion/util helpers, Dockerfile and git-server runner script to host SSH git server, and removal of old test/test.sh.
CI workflows
.github/workflows/pull-request.yml, .github/workflows/tests.yml
Remove the previous inline Test job from pull-request CI; add a new dedicated Tests workflow with multiple jobs validating SSH scenarios (single/multiple/reuse/stale/failure cases) that exercise the new test scripts.
Package & metadata
package.json, manifest_file
Bump package version (3.1.2 → 3.1.3) and remove test script from package.json; minor manifest edits.
Docs & config
CONTRIBUTING.md, README.md, .coderabbit.yaml, .gitignore
Update contributing steps to use pnpm check, switch one example secret in README, add .coderabbit.yaml, and add .pnpm-store/ and .DS_Store to .gitignore.

Sequence Diagram(s)

sequenceDiagram
  participant Action as Action (initializeSSHAgent)
  participant Env as Environment (SSH_AUTH_SOCK / SSH_AGENT_PID)
  participant SSHAdd as ssh-add
  participant Agent as ssh-agent
  rect rgba(200,200,255,0.5)
  Action->>Env: check SSH_AUTH_SOCK & SSH_AGENT_PID
  end
  alt env vars present
    Action->>SSHAdd: run `ssh-add -l`
    SSHAdd-->>Action: exit code
    alt exit code != 2 (agent alive)
      Action->>Action: saveState("started-agent","false")
      Action-->>Agent: reuse existing agent (no start)
    else exit code == 2 (not alive)
      Action->>Agent: start ssh-agent
      Agent-->>Action: return PID & socket
      Action->>Action: saveState("started-agent","true")
    end
  else env vars missing
    Action->>Agent: start ssh-agent
    Agent-->>Action: return PID & socket
    Action->>Action: saveState("started-agent","true")
  end
Loading
sequenceDiagram
  participant Cleanup as Action (cleanup)
  participant State as GitHub Actions State
  participant Agent as ssh-agent
  Cleanup->>State: getState("started-agent")
  alt state == "false"
    Cleanup-->>Cleanup: log skip stopping agent
  else
    Cleanup->>Agent: attempt to stop ssh-agent
    Agent-->>Cleanup: success / error
    alt error
      Cleanup-->>Cleanup: log error message if present
      Cleanup-->>Cleanup: warn "Error stopping the SSH agent, proceeding anyway"
    end
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

bug, build

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The addition of .coderabbit.yaml configuration file is unrelated to the SSH agent reuse functionality and issue #54 resolution, introducing out-of-scope infrastructure changes. Remove the .coderabbit.yaml file or justify why this CodeRabbit configuration is necessary as part of this specific issue fix.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: reuse existing ssh agent instead of starting a new one' accurately and concisely describes the main code change across all modified files.
Description check ✅ Passed The pull request description follows the repository template with a Related issue section referencing issue #54 and a Description section explaining the intent to reuse existing SSH agents.
Linked Issues check ✅ Passed The code changes directly address issue #54 by detecting existing, responsive SSH agents via ssh-add and reusing them instead of starting new ones, preventing key loss across multiple invocations [#54].

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/index.ts (1)

115-126: Prefer explicit exit code check (0 or 1) over negation.

The OpenSSH documentation for ssh-add -l specifies only three exit codes: 0 (success), 1 (command fails), and 2 (unable to contact agent). The current code !== 2 check is functionally correct for these documented cases, but an explicit allow-list improves clarity and provides better defense against edge cases or future changes.

🔧 Suggested improvement
-    if (code !== 2) {
-      return { pid: existingPid, socket: existingSocket };
-    }
+    if (code === 0 || code === 1) {
+      return { pid: existingPid, socket: existingSocket };
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 115 - 126, The current initializeSSHAgent function
uses a negation check (code !== 2) after calling execute("ssh-add", ["-l"]);
replace that with an explicit allow-list: check if code === 0 || code === 1 to
determine the agent is alive. Update the conditional in initializeSSHAgent to
use the explicit numeric checks (using the returned code from execute) and
return the existing PID/socket when code is 0 or 1; keep the rest of the logic
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/index.ts`:
- Around line 115-126: The current initializeSSHAgent function uses a negation
check (code !== 2) after calling execute("ssh-add", ["-l"]); replace that with
an explicit allow-list: check if code === 0 || code === 1 to determine the agent
is alive. Update the conditional in initializeSSHAgent to use the explicit
numeric checks (using the returned code from execute) and return the existing
PID/socket when code is 0 or 1; keep the rest of the logic unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c2bf61 and 51fccfe.

📒 Files selected for processing (2)
  • lib/index.js
  • src/index.ts

@andiradulescu
Copy link
Copy Markdown
Contributor Author

Re: the code !== 2 nitpick — intentionally using negation rather than an allow-list. If ssh-add returns an unexpected code (signal-based exit, platform quirk), the negation safely falls through to starting a new agent, whereas === 0 || === 1 would discard a potentially working agent.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/index.ts (1)

130-130: saveState("started-agent", "true") is set before the agent is confirmed running

If execute("ssh-agent", ["-s"]) returns a non-zero code (line 133 check), the function throws — but the state is already "true". The cleanup step will then attempt ssh-agent -k against an agent that was never started. In practice the catch block in cleanup.ts swallows the error gracefully, but the state is semantically incorrect at that point.

Consider moving the saveState call to after confirming code === 0:

♻️ Proposed fix
-  core.saveState("started-agent", "true");
   const { code, stdout, stderr } = await execute("ssh-agent", ["-s"]);

   if (code !== 0) {
     console.error("Failed to initialize ssh agent:", stderr);
     throw new Error("Failed to initialize ssh agent");
   }

+  core.saveState("started-agent", "true");
   const pidMatch: RegExpMatchArray | null = stdout.match(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` at line 130, Move the call to core.saveState("started-agent",
"true") so it only runs after successfully starting the agent: check the result
of execute("ssh-agent", ["-s"]) (the variable checked for code === 0) and only
then call core.saveState; if execute returns a non-zero code do not persist the
"started-agent" state so cleanup (which calls ssh-agent -k) won't attempt to
kill a non-existent agent. Update the logic around the execute("ssh-agent",
["-s"]) call and the surrounding try/catch in src/index.ts to set the state
post-success.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cleanup.ts`:
- Around line 12-19: The catch block that handles stopping the SSH agent
currently logs both a detailed warning when error instanceof Error and an
unconditional generic warning unconditionally, causing duplicate messages;
change the catch so the second console.warn is moved into an else branch (i.e.,
if (error instanceof Error) { console.warn("Error stopping the SSH agent,
proceeding anyway:", error.message); } else { console.warn("Error stopping the
SSH agent, proceeding anyway"); }) so only one message is emitted for the same
error, referencing the catch block and the error variable and the two
console.warn calls.

---

Nitpick comments:
In `@src/index.ts`:
- Line 130: Move the call to core.saveState("started-agent", "true") so it only
runs after successfully starting the agent: check the result of
execute("ssh-agent", ["-s"]) (the variable checked for code === 0) and only then
call core.saveState; if execute returns a non-zero code do not persist the
"started-agent" state so cleanup (which calls ssh-agent -k) won't attempt to
kill a non-existent agent. Update the logic around the execute("ssh-agent",
["-s"]) call and the surrounding try/catch in src/index.ts to set the state
post-success.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51fccfe and 2716ca9.

📒 Files selected for processing (4)
  • lib/cleanup.js
  • lib/index.js
  • src/cleanup.ts
  • src/index.ts

Comment thread src/cleanup.ts Outdated
@andiradulescu andiradulescu force-pushed the fix/reuse-existing-ssh-agent branch from 2716ca9 to 88cff17 Compare February 24, 2026 17:03
@andiradulescu andiradulescu force-pushed the fix/reuse-existing-ssh-agent branch from 88cff17 to 8f577db Compare February 24, 2026 17:09
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/index.ts (1)

130-136: ⚠️ Potential issue | 🟡 Minor

saveState("started-agent", "true") is set before the agent is confirmed to have started.

If execute("ssh-agent", ["-s"]) returns a non-zero exit code (line 133), the function throws — but the state is already "true". Cleanup then attempts ssh-agent -k against an agent that was never launched, producing a spurious warning. Move saveState to after the guard:

🐛 Proposed fix
-  core.saveState("started-agent", "true");
   const { code, stdout, stderr } = await execute("ssh-agent", ["-s"]);

   if (code !== 0) {
     console.error("Failed to initialize ssh agent:", stderr);
     throw new Error("Failed to initialize ssh agent");
   }

+  core.saveState("started-agent", "true");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 130 - 136, Move the
core.saveState("started-agent", "true") call so it only runs after ssh-agent was
successfully started: call execute("ssh-agent", ["-s"]) and check the returned
code/outputs first (the variables code, stdout, stderr from execute), throw on
non-zero as now, and only after confirming success set
core.saveState("started-agent", "true"); this avoids marking state when
ssh-agent failed and prevents running ssh-agent -k against a non-existent agent.
♻️ Duplicate comments (1)
src/cleanup.ts (1)

12-19: Duplicate console.warn on line 19 is still unresolved.

The unconditional console.warn at line 19 fires alongside the instanceof Error branch, producing two warnings for the same error. The previously proposed fix (move it into an else) still applies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cleanup.ts` around lines 12 - 19, The catch block in src/cleanup.ts logs
the same message twice; inside the catch where you check "if (error instanceof
Error)" keep the console.warn that includes error.message and remove or move the
unconditional console.warn into the else branch so it only runs when error is
not an Error instance—update the error handling in that catch (the block
referencing error and the two console.warn calls) to ensure a single warning is
emitted with error details when available and a fallback message otherwise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/index.ts`:
- Around line 130-136: Move the core.saveState("started-agent", "true") call so
it only runs after ssh-agent was successfully started: call execute("ssh-agent",
["-s"]) and check the returned code/outputs first (the variables code, stdout,
stderr from execute), throw on non-zero as now, and only after confirming
success set core.saveState("started-agent", "true"); this avoids marking state
when ssh-agent failed and prevents running ssh-agent -k against a non-existent
agent.

---

Duplicate comments:
In `@src/cleanup.ts`:
- Around line 12-19: The catch block in src/cleanup.ts logs the same message
twice; inside the catch where you check "if (error instanceof Error)" keep the
console.warn that includes error.message and remove or move the unconditional
console.warn into the else branch so it only runs when error is not an Error
instance—update the error handling in that catch (the block referencing error
and the two console.warn calls) to ensure a single warning is emitted with error
details when available and a fallback message otherwise.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2716ca9 and 8f577db.

📒 Files selected for processing (4)
  • lib/cleanup.js
  • lib/index.js
  • src/cleanup.ts
  • src/index.ts

@andiradulescu
Copy link
Copy Markdown
Contributor Author

Not sure if I should've pushed lib/index.js and lib/cleanup.js. Let me know if I should remove them.

@MrSquaare
Copy link
Copy Markdown
Owner

MrSquaare commented Mar 6, 2026

Hello, and sorry for the delay

First, thanks for the PR!

Yes, the lib/index.js and lib/cleanup.js files must be rebuilt and pushed, as they are used by the action.

I took the opportunity to migrate bundle tools, so I will merge main into this branch and rebuild, then launch the workflows.

I will also update the tests to ensure that the action uses the initialized SSH agent if available.

I'm still wondering if such a change could break workflows. This could happen if users expect to have a new SSH agent each time the action is invoked. But this behavior is not documented.

NB: Test jobs will fail as repository secrets are not exposed to forks

…h-agent

Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/index.js (1)

15025-15030: ⚠️ Potential issue | 🟠 Major

Persist started-agent=true only after startup succeeds.

Right now the flag is written before ssh-agent -s succeeds. If startup fails after stale SSH_AGENT_PID / SSH_AUTH_SOCK were already present in the environment, the post step will think this action started the agent and can try to clean up the stale PID instead.

💡 Suggested fix
-	saveState("started-agent", "true");
 	const { code, stdout, stderr } = await execute("ssh-agent", ["-s"]);
 	if (code !== 0) {
 		console.error("Failed to initialize ssh agent:", stderr);
 		throw new Error("Failed to initialize ssh agent");
 	}
 	const pidMatch = stdout.match("SSH_AGENT_PID=(.*?);");
 	const socketMatch = stdout.match("SSH_AUTH_SOCK=(.*?);");
+	saveState("started-agent", "true");
 	return {
 		pid: pidMatch ? pidMatch[1] : null,
 		socket: socketMatch ? socketMatch[1] : null
 	};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/index.js` around lines 15025 - 15030, Move the call to
saveState("started-agent", "true") so it only runs after ssh-agent successfully
starts: call execute("ssh-agent", ["-s"]) and check that code === 0 (and
optionally validate stdout contains SSH_AGENT_PID/SSH_AUTH_SOCK), then call
saveState("started-agent", "true"); if execute fails, log stderr and throw
without writing the state so the post-step won't treat a stale agent as ours.
Ensure identifiers referenced are execute, saveState, and the ssh-agent
invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/index.js`:
- Around line 15013-15023: The code currently adopts any ambient SSH agent when
SSH_AUTH_SOCK/SSH_AGENT_PID exist; change this so we do NOT auto-adopt an agent
that already has identities. In the block that calls execute("ssh-add", ["-l"])
(referencing existingSocket, existingPid and execute), inspect the command
output to detect whether the agent is empty (no identities) and only reuse the
ambient agent in that specific empty case; if the agent lists identities, ignore
existingSocket/existingPid and proceed to start and use a fresh agent instead
(do not return the ambient pid/socket or call saveState("started-agent","false")
in that case). Ensure the logic is implemented where the
existingSocket/existingPid check occurs so ambient agents with keys are not
trusted.

In `@src/index.ts`:
- Around line 130-131: Currently core.saveState("started-agent","true") is
called before running ssh-agent, which can mark this run as owner even if
ssh-agent fails; move the core.saveState("started-agent","true") call to after
execute("ssh-agent", ["-s"]) succeeds (e.g., when execute returns a successful
code and you have parsed/confirmed SSH_AUTH_SOCK/SSH_AGENT_PID from the stdout),
and only set the state in that success path so stale socket/pid values won't be
claimed by this run.

---

Outside diff comments:
In `@lib/index.js`:
- Around line 15025-15030: Move the call to saveState("started-agent", "true")
so it only runs after ssh-agent successfully starts: call execute("ssh-agent",
["-s"]) and check that code === 0 (and optionally validate stdout contains
SSH_AGENT_PID/SSH_AUTH_SOCK), then call saveState("started-agent", "true"); if
execute fails, log stderr and throw without writing the state so the post-step
won't treat a stale agent as ours. Ensure identifiers referenced are execute,
saveState, and the ssh-agent invocation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d2be79df-828f-4590-802d-1a611fa73b19

📥 Commits

Reviewing files that changed from the base of the PR and between 8f577db and b3af16d.

📒 Files selected for processing (3)
  • lib/cleanup.js
  • lib/index.js
  • src/index.ts

Comment thread lib/index.js
Comment thread src/index.ts
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Repository owner deleted a comment from coderabbitai Bot Mar 6, 2026
@andiradulescu
Copy link
Copy Markdown
Contributor Author

I'm still wondering if such a change could break workflows. This could happen if users expect to have a new SSH agent each time the action is invoked. But this behavior is not documented.

If an existing workflow has multiple invocations, with this change, the SSH agent simply has all the previous keys added in the current invocation.

Some more security savvy users might actually expect only the current invocation key to be available. And they ignored the README.

An option would be to add a parameter that activates this functionality e.g. “multiple: true/false”. This can have its default set to false, so nothing would change for the existing workflows.

But, from my personal point of view, I think we’re overthinking it.

Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
Signed-off-by: Guillaume Bonnet <mrsquaare@mrsquaare.fr>
@MrSquaare MrSquaare merged commit c818628 into MrSquaare:main Mar 7, 2026
8 checks passed
@MrSquaare
Copy link
Copy Markdown
Owner

Released in v3.1.3, thanks again for your contribution!

@andiradulescu andiradulescu deleted the fix/reuse-existing-ssh-agent branch March 7, 2026 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multiple keys setup doesn't work correctly

2 participants