Tentacle can detect Powershell scripts that don't start, and can abandon them. #1200

Merged: LukeButters merged 31 commits into main from luke/powershell-start-fail on Apr 9, 2026

@LukeButters commented Mar 13, 2026

Background

When powershell.exe is invoked to run a script, the process can sometimes start but never begin executing the script body. In that state the process also cannot be terminated. The result is hung deployments that wait forever for a script that will never complete.

Refs: EFT-365
Refs: EFT-3145
Refs: HPY-1295, HPY-1296

Fixes: #1208

#investigation-stuck-tasks-blocking-deployments

Results

This PR introduces a startup detection mechanism that lets Tentacle detect and report when PowerShell never executes the script content.

Scripts that opt in include a special marker comment:

# TENTACLE-POWERSHELL-STARTUP-DETECTION-AND-GUARD-MUST-BE-AT-THE-START-OF-THE-SCRIPT

When Tentacle bootstraps the script, it replaces this marker with generated detection code that:

  1. Attempts to exclusively create a .octopus_powershell_started sentinel file
  2. Checks for a .octopus_powershell_should_run file (written by Tentacle before execution)
  3. Exits early if either check fails
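
The three checks above can be modelled outside PowerShell. What follows is a minimal Python sketch, not the PowerShell that Tentacle actually generates: the function name `startup_guard` and its return strings are hypothetical, and only the two sentinel file names come from this description. It assumes "exclusively create" means an atomic create-if-absent, which Python expresses with `O_CREAT | O_EXCL`.

```python
import os
import tempfile

STARTED_FILE = ".octopus_powershell_started"        # name taken from the PR description
SHOULD_RUN_FILE = ".octopus_powershell_should_run"  # name taken from the PR description


def startup_guard(workdir: str) -> str:
    """Hypothetical model of the injected guard; returns 'run' or a reason to exit early."""
    started = os.path.join(workdir, STARTED_FILE)
    try:
        # O_EXCL makes creation fail if the file already exists, so a second
        # attempt to claim the sentinel is rejected.
        fd = os.open(started, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
    except FileExistsError:
        return "exit: started file already exists"

    if not os.path.exists(os.path.join(workdir, SHOULD_RUN_FILE)):
        return "exit: should_run file is missing"

    return "run"


def demo() -> list:
    results = []
    with tempfile.TemporaryDirectory() as workdir:
        # The host writes should_run before launching the script.
        open(os.path.join(workdir, SHOULD_RUN_FILE), "w").close()
        results.append(startup_guard(workdir))  # first claim succeeds
        results.append(startup_guard(workdir))  # second claim is rejected
    with tempfile.TemporaryDirectory() as workdir:
        results.append(startup_guard(workdir))  # no should_run file was written
    return results


if __name__ == "__main__":
    print(demo())
```

Because the create is exclusive, whichever side creates the started file first wins, and any later attempt sees a failure instead of silently overwriting it.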

Concurrently, RunningScript runs a monitoring task alongside the script execution task. If the started file isn't created within the timeout window (default: 5 minutes), the monitor concludes that PowerShell never executed the script and returns exit code -47 (PowerShellNeverStartedExitCode). Tentacle also ensures that the PowerShell script body can never run after that point.
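
The race between the script task and the monitor can be sketched roughly as below. This is an illustrative Python/asyncio model under stated assumptions, not the C# in RunningScript.cs: `run_with_startup_monitor`, `wait_for_started_file`, and the two fake scripts are hypothetical names, polling is an assumed way of watching for the file, and only the exit code value and the started file name come from this description.

```python
import asyncio
import os
import tempfile

POWERSHELL_NEVER_STARTED_EXIT_CODE = -47  # value from the PR description
STARTED_FILE = ".octopus_powershell_started"


async def wait_for_started_file(path: str, poll_seconds: float) -> None:
    """Poll until the started sentinel appears (polling is an assumption)."""
    while not os.path.exists(path):
        await asyncio.sleep(poll_seconds)


async def run_with_startup_monitor(script, workdir: str, startup_timeout: float) -> int:
    """Hypothetical stand-in for RunningScript: race the script against a startup monitor."""
    started = os.path.join(workdir, STARTED_FILE)
    script_task = asyncio.create_task(script(workdir))
    monitor_task = asyncio.create_task(wait_for_started_file(started, 0.01))

    done, _ = await asyncio.wait(
        {script_task, monitor_task},
        timeout=startup_timeout,
        return_when=asyncio.FIRST_COMPLETED,
    )
    if not done:
        # Neither the script finished nor the sentinel appeared in time:
        # abandon the script and report that it never started.
        script_task.cancel()
        monitor_task.cancel()
        await asyncio.gather(script_task, monitor_task, return_exceptions=True)
        return POWERSHELL_NEVER_STARTED_EXIT_CODE

    # The script started (or already finished); stop monitoring and wait for it.
    monitor_task.cancel()
    await asyncio.gather(monitor_task, return_exceptions=True)
    return await script_task


async def healthy_script(workdir: str) -> int:
    open(os.path.join(workdir, STARTED_FILE), "x").close()  # exclusive create
    await asyncio.sleep(0.05)
    return 0


async def stalled_script(workdir: str) -> int:
    await asyncio.sleep(3600)  # never writes the sentinel
    return 0


def demo() -> tuple:
    async def both():
        with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
            ok = await run_with_startup_monitor(healthy_script, a, startup_timeout=1.0)
            stalled = await run_with_startup_monitor(stalled_script, b, startup_timeout=0.2)
            return ok, stalled
    return asyncio.run(both())


if __name__ == "__main__":
    print(demo())
```

In this model cancelling the stalled task always works. In the failure this PR targets, the real process may never die, which is why the result is reported as soon as the timeout elapses rather than after the process exits.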

Notes

  • Detection is opt-in — scripts without the marker comment are unaffected
  • The startup timeout is configurable via the RunningScript constructor (tests use a shorter value)
  • Currently scoped to Windows only (powershell.exe); pwsh support on Linux/Mac is possible as a follow-up

Fixes https://github.com/OctopusDeploy/OctopusTentacle/issues/... (optional public issue)

When Octopus feature toggle is enabled

Message written to server task log when PowerShell startup detection is enabled

12:46:27   Verbose  |       Executable name or full path: C:\WINDOWS\system32\WindowsPowershell\v1.0\PowerShell.exe
12:46:27   Verbose  |       Starting C:\WINDOWS\system32\WindowsPowershell\v1.0\PowerShell.exe in working directory 'C:\Octopus\Tentacle-002\Work\08de894f-87c0-33c4-b71f-356080ed09bf' using 'OEM United States' encoding running as 'DOMAIN\user'
12:46:27   Info     |       PowerShell startup detection: Checks passed, continuing script execution
...
12:46:27   Verbose  |       Executing 'C:\Octopus\Tentacle-002\Work\08de894f-87c0-33c4-b71f-356080ed09bf\Script.ps1'
...
12:46:27   Verbose  |       Invoking target script 'C:\Octopus\Tentacle-002\Work\08de894f-87c0-33c4-b71f-356080ed09bf\Script.ps1'.
12:46:27   Info     |       Hello

Warning written to the Tentacle log when PowerShell takes longer than the configured timeout to start:

2026-03-24 12:46:05.8801   3172      1  INFO  WorkspaceCleanerTask.Start(): Starting
...
2026-03-24 12:47:46.8117   3172      5  WARN  PowerShell startup detection: PowerShell did not start within <n> minutes for task T4neCG
...
2026-03-24 12:47:59.6891   3172      1  INFO  WorkspaceCleanerTask.Stop(): Stopping
2026-03-24 12:47:59.6891   3172      1  INFO  WorkspaceCleanerTask.Stop(): Stopped
2026-03-24 12:47:59.6891   3172      1  INFO  WorkspaceCleanerTask.Dispose(): Disposing

How to review this PR

  • PowerShellStartupDetection.cs (new) — Generates and injects the detection code; manages sentinel file paths
  • PowerShellStartupStatus.cs (new) — Enum for monitoring outcomes: NotMonitored, Started, NeverStarted
  • RunningScript.cs — Made Execute async; added concurrent startup monitoring task
  • ScriptWorkspace.cs — Injects detection code during bootstrap when the marker comment is present; creates the should_run file
  • ScriptExitCodes.cs — Added PowerShellNeverStartedExitCode = -47
  • StartScriptCommandV2.cs / ExecuteShellScriptCommand.cs — Added DurationToWaitForPowerShellToStartup parameter to allow configurable timeout
  • PowerShellStartupDetectionTests.cs (new) — Integration tests covering successful detection, failure detection, monitoring timeout, and scripts without the marker
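
For orientation while reviewing the bootstrap change, the opt-in marker replacement can be pictured as a plain string substitution. This is a hedged Python sketch, not the code in ScriptWorkspace.cs or PowerShellStartupDetection.cs: `inject_startup_detection` is a hypothetical name, the detection code is a placeholder string, and replacing only the first occurrence is an assumption based on the marker's requirement to sit at the start of the script.

```python
MARKER = "# TENTACLE-POWERSHELL-STARTUP-DETECTION-AND-GUARD-MUST-BE-AT-THE-START-OF-THE-SCRIPT"


def inject_startup_detection(script_body: str, detection_code: str) -> tuple:
    """Hypothetical bootstrap step: returns (new_body, detection_enabled)."""
    if MARKER not in script_body:
        # Opt-in: scripts without the marker are returned unchanged.
        return script_body, False
    # Replace only the first occurrence; the marker is expected at the start.
    return script_body.replace(MARKER, detection_code, 1), True


if __name__ == "__main__":
    body, enabled = inject_startup_detection(
        MARKER + "\nWrite-Host 'Hello'",
        "# <generated detection code>",  # placeholder, not the real generated code
    )
    print(enabled)
    print(body)
```

The returned flag is one possible way for a caller to decide whether to start the startup monitor for a given script.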

Quality ✔️

Pre-requisites

  • I have read How we use GitHub Issues for help deciding when and where it's appropriate to make an issue.
  • I have considered informing or consulting the right people, according to the ownership map.
  • I have considered appropriate testing for my change.

@LukeButters requested a review from hnrkndrssn March 13, 2026 01:30
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/WorkSpace/BashScriptWorkspace.cs Outdated
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/PowerShellStartupDetection.cs Outdated
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/RunningScript.cs Outdated
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/RunningScript.cs Outdated
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/RunningScript.cs Outdated
@hnrkndrssn changed the title from "Add support for powershell start up failire detection" to "Add support for powershell start up failure detection" Mar 31, 2026
@LukeButters requested a review from hnrkndrssn April 1, 2026 02:07
@LukeButters marked this pull request as ready for review April 1, 2026 02:09
@LukeButters requested a review from a team as a code owner April 1, 2026 02:09
@LukeButters requested a review from Copilot April 1, 2026 02:09

@LukeButters force-pushed the luke/powershell-start-fail branch from 5a599f2 to 5bb0dd6 April 1, 2026 02:45
@LukeButters changed the title from "Add support for powershell start up failure detection" to "Tentacle can detect Powershell scripts that don't start, and can abandon them." Apr 1, 2026
@hnrkndrssn left a comment

LGTM 👍

@rhysparry left a comment

Some initial comments

/// but silently stall before executing any script content. This was seen happening because
/// CrowdStrike prevented the script body from running.
///
/// When this happens, we get no output from the script AND the script is un-killable.

Is this because it deviates from what we expect or can we literally not kill the powershell process?

@LukeButters (author) replied:

Tentacle is unable to kill the script using whatever standard kill mechanism .NET provides.

Dumps have been taken and CrowdStrike shows up in them, although I never saw a dump myself.

I suspect PowerShell calls something that enters the kernel and hangs there. Because it is stuck in the kernel, it can never be killed.

@rhysparry left a comment

I'm a little unclear on the cancellation behaviour.

Comment thread source/Octopus.Tentacle.Contracts/PowerShellStartupDetectionTemplateValues.cs Outdated
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/RunningScript.cs Outdated
Comment thread source/Octopus.Tentacle.Core/Services/Scripts/RunningScript.cs
readonly IOctopusFileSystem fileSystem;
readonly IHomeDirectoryProvider home;
readonly SensitiveValueMasker sensitiveValueMasker;
readonly bool useBashWorkspace;

Should this be an enum to select a specific workspace type?

@LukeButters (author) replied:

yes

[IntegrationTestTimeout]
public class PowerShellStartupDetectionTests : IntegrationTest
{
static (ScriptServiceV2 service, ScriptWorkspaceFactory workspaceFactory, ScriptStateStoreFactory stateStoreFactory, TemporaryDirectory tempDir) CreateScriptService(

Can we replace this with a nested class? It should implement IDisposable for the TemporaryDirectory.

@LukeButters (author) replied:

Yes and done.

Comment thread source/Octopus.Tentacle.Tests.Integration/PowerShellStartupDetectionTests.cs Outdated
Comment thread source/Octopus.Tentacle.Tests.Integration/PowerShellStartupDetectionTests.cs Outdated
Comment thread source/Octopus.Tentacle.Tests.Integration/PowerShellStartupDetectionTests.cs Outdated
@LukeButters requested a review from rhysparry April 9, 2026 02:29
@rhysparry left a comment

💚 Just a few minor nits. Otherwise, LGTM.

Bash,
PowerShell
}
} No newline at end of file

nit: missing newline at end of file


await DeletePotentiallyInUseFile(stillRunning);
await Task.Delay(TimeSpan.FromSeconds(5));
File.Exists(stillRunning).Should().BeFalse("Otherwise the script is still running and we made not effort to cancel it.");

nit:

Suggested change
- File.Exists(stillRunning).Should().BeFalse("Otherwise the script is still running and we made not effort to cancel it.");
+ File.Exists(stillRunning).Should().BeFalse("Otherwise the script is still running and we made no effort to cancel it.");

$"{shellPath} process did not start within {powerShellStartupTimeout.TotalMinutes} minutes. Script execution aborted.");

// The script has not started, and the files on disk have been arranged, so it will never meaningfully progress.
// We will now abandon the script, as we do we will cancell its cancellation token. Which will result in

nit:

Suggested change
- // We will now abandon the script, as we do we will cancell its cancellation token. Which will result in
+ // We will now abandon the script, as we do we will cancel its cancellation token. Which will result in


// The script has not started, and the files on disk have been arranged, so it will never meaningfully progress.
// We will now abandon the script, as we do we will cancell its cancellation token. Which will result in
// the script possibly dieing, although from what we have seen, the script will never die.

nit:

Suggested change
- // the script possibly dieing, although from what we have seen, the script will never die.
+ // the script possibly dying, although from what we have seen, the script will never die.

@LukeButters enabled auto-merge (squash) April 9, 2026 05:14
@LukeButters merged commit 9a58b85 into main Apr 9, 2026
51 checks passed
@LukeButters deleted the luke/powershell-start-fail branch April 9, 2026 08:03
Development

Successfully merging this pull request may close these issues.

Powershell.exe scripts sometimes never start because of Rapid7 and Crowdstrike

4 participants