feat: add dry-run mode for execute (Phase 3) by acmacalister · Pull Request #140 · daltoniam/switchboard

acmacalister · 2026-04-22T22:48:42Z

Summary

dry_run flag on execute: Pass dry_run: true to validate arguments and check integration health without executing. LLMs can preview mutation operations (create issue, merge PR, post comment) before committing.
Tiered approach: Integrations implementing the new DryRunIntegration interface return native previews. All others get a simulated dry-run that validates args, checks the circuit breaker, and confirms health.
No side effects: Dry-run never calls Execute, never pins results, never records breadcrumbs — purely a validation + preview path.

This is Phase 3 of the stateful features plan. Independent of Phases 1-2 but most useful alongside session context.

Test plan

8 tests: simulated fallback, validation failure, native integration, native-declines-falls-back, unhealthy integration, tool not found, does-not-execute, no-pinning
All tests pass with -race
golangci-lint run — 0 issues
Full go test ./... — all existing tests still pass

🐘 Generated with Crush

acmacalister

Good shape overall — clean architecture, DryRunIntegration as an optional interface is the right extension point, and the test coverage is thorough for most paths. Two things need attention before merge: a logic inversion in the circuit-breaker path, and a question about whether dry-run should respect session context the way real execute does.

acmacalister

Good shape overall — the tiered DryRunIntegration interface is clean and the fix commit resolved the two issues from the previous round. One remaining circuit-breaker edge case and a minor silent-ignore to look at inline.

acmacalister

Excellent iteration — all five previous round comments are addressed and the new tests lock down the edge cases well. Two small nits inline, neither blocking.

acmacalister · 2026-04-24T03:56:17Z

+	assert.Equal(t, "github_create_issue", resp["tool"])
+	assert.Equal(t, "github", resp["integration"])
+	assert.Equal(t, "ok", resp["status"])
+	args := resp["validated_args"].(map[string]any)


Direct type assertion on resp["validated_args"] will panic (not fail) if the field is absent or wrong type — making a future regression harder to diagnose. The same pattern appears again at line 353.

validatedArgs, ok := resp["validated_args"].(map[string]any) require.True(t, ok, "expected validated_args to be a map") assert.Equal(t, "daltoniam", validatedArgs["owner"])

acmacalister · 2026-04-24T03:56:17Z

+// returns (result, true), that result is used directly. Otherwise the server
+// falls back to a simulated dry-run (arg validation + preview).
+type DryRunIntegration interface {
+	DryRun(ctx context.Context, toolName ToolName, args map[string]any) (*ToolResult, bool)


MarkdownIntegration.RenderMarkdown has a method-level doc explaining its return contract; DryRun doesn't. Worth adding a note that (nil, false) is the expected "decline" signal (not an error), so adapter authors don't reach for (*ToolResult{IsError: true}, true) when they just want to fall back:

// DryRun returns a native preview for the given tool call, or (nil, false) // if this tool doesn't support native dry-run and the server should fall back // to a simulated preview. Returning (nil, false) is not an error. DryRun(ctx context.Context, toolName ToolName, args map[string]any) (*ToolResult, bool)

daltoniam · 2026-04-24T15:15:40Z

 	}

+	if args.DryRun && args.Script != "" {
+		return errorResult("dry_run is not supported with script — use tool_name"), nil


not being about to use script is a bit of bummer here, understandable why, but does suck for exercising maximum token use reduction

acmacalister

Solid iteration — previous round of comments are all addressed. Found two more small things inline.

acmacalister · 2026-04-24T15:31:57Z

+			integration.Name(),
+		)), nil
+	}
+	cb.recordSuccess()


recordSuccess() here heals the circuit breaker based on Healthy() returning true — but Healthy() is typically a lightweight credential/ping check, not a full API round-trip. In executeTool, recordSuccess() only fires after a real Execute() call succeeds. The mismatch means a dry-run health ping can close a half-open breaker and let the next real call through against an endpoint that's still broken.

One option: skip recordSuccess() in the dry-run path and let the real execute do the healing:

if !integration.Healthy(ctx) { cb.recordFailure() return errorResult(fmt.Sprintf( "dry-run: integration %q is unhealthy — call would likely fail", integration.Name(), )), nil } // Don't call cb.recordSuccess() here — only real Execute calls should heal the breaker.

If healing on a health check is intentional ("dry-run acts as a probe"), worth a comment saying so.

acmacalister · 2026-04-24T15:31:57Z

+		},
+		dryRunFn: func(_ context.Context, _ mcp.ToolName, args map[string]any) (*mcp.ToolResult, bool) {
+			return &mcp.ToolResult{
+				Data: `{"dry_run":true,"native":true,"would_invoke":"` + args["function_name"].(string) + `"}`,


Unsafe type assertion here — if function_name is absent from args or the wrong type, this panics rather than failing the test cleanly. The validated_args assertions were updated to use the two-value form, worth doing the same here:

fn, ok := args["function_name"].(string) require.True(t, ok, "expected function_name to be a string") return &mcp.ToolResult{ Data: `{"dry_run":true,"native":true,"would_invoke":"` + fn + `"}`, }, true

Add a dry_run flag to the execute meta-tool that validates arguments and checks integration health without actually performing the operation. This makes LLMs more willing to call mutation tools (create, update, delete) since they can preview what would happen first. Tiered approach: - Native: integrations implementing DryRunIntegration can return a custom preview (e.g. AWS Lambda DryRun invocation type) - Simulated fallback: validates args, checks circuit breaker and health, returns a preview with validated_args and status - The tool is never actually executed in either path New DryRunIntegration optional interface in mcp.go — integrations can adopt it incrementally. The execute tool schema now includes a dry_run boolean parameter. 8 new tests: simulated fallback, validation failure, native integration, native-declines-falls-back, unhealthy integration, tool not found, does-not-execute verification, no-pinning verification. 🐘 Generated with Crush Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

acmacalister

Build, tests, and lint all pass — the dry-run design is solid. Two things from earlier review rounds look like they were lost in the squash; flagging them inline.

acmacalister · 2026-04-24T18:48:42Z

 	}

+	if args.DryRun {
+		return s.handleDryRun(ctx, args.ToolName, args.Arguments)


The args.Script check above this returns early before dry_run is ever evaluated — so { script: "api.call('github_delete_repo', ...)", dry_run: true } executes the script for real. The tool description says "Works with tool_name only (not scripts)" but there's no enforcement.

This guard was in an intermediate commit but appears to have been dropped in the squash:

if args.DryRun && args.Script != "" { return errorResult("dry_run is not supported with script — use tool_name"), nil }

acmacalister · 2026-04-24T18:48:42Z

+
+	cb := s.getBreaker(integration.Name())
+	if !cb.allow() {
+		cb.recordSuccess()


This cb.recordSuccess() fires on the rejection path — allow() returned false, meaning the breaker is open. Resetting failures to zero here heals the breaker mid-rejection, so the next call passes through as if nothing happened.

This was flagged in the first review round but looks like it was lost in the squash. The fix is to remove this call and only keep the one at line 29:

if !cb.allow() { return errorResult(fmt.Sprintf( "dry-run: integration %q temporarily unavailable (circuit breaker open)", integration.Name(), )), nil } cb.recordSuccess()

acmacalister commented Apr 22, 2026

View reviewed changes

Comment thread server/dryrun.go

Comment thread server/dryrun.go

Comment thread server/dryrun_test.go

acmacalister commented Apr 24, 2026

View reviewed changes

Comment thread server/dryrun.go

Comment thread server/server.go Outdated

acmacalister commented Apr 24, 2026

View reviewed changes

daltoniam approved these changes Apr 24, 2026

View reviewed changes

acmacalister commented Apr 24, 2026

View reviewed changes

acmacalister force-pushed the dry-run-mode branch from f2ad0d5 to 6b1c06f Compare April 24, 2026 18:29

acmacalister commented Apr 24, 2026

View reviewed changes

Conversation

acmacalister commented Apr 22, 2026

Summary

Test plan

Uh oh!

acmacalister left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

acmacalister left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

acmacalister left a comment

Choose a reason for hiding this comment

Uh oh!

acmacalister Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

acmacalister Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

daltoniam Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

acmacalister left a comment

Choose a reason for hiding this comment

Uh oh!

acmacalister Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

acmacalister Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

acmacalister left a comment

Choose a reason for hiding this comment

Uh oh!

acmacalister Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

acmacalister Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants