base-action: persist execution log when SDK throws (e.g. max-turns) #1253

Open
STRd6 wants to merge 1 commit into anthropics:main from STRd6:fix-execution-file-on-error

STRd6 commented Apr 23, 2026

Problem

When the Agent SDK's query() iterator throws — most commonly on --max-turns exhaustion — the partial transcript is silently lost.

base-action/src/run-claude-sdk.ts:

try {
  for await (const message of query({ prompt, options: sdkOptions })) {
    messages.push(message);
    // ...
  }
} catch (error) {
  console.error("SDK execution error:", error);
  throw new Error(`SDK execution error: ${error}`);  // re-throws here
}

// never reached on SDK error:
try {
  await writeFile(EXECUTION_FILE, JSON.stringify(messages, null, 2));
  result.executionFile = EXECUTION_FILE;
} catch (error) { ... }

The rethrow bubbles up to index.ts, whose catch calls core.setFailed and sets only conclusion; the execution_file and session_id outputs are never set. So:

  • $RUNNER_TEMP/claude-execution-output.json is never written
  • steps.<id>.outputs.execution_file is empty
  • any downstream step gated on that output (e.g. an S3 upload of the transcript) silently skips
  • the partial transcript is unrecoverable

Repro

A workflow that hits --max-turns N fails with:

SDK execution error: Error: Claude Code returned an error result: Reached maximum number of turns (N)

Exit code 1, no execution_file output, no log persisted, even though all N turns' worth of data was in memory moments earlier.

Fix

Capture the SDK error, write the transcript, publish the outputs, then re-throw so the step still fails. Outputs are now set from inside the SDK runner because index.ts re-throws before reaching its own setOutput calls on the failure path.

On the success path the behavior is unchanged: the outputs get set in run-claude-sdk.ts and again in index.ts with identical values (last-write-wins, no observable difference).
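
The capture, write, publish, re-throw order described above can be sketched as follows. This is a minimal illustration rather than the actual diff: `fakeQuery`, `runAndPersist`, `demo`, the `Message` type, and the injected `setOutput` callback are hypothetical stand-ins for the Agent SDK's query() and the action's real output calls, and taking session_id from the first message that carries one is an assumption.

```typescript
import { readFile, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical stand-ins for this sketch only: the real runner iterates the
// Agent SDK's query() and publishes outputs through the Actions toolkit.
type Message = { type: string; session_id?: string };
type SetOutput = (name: string, value: string) => void;

// Simulated SDK iterator: yields `turns` messages, then throws, mimicking
// the max-turns failure described in the Problem section.
async function* fakeQuery(turns: number): AsyncGenerator<Message> {
  for (let i = 0; i < turns; i++) {
    yield { type: "assistant", session_id: "sess-123" };
  }
  throw new Error(`Reached maximum number of turns (${turns})`);
}

async function runAndPersist(
  source: AsyncIterable<Message>,
  executionFile: string,
  setOutput: SetOutput,
): Promise<void> {
  const messages: Message[] = [];
  let failed = false;
  let sdkError: unknown;

  try {
    for await (const message of source) {
      messages.push(message);
    }
  } catch (error) {
    // Capture the error instead of re-throwing right away.
    failed = true;
    sdkError = error;
  }

  // Runs on both the success and the failure path.
  try {
    await writeFile(executionFile, JSON.stringify(messages, null, 2));
    setOutput("execution_file", executionFile);
  } catch (writeError) {
    console.error("Failed to write execution file:", writeError);
  }

  // Assumption: the session id comes from the first message carrying one.
  const sessionId = messages.find((m) => m.session_id)?.session_id;
  if (sessionId) setOutput("session_id", sessionId);

  // Re-throw last so the step still fails.
  if (failed) throw new Error(`SDK execution error: ${String(sdkError)}`);
}

// Usage sketch: drive the runner with the failing iterator and report
// what survived the failure.
async function demo(turns: number) {
  const outputs: Record<string, string> = {};
  const file = join(tmpdir(), `claude-execution-output-${process.pid}.json`);
  let threw = false;
  try {
    await runAndPersist(fakeQuery(turns), file, (name, value) => {
      outputs[name] = value;
    });
  } catch {
    threw = true;
  }
  const saved: Message[] = JSON.parse(await readFile(file, "utf8"));
  return { threw, savedCount: saved.length, outputs, file };
}
```

Injecting the message source and the output callback keeps the sketch runnable without the SDK or a GitHub runner. The order is the point: the error is held until the transcript and outputs are written, so a failing run still leaves a file for downstream steps to pick up.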

Why this matters

Max-turns failures are precisely the runs where the transcript is most valuable — long, expensive runs that the user wants to inspect and recover partial work from. Losing them silently is a bad default.

When the Agent SDK's query() iterator throws — most notably on
max-turns exhaustion — the existing catch block re-throws before
reaching the writeFile/setOutput code below, so:

- $RUNNER_TEMP/claude-execution-output.json is never written
- the execution_file / session_id step outputs are never set
- any downstream step gated on steps.claude.outputs.execution_file
  (e.g. an S3 upload of the transcript) silently skips, and the
  partial transcript is lost

Fix by capturing the SDK error, writing the transcript and setting
the outputs unconditionally, then re-throwing so the step still
fails. Outputs are now published from within this function because
index.ts re-throws on failure before reaching its own setOutput calls.

Repro: a job that hits --max-turns 1000 exits with exit code 1,
no execution_file output, no log persisted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
STRd6 force-pushed the fix-execution-file-on-error branch from c20646d to be3b21a on April 23, 2026 16:39