feat: implement state persistence by 35C4n0r · Pull Request #177 · coder/agentapi

35C4n0r · 2026-01-31T16:52:12Z

MergeAfter: #172

…oad state

github-actions · 2026-02-03T10:59:14Z

✅ Preview binaries are ready!

To test with modules: agentapi_version = "agentapi_177" or download from: https://github.com/coder/agentapi/releases/tag/agentapi_177

mafredri

Nice work so far! I left a few comments and suggestions. Mainly about moving some concerns from httpapi to cmd/server. I think it would be helpful to have some tests based on real agent session restoration output to evaluate the screentracker changes.

cmd/server/server.go

lib/httpapi/server_signals_unix.go

lib/httpapi/server.go

mafredri · 2026-02-03T13:29:33Z

cmd/server/signals_windows.go

+func (s *Server) HandleSignals(ctx context.Context, process *termexec.Process) {
+	// Handle shutdown signals (SIGTERM, SIGINT only on Windows)
+	shutdownCh := make(chan os.Signal, 1)
+	signal.Notify(shutdownCh, os.Interrupt, syscall.SIGTERM)


Does this compile on Windows? IIRC we can only support os.Interrupt there.

compiles - yes (since PR Preview Build / Build Release Binaries (pull_request) passes), but haven't tested it.

# Conflicts: # cmd/server/server.go # lib/httpapi/server.go # lib/screentracker/conversation.go # lib/screentracker/pty_conversation.go # lib/screentracker/pty_conversation_test.go

# Conflicts: # lib/httpapi/server.go # lib/screentracker/pty_conversation.go

mafredri

Test coverage seems good, but I would really like to see real testdata created for e.g. Claude where all output from the initial conversation, and then again for the restoration part. And testing that AgentAPI does in fact handle this correctly.

I'm not sure how feasible it is, but ideally it'd be nice to capture everything, including control characters, that Claude outputs. Think asciinema recording.

Also, I think we need to really consider everything that can affect the AI agent output. For instance, given the nature of adjustScreenAfterStateLoad, what if --term-height and --term-width have been adjusted between invocations? If the state is being restored, we should probably do so much earlier (in runServer) and print warnings that these options are being overridden by the state restoration and forcing the previous options. Wdyt?

lib/screentracker/pty_conversation.go

cmd/server/server.go

lib/screentracker/pty_conversation.go

cmd/server/signals.go

mafredri · 2026-02-18T09:06:59Z

lib/screentracker/pty_conversation.go

+	// Store the first stable snapshot for filtering later
+	snapshots := c.snapshotBuffer.GetAll()
+	if len(snapshots) > 0 {
+		c.firstStableSnapshot = c.cfg.FormatMessage(strings.TrimSpace(snapshots[len(snapshots)-1].screen), "")


I suppose we need a FormatMessage nil check here?

Also, what if FormatMessage function has changed between session save and restore (different AgentAPI versions). Potential issue?

Also, what if FormatMessage function has changed between session save and restore (different AgentAPI versions). Potential issue?

This doesn't pose an issue, c.firstStableSnapshot is not stored in the state, it is used to strip the conversation history loaded by the agent in case of --continue/--resume so that we can load our saved state messages.

lib/screentracker/pty_conversation.go

cmd/server/server.go

lib/screentracker/pty_conversation.go

cmd/server/server.go

35C4n0r · 2026-02-19T06:34:13Z

Also, I think we need to really consider everything that can affect the AI agent output.

Agreed

For instance, given the nature of adjustScreenAfterStateLoad, what if --term-height and --term-width have been adjusted between invocations? If the state is being restored, we should probably do so much earlier (in runServer) and print warnings that these options are being overridden by the state restoration and forcing the previous options. Wdyt?

imo, this is not an issue; nothing in adjustScreenAfterStateLoad will be affected by height/width. Its only role rn is to strip the conversation history loaded by the agent when using --continue/--resume.

mafredri · 2026-02-19T09:05:04Z

imo, this is not an issue; nothing in adjustScreenAfterStateLoad will be affected by height/width. Its only role rn is to strip the conversation history loaded by the agent when using --continue/--resume.

So we don't need to think about how a terminal size affects line-splitting/ASCII formatting?

lib/screentracker/pty_conversation.go

johnstcn · 2026-02-19T10:20:42Z

lib/screentracker/pty_conversation.go

+
+		if c.initialPromptReady && !c.loadStateSuccessful && c.cfg.StatePersistenceConfig.LoadState {
+			_ = c.loadState()
+			c.loadStateSuccessful = true


Outright failure will cause the session to be essentially "bricked" until you fix, rename, or move the state file. We should definitely surface the error, but complete failure is not ideal.

35C4n0r · 2026-02-19T14:29:34Z

@johnstcn @mafredri Now the user will see an error like this:

Agentapi will work as usual; for the previous corrupted state, everything will become AgentMessage (i.e. previous states would lose the differentiation b/w agent and user message).

Ref:

mafredri · 2026-02-20T10:31:42Z

Deep Code Review

Overall risk: Medium. The state file format decisions are one-way doors, and there are a few logic bugs that could produce incorrect behavior in edge cases.

The architecture is sound and the reviewer feedback from mafredri and johnstcn has been largely addressed (PID file moved to cmd/server, signal handlers split per-platform, messagesLocked extracted, file permissions tightened, json.NewDecoder adopted). Several issues remain.

P2-1: `ConversationMessage` lacks JSON struct tags — fragile state file format

File: lib/screentracker/conversation.go:86-91

Without json tags, encoding/json uses PascalCase ("Id", "Message", etc.). The HTTP API's MessageUpdateBody uses lowercase (json:"id", json:"message"). This creates an inconsistency between the state file and the API.

More critically, this is a one-way door: once state files are written with PascalCase keys, adding json:"id" tags later (to align with API conventions) silently breaks deserialization of existing state files. Add explicit json tags now before any state files are created in production.

P2-2: No version validation on state load

File: lib/screentracker/pty_conversation.go:641-649

AgentState.Version is written as 1 during save but never checked during load. If the format changes in a future version, loading an incompatible state file would silently misinterpret data. Add a check:

if agentState.Version != 1 {
    return xerrors.Errorf("unsupported state file version %d (expected 1)", agentState.Version)
}

P2-3: `initialPromptSent` logic silently drops prompt for same-prompt restore with unsent state

File: lib/screentracker/pty_conversation.go:655-658

c.initialPromptSent = agentState.InitialPromptSent  // line 655
if len(c.cfg.InitialPrompt) > 0 {
    isDifferent := buildStringFromMessageParts(c.cfg.InitialPrompt) != agentState.InitialPrompt
    c.initialPromptSent = !isDifferent  // line 658: unconditionally overwrites line 655
}

When the same prompt is provided and the saved state has InitialPromptSent: false (saved during init before prompt delivery), line 658 forces initialPromptSent = true, silently preventing the prompt from ever being sent. The correct logic:

if isDifferent {
    c.initialPromptSent = false
} else {
    c.initialPromptSent = agentState.InitialPromptSent  // preserve saved status
}

P2-4: `adjustScreenAfterStateLoad` echoes user message as agent response

File: lib/screentracker/pty_conversation.go:692-693

if !c.userSentMessageAfterLoadState && len(c.messages) > 0 {
    newScreen = "\n" + c.messages[len(c.messages)-1].Message
}

This doesn't check the role of the last message. If the saved state ends with a user message (user sent a message right before shutdown, agent hadn't responded yet), the user's text becomes the agent message via updateLastAgentMessageLocked at line 302. Fix:

if !c.userSentMessageAfterLoadState && len(c.messages) > 0 &&
    c.messages[len(c.messages)-1].Role == ConversationRoleAgent {
    newScreen = "\n" + c.messages[len(c.messages)-1].Message
}

P2-5: `EmitError` uses untyped `string` for level, inconsistent with codebase enum pattern

Files: lib/screentracker/conversation.go:83, lib/httpapi/events.go:56-60

Every other categorical field in the API (ConversationStatus, ConversationRole, AgentStatus, EventType) uses typed string enums with OpenAPI schema registration. ErrorBody.Level is just string, weakening the API contract. Define type ErrorLevel string with ErrorLevelWarning / ErrorLevelError constants and a Schema() method to match the established pattern.

P3-1: Empty restored prompt can trigger sending empty content to agent

File: lib/screentracker/pty_conversation.go:659-664

When saved state has InitialPrompt: "" and no new prompt is provided, a []MessagePart{MessagePartText{Content: ""}} is created. At line 219, len(c.cfg.InitialPrompt) > 0 is true (1 element), so this empty message is enqueued and a carriage return is sent to the agent. Guard with:

if agentState.InitialPrompt != "" {
    c.cfg.InitialPrompt = []MessagePart{MessagePartText{Content: agentState.InitialPrompt}}
}

P3-2: `errors` slice in `EventEmitter` grows unboundedly

File: lib/httpapi/events.go:76,215-216,239-245

Every EmitError call appends to e.errors, which is replayed to all new SSE subscribers. Combined with the frontend's duration: Infinity toasts, reconnecting clients see all historical errors as new toast notifications. Cap the slice or don't replay stale errors to new subscribers.

P3-3: `ErrorBody` uses `time.Now()` instead of quartz clock

File: lib/httpapi/events.go:212

The codebase consistently uses quartz.Clock for testability. EmitError calls time.Now() directly. Add a quartz.Clock to EventEmitter (via WithClock option, matching the existing option pattern).

P3-4: `ctx` variable shadowing in `NewServer` disconnects conversation from caller context

File: lib/httpapi/server.go:280

ctx, cancel := context.WithCancel(context.Background()) shadows the function parameter ctx. NewPTY at line 259 receives the original ctx, but conversation.Start at line 308 receives the shadowed ctx. Rename to shutdownCtx, shutdownCancel for clarity.

P3-5: PID file has no stale-PID detection on startup

File: cmd/server/server.go:140-143,271-287

writePIDFile() silently overwrites any existing PID file without checking if the PID references a running process. After a crash (which skips defer cleanupPIDFile), scripts relying on the PID file could target the wrong process. Consider checking if the old PID is alive on startup and logging a warning.

P3-6: Restored prompt loses terminal formatting (hidden parts)

File: lib/screentracker/pty_conversation.go:576,660-664

The saved InitialPrompt is built from buildStringFromMessageParts, which calls .String() on each part. For MessagePartText with Hidden: true, .String() returns "". So the saved prompt loses hidden escape sequences (bracketed paste mode, echo prevention). On restore, the prompt is sent as plain text without terminal formatting. This only matters in the rare case where state is saved before the initial prompt was sent, but it's worth documenting as a known limitation.

P4-1: `SaveState()` holds lock during all filesystem I/O

File: lib/screentracker/pty_conversation.go:553-612

Acknowledged by previous reviewer as acceptable for now. The lock scope could be narrowed: copy data under lock, release, then do I/O.

P4-2: Windows `syscall.SIGTERM` is effectively dead code

File: cmd/server/signals_windows.go:20

Only os.Interrupt is meaningful on Windows. syscall.SIGTERM is harmless but misleading.

Questions

State file format stability: Is this the right time to nail down the wire format (add JSON tags to ConversationMessage, validate version)? Or is the format expected to change frequently during development?
adjustScreenAfterStateLoad approach: The strings.Replace + "return last saved message" strategy is fragile. Has real-agent testing (actual Claude/Aider output, not echo) validated this approach works reliably?
Empty initial prompt edge case: Is there a valid scenario where InitialPrompt is empty in the saved state? If so, should the restore path be a no-op?

Test Coverage Observations

adjustScreenAfterStateLoad() has no direct unit tests. It's only exercised indirectly, and with assert.Contains rather than assert.Equal, which masks bugs in screen adjustment.
EmitError / EventTypeError has zero unit test coverage (no tests for broadcasting, replay to new subscribers, or accumulation).
The testEmitter in unit tests silently discards errors via EmitError(_ string, _ string) {}, so error emissions from loadStateLocked are never verified.
E2E tests use time.Sleep for synchronization (lines 142, 158, 206, 225, 272), which is fragile on slow CI. Polling for expected state would be more robust.
Runtime CLI validation (--load-state requires --state-file) is not tested because runServer calls os.Exit(1).

Suggested Validation Plan

go test ./...
go test -race ./lib/screentracker/... ./lib/httpapi/... ./cmd/server/...

Manual:

Start with --state-file /tmp/test-state.json, send messages, send SIGUSR1, verify state file
Restart with same flags, verify conversation restored
Restart with different --initial-prompt, verify new prompt sent
Kill with SIGKILL, verify previous save is intact
Inspect state file JSON: check field casing
Connect second SSE client after state load failure, verify error received

cmd/server/server.go

mafredri · 2026-02-20T09:17:39Z

cmd/server/server.go

+	// Stop the HTTP server
+	shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+	defer cancel()
+	if err := srv.Stop(shutdownCtx); err != nil {
+		logger.Error("Failed to stop HTTP server", "error", err)
+	}


We could consider moving this into case <-gracefulCtx.Done(): above? I'm guessing we don't have to call srv.Stop if we receive on serverErrCh.

Does stop error if the server already closed? If yes, we'll end up printing a misleading error here.

EDIT: I looked at Stop and the once does guard against multiple Stops, but not against Stop producing an error if the server closed before?

We return early if we receive anything(non nil) on serverErrCh

select { case err := <-serverErrCh: if err != nil { return xerrors.Errorf("failed to start server: %w", err)

I looked at Stop and the once does guard against multiple Stops, but not against Stop producing an error if the server closed before?

True, I had an alternative in mind, but I decided to proceed with once here for the above-mentioned reason. I'll add a check in Stop
I'll check for this ErrServerClosed, and return nil in that case.

From docs Once Shutdown has been called on a server, it may not be reused; future calls to methods such as Serve will return ErrServerClosed.

mafredri · 2026-02-20T09:19:42Z

cmd/server/server.go

+	}
+
+	// Close the process
+	if err := process.Close(logger, 5*time.Second); err != nil {


What's the result of Close if the process already exited? Should we check processExitCh first?

Moved this to the select-case block below.

select { case err := <-processExitCh: if err != nil { return xerrors.Errorf("agent exited with error: %w", err) } default: // Close the process if err := process.Close(logger, 5*time.Second); err != nil { logger.Error("Failed to close process cleanly", "error", err) } }

e2e/echo_test.go

lib/screentracker/pty_conversation.go

mafredri · 2026-02-20T09:53:20Z

lib/screentracker/pty_conversation.go

+
+		// Enqueue initial prompt once after agent is ready (and after state is potentially loaded)
+		if c.initialPromptReady && len(c.cfg.InitialPrompt) > 0 && !c.initialPromptSent {
+			c.outboundQueue <- outboundMessage{parts: c.cfg.InitialPrompt, errCh: nil}


Does this play nicely with the logic around userSentMessageAfterLoadState? If possible, I'd like to see that state flag removed as it adds a bit of confusion about how the whole pipeline interacts.

The scenario I have in mind: Resume with a new "initial prompt", what happens? Will the new prompt be sent? Will it bypass userSentMessageAfterLoadStat?

Resume with a new "initial prompt", what happens? Will the new prompt be sent? Will it bypass userSentMessageAfterLoadStat

Yes, it will be sent (given that it's not the same as the initialPrompt in the stateFile).

No, it won't bypass userSentMessageAfterLoadState. The userSentMessageAfterLoadState is set in sendMessage, and sendMessage is called for everything that comes in via outboundQueue channel (this is done in a go func in Start).

lib/screentracker/pty_conversation.go

mafredri · 2026-02-20T10:04:35Z

lib/httpapi/server.go

 				s.logger.Error("Failed to send event", "subscriberId", subscriberId, "error", err)
 				return
 			}
+		case <-s.shutdownCtx.Done():


Potential future enhancement, if we set the http server base context to shutdownCtx, we only need to handle the request context as it will inherit from the base context. (Needs verification if there's a benefit here though.)

e2e/echo_test.go

johnstcn and others added 5 commits January 22, 2026 13:56

chore(lib): extract Conversation interface

e3bd936

Merge branch 'main' into cj/refactor-conversation

e5f1bda

feat: implement state persistence

a0f8bb5

feat: pid file writing and clearing and improved error handling for l…

ca3cdff

…oad state

refactor: remove redundant save logic

1c224e9

35C4n0r self-assigned this Feb 1, 2026

35C4n0r marked this pull request as ready for review February 1, 2026 15:52

35C4n0r marked this pull request as draft February 1, 2026 15:52

feat: improve logic for first run with empty state file

30f82d7

35C4n0r changed the base branch from main to cj/refactor-conversation-orig February 3, 2026 08:51

35C4n0r marked this pull request as ready for review February 3, 2026 08:54

35C4n0r marked this pull request as draft February 3, 2026 08:54

35C4n0r closed this Feb 3, 2026

35C4n0r reopened this Feb 3, 2026

feat: implement platform-specific signal handling

12bed1c

mafredri self-requested a review February 3, 2026 13:09

mafredri reviewed Feb 3, 2026

View reviewed changes

mafredri mentioned this pull request Feb 4, 2026

Registry: AgentAPI base module shutdown script and state persistence coder/internal#1257

Open

7 tasks

35C4n0r changed the base branch from cj/refactor-conversation-orig to main February 4, 2026 10:52

blinkagent bot mentioned this pull request Feb 4, 2026

Improve agentapi module abstraction for configuration and flags coder/registry#696

Open

35C4n0r added 2 commits February 5, 2026 18:11

feat: refactor cfg -> Config and move pid ops to server

e366e8b

feat: unregister the signal handlers on teardown

26fdf81

mafredri requested a review from SasSwart February 10, 2026 13:38

35C4n0r added 5 commits February 16, 2026 15:23

Merge branch 'main' into 35C4n0r/agentapi-state-persistence

021e33f

# Conflicts: # cmd/server/server.go # lib/httpapi/server.go # lib/screentracker/conversation.go # lib/screentracker/pty_conversation.go # lib/screentracker/pty_conversation_test.go

feat: resolve conflicts and improve shutdown sequence

5795db7

Merge branch 'main' into 35C4n0r/agentapi-state-persistence

b44fe5d

# Conflicts: # lib/httpapi/server.go # lib/screentracker/pty_conversation.go

feat: resolve conflicts

9deab88

chore: not dirty after load state

18fb1e4

35C4n0r marked this pull request as ready for review February 17, 2026 09:43

feat: add tests

b719dac

35C4n0r requested a review from mafredri February 17, 2026 10:48

35C4n0r added 2 commits February 17, 2026 16:24

feat: remove comment

3959002

feat: remove comments

7e389d2

mafredri reviewed Feb 18, 2026

View reviewed changes

mafredri requested a review from johnstcn February 18, 2026 09:45

johnstcn reviewed Feb 18, 2026

View reviewed changes

35C4n0r added 6 commits February 18, 2026 22:29

wip: address comments

1d7aaed

feat: remove anti-pattern for graceful shutdown

058b18f

feat: remove additional message upon load state fail

2565a3c

wip: apply suggestions from cian

1033cd7

wip: apply suggestions from cian

cfb7601

feat: update tests

9d7eb5a

35C4n0r added 2 commits February 19, 2026 16:16

feat: improved initial prompt handling

759ec53

chore: comments

03c6f16

johnstcn reviewed Feb 19, 2026

View reviewed changes

35C4n0r added 3 commits February 19, 2026 16:44

chore: address cian's file permission comments

bd75240

feat: implement error handling for agent events

b1ab615

fix: no screen adjustment in case of loadState failure

31d27a7

feat: add three e2e tests for statePersistence

220d360

35C4n0r requested review from johnstcn and mafredri February 20, 2026 07:44

mafredri reviewed Feb 20, 2026

View reviewed changes

feat: address maf's review

eef927d

35C4n0r requested a review from mafredri February 20, 2026 16:18

Comments

Conversation

35C4n0r commented Jan 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Feb 3, 2026

Uh oh!

mafredri left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mafredri left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

35C4n0r commented Feb 19, 2026

Uh oh!

mafredri commented Feb 19, 2026

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

35C4n0r commented Feb 19, 2026

Uh oh!

mafredri commented Feb 20, 2026

Deep Code Review

P2-1: ConversationMessage lacks JSON struct tags — fragile state file format

P2-2: No version validation on state load

P2-3: initialPromptSent logic silently drops prompt for same-prompt restore with unsent state

P2-4: adjustScreenAfterStateLoad echoes user message as agent response

P2-5: EmitError uses untyped string for level, inconsistent with codebase enum pattern

P3-1: Empty restored prompt can trigger sending empty content to agent

P3-2: errors slice in EventEmitter grows unboundedly

P3-3: ErrorBody uses time.Now() instead of quartz clock

P3-4: ctx variable shadowing in NewServer disconnects conversation from caller context

P3-5: PID file has no stale-PID detection on startup

P3-6: Restored prompt loses terminal formatting (hidden parts)

P4-1: SaveState() holds lock during all filesystem I/O

P4-2: Windows syscall.SIGTERM is effectively dead code

Questions

Test Coverage Observations

Suggested Validation Plan

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

35C4n0r commented Jan 31, 2026 •

edited

Loading

P2-1: `ConversationMessage` lacks JSON struct tags — fragile state file format

P2-3: `initialPromptSent` logic silently drops prompt for same-prompt restore with unsent state

P2-4: `adjustScreenAfterStateLoad` echoes user message as agent response

P2-5: `EmitError` uses untyped `string` for level, inconsistent with codebase enum pattern

P3-2: `errors` slice in `EventEmitter` grows unboundedly

P3-3: `ErrorBody` uses `time.Now()` instead of quartz clock

P3-4: `ctx` variable shadowing in `NewServer` disconnects conversation from caller context

P4-1: `SaveState()` holds lock during all filesystem I/O

P4-2: Windows `syscall.SIGTERM` is effectively dead code