-
Notifications
You must be signed in to change notification settings - Fork 52
fix(templates): give TypeScript templates per-session in-process short-term memory #1666
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
21266e3
ccb52f3
1540ed6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -53,18 +53,35 @@ async function getOrCreateAgent(sessionId: string, actorId: string): Promise<Age | |
| return agent; | ||
| } | ||
| {{else}} | ||
| let cachedAgent: Agent | null = null; | ||
| const AGENT_CACHE_LIMIT = 128; | ||
|
|
||
| async function getOrCreateAgent(): Promise<Agent> { | ||
| if (!cachedAgent) { | ||
| const model = await loadModel(); | ||
| cachedAgent = new Agent({ | ||
| model, | ||
| systemPrompt: SYSTEM_PROMPT, | ||
| tools, | ||
| }); | ||
| // Reuses one Agent per sessionId so each session keeps its own in-process | ||
| // conversation history (best-effort; resets on cold start). A Map preserves | ||
| // insertion order, so it doubles as an LRU bounded to 128 sessions — a local | ||
| // dev process serving many sessions cannot leak history between them or grow | ||
| // without bound. On AgentCore Runtime each microVM serves a single session, so | ||
| // this holds one entry. For durable history, attach memory. | ||
| const agentCache = new Map<string, Agent>(); | ||
|
|
||
| async function getOrCreateAgent(sessionId: string): Promise<Agent> { | ||
| const existing = agentCache.get(sessionId); | ||
| if (existing) { | ||
| agentCache.delete(sessionId); | ||
| agentCache.set(sessionId, existing); | ||
| return existing; | ||
| } | ||
| if (agentCache.size >= AGENT_CACHE_LIMIT) { | ||
| const oldest = agentCache.keys().next().value; | ||
| if (oldest !== undefined) agentCache.delete(oldest); | ||
| } | ||
| return cachedAgent; | ||
| const model = await loadModel(); | ||
| const agent = new Agent({ | ||
| model, | ||
| systemPrompt: SYSTEM_PROMPT, | ||
| tools, | ||
| }); | ||
| agentCache.set(sessionId, agent); | ||
| return agent; | ||
| } | ||
| {{/if}} | ||
|
|
||
|
|
@@ -76,7 +93,8 @@ const app = new BedrockAgentCoreApp({ | |
| const actorId = getActorId(payload, context); | ||
| const agent = await getOrCreateAgent(sessionId, actorId); | ||
| {{else}} | ||
| const agent = await getOrCreateAgent(); | ||
| const sessionId = context?.sessionId ?? 'default-session'; | ||
| const agent = await getOrCreateAgent(sessionId); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Mid-stream failure on a cached Strands Looking at // Normalize input and append user messages on first invocation only
if (currentArgs !== undefined) {
const messagesToAppend = this._normalizeInput(currentArgs);
for (const message of messagesToAppend) {
yield this._appendMessage(message, invocationState); // <-- user msg committed here
}
...
}
...
const modelResult = yield* this._invokeModel(invocationState, structuredOutputChoice);The outer Before this PR, A few ways to fix:
Option 1 matches the SDK's own "keep Worth verifying by reproducing the same way the VercelAI case was: force a provider error on turn 1 (e.g. invalid model id, or temporarily throttled), then send turn 2 on the same
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 1540ed6 with Option 1 (snapshot/restore via the SDK's I reproduced it as you suggested and want to share the nuance I found, because it changes the severity (not the fix): The dangling user turn is real — after a forced turn-1 failure on a cached agent, But whether it actually breaks is provider-dependent:
Since the template ships all four providers, I fixed it regardless rather than rely on Bedrock's leniency. The guard snapshots const snapshot = agent.takeSnapshot({ include: ['messages'] });
try {
for await (const event of agent.stream(payload.prompt ?? '')) { /* … */ }
} catch (error) {
agent.loadSnapshot(snapshot);
throw error;
}Verified end-to-end through the |
||
| {{/if}} | ||
|
|
||
| {{#if hasMemory}} | ||
|
|
@@ -97,14 +115,26 @@ const app = new BedrockAgentCoreApp({ | |
| await agent.memoryManager?.flush(); | ||
| } | ||
| {{else}} | ||
| for await (const event of agent.stream(payload.prompt ?? '')) { | ||
| if ( | ||
| event.type === 'modelStreamUpdateEvent' && | ||
| event.event?.type === 'modelContentBlockDeltaEvent' && | ||
| event.event.delta?.type === 'textDelta' | ||
| ) { | ||
| yield { data: event.event.delta.text }; | ||
| // Snapshot history before streaming so a failed turn can be rolled back. | ||
| // Agent.stream() appends the user message before invoking the model; on a | ||
| // mid-stream error that user turn would otherwise linger in the cached | ||
| // agent, and the next turn for this session would send consecutive user | ||
| // messages (rejected by providers that require strict role alternation, | ||
| // e.g. Anthropic). Restoring on error keeps the session reusable. | ||
| const snapshot = agent.takeSnapshot({ include: ['messages'] }); | ||
| try { | ||
| for await (const event of agent.stream(payload.prompt ?? '')) { | ||
| if ( | ||
| event.type === 'modelStreamUpdateEvent' && | ||
| event.event?.type === 'modelContentBlockDeltaEvent' && | ||
| event.event.delta?.type === 'textDelta' | ||
| ) { | ||
| yield { data: event.event.delta.text }; | ||
| } | ||
| } | ||
| } catch (error) { | ||
| agent.loadSnapshot(snapshot); | ||
| throw error; | ||
| } | ||
| {{/if}} | ||
| }, | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
qq: do we need to do any clean up here aside from just purging oldest agent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question — no, purging the oldest agent is sufficient; there's nothing else to tear down.
I checked what a Strands TS
Agentholds and what the template gives it:Agenthas noclose/dispose/[Symbol.asyncDispose]— it's a plain object holding in-memory state (messages,modelState, the tool/conversation registries). Dropping the last reference makes it GC-eligible; there are no per-agent sockets, file handles, or timers to release.const mcpClients = [...],const tools = [...]at top scope), shared by reference across every cached agent. They're intentionally not per-agent, so eviction must not close them (other live sessions are still using them); they persist for the process lifetime by design.So eviction is just
deleteof the map entry, matching the Pythonagent_factory(cache.popitem()with no teardown). If a future change gave each agent its own MCP client, that's when we'd need a disposal hook — but that's not the case here.