fix(hermes): inject L3 persona into system_prompt_block by ferminquant · Pull Request #206 · TencentCloud/TencentDB-Agent-Memory

ferminquant · 2026-06-13T11:52:39Z

Description

The Hermes provider's system_prompt_block() returned a static string and ignored the L3 persona that the Gateway had already generated and exposed via the core's auto-recall pipeline. The cost of computing L3 was paid on every run, but the agent never saw the result.

This wires the existing data path end-to-end: the Gateway's /recall response now carries recalledL3_persona, and the Hermes provider fetches it (with a short TTL cache) and appends it to the system prompt as a ## User Persona block when present.

Refs #205 (partial — fixes the main bullet only; sub-bullets 1 and 2 are tracked separately and intentionally out of scope for this PR)

Change Type

Bug fix
New feature
Documentation update
Code optimization

Root Cause

Two missing links in the L3 delivery chain:

Gateway side — src/gateway/server.ts:handleRecall built the RecallResponse from appendSystemContext and the L1 count, but never copied result.recalledL3Persona (which the core had already populated in src/core/hooks/auto-recall.ts:238) into the wire response. The field existed in the core's RecallResult type and was being thrown away at the HTTP boundary.
Provider side — hermes-plugin/memory/memory_tencentdb/__init__.py:system_prompt_block returned a hard-coded string describing the four layers. Even if the Gateway had returned the persona, the provider would have ignored it. Verified by reading the function and by adding the negative-control test (TestNegativeControl::test_pre_fix_static_block_does_not_contain_persona), which would have failed on main before this fix.

Changes

src/gateway/types.ts — add optional recalledL3_persona?: string | null to RecallResponse. Documented as undefined-tolerant so older clients keep working.
src/gateway/server.ts — populate the new field in handleRecall from result.recalledL3Persona ?? null. One-line change.
hermes-plugin/memory/memory_tencentdb/__init__.py — system_prompt_block() now fetches /recall (empty query, user_id-keyed), caches the result for 60s, and appends a ## User Persona section when non-empty. Cache key is implicit on the provider instance, which Hermes re-creates per session, so a session switch cannot serve stale persona to the wrong scope.
- __init__ gains self._persona_cache: Dict[str, Any].
- New helpers _get_cached_persona() and _fetch_persona(). Both swallow exceptions, so system_prompt_block() never raises even though it can now trigger a network call.
hermes-plugin/memory/memory_tencentdb/tests/test_l3_persona_injection.py — 9 new tests:
- 4 for the core behavior (static fallback, persona injected, empty-string treated as absent, null treated as absent)
- 3 for the cache (TTL hits, TTL expires, no-cache-on-failure for empty cache)
- 1 for "Gateway died mid-session" (fall back to cached value when a later fetch fails)
- 1 negative control (the regression test that would have failed on main)
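The prompt-assembly behavior covered by the four core tests can be sketched as below. This is a minimal illustration, not the PR's code: `render_system_prompt_block` and `STATIC_BLOCK` are hypothetical names, the static text is a placeholder, and treating whitespace-only personas as absent is an assumption beyond what the description states.

```python
from typing import Optional

# Placeholder only; the real static block describing the four layers is not reproduced here.
STATIC_BLOCK = "Memory layers: (static description)"


def render_system_prompt_block(static_block: str, persona: Optional[str]) -> str:
    """Append a '## User Persona' section only when a non-empty persona is available."""
    # None, "" and (by assumption) whitespace-only strings all mean "no persona".
    if not isinstance(persona, str) or not persona.strip():
        return static_block
    return f"{static_block}\n\n## User Persona\n{persona.strip()}"
```

With no persona the function returns the static block unchanged, which is the fallback the first, third, and fourth core tests describe.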

Cache Semantics

| Scenario | Behavior |
| --- | --- |
| First call, Gateway up, persona present | Fetch, cache for 60s, return persona |
| First call, Gateway up, no persona yet | Fetch, cache empty for 60s, return static block |
| First call, Gateway down | Return static block, do not advance `ts`; next call retries |
| Subsequent call within TTL, Gateway up | Return cached value, no network call |
| Subsequent call within TTL, Gateway down | Return cached value (no network) |
| Subsequent call after TTL, Gateway down | Return previous cached value, do not advance `ts` |

The asymmetry on failure (don't advance ts on the first failure, but do fall back to a stale value on later failures) is intentional: a fresh-empty cache with a sick Gateway should not be stuck for 60 seconds before retrying; a warm cache with a sick Gateway should not retry on every turn.
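The scenarios in the table can be sketched as a small TTL cache. This is an illustration under stated assumptions rather than the PR's implementation: `PersonaCache` is a hypothetical class, the injected `fetch` callable is assumed to raise when the Gateway is unreachable, and the clock is injectable only so the behavior can be exercised without sleeping. The sketch follows the table literally, so once the TTL has expired a failing Gateway is retried on each call until a fetch succeeds.

```python
import time
from typing import Callable, Optional


class PersonaCache:
    """TTL cache following the scenario table (hypothetical helper, not the PR's code)."""

    def __init__(
        self,
        fetch: Callable[[], Optional[str]],
        ttl_secs: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_secs
        self._clock = clock
        self._value: Optional[str] = None
        self._ts: Optional[float] = None  # time of the last successful fetch

    def get(self) -> Optional[str]:
        now = self._clock()
        if self._ts is not None and now - self._ts < self._ttl:
            return self._value  # still fresh: no network call
        try:
            value = self._fetch()
        except Exception:
            # Failure: keep whatever we had and leave ts untouched,
            # so the next call tries again.
            return self._value
        self._value = value or None  # "" and None both mean "no persona yet"
        self._ts = now
        return self._value
```

A `None` result tells the caller to use the static block; a successful fetch that yields no persona is still cached for the full TTL, matching the second row of the table.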

Self-test Checklist

Verified locally
No existing features affected

hermes-plugin/memory/memory_tencentdb/tests/test_l3_persona_injection.py .......  9 passed
hermes-plugin/memory/memory_tencentdb/tests/test_memory_tencentdb_recovery.py ......  17 passed
hermes-plugin/memory/memory_tencentdb/tests/test_gateway_shutdown_leak.py ...  2 failed, 16 passed (pre-existing on main, not caused by this PR)

Pre-existing failures: test_gateway_shutdown_leak.py fails on main with 'NoneType' object has no attribute 'client' from __init__.py:785 (now line 793 after this PR). Confirmed by stashing this branch's changes and running the test on main — same failure, same line. Not in scope for #205.

Additional Notes

On the field name recalledL3_persona: I deliberately used snake_case (Python-style) on the wire, matching the existing Python client's reading style and the rest of the RecallResponse keys (memory_count, session_key). The core's internal field is camelCase (recalledL3Persona) — server.ts is the boundary that translates.

On query="" in _fetch_persona: The persona is keyed on user_id and the L3 pipeline runs on its own schedule, so the query field is irrelevant for the persona portion of the response. The Gateway's /recall handler accepts an empty query and will return whatever persona content the L3 pipeline has produced. If you'd rather have a no-op-only endpoint, the cleanest follow-up is a dedicated GET /persona route — happy to do that as a follow-up PR if you want, but it's not required for this fix.
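The empty-query fetch and the tolerant read of the new field might look roughly like this. It is a sketch only: `fetch_persona` and the `post` transport callable are hypothetical stand-ins for the provider's existing /recall client, and the request payload shape (`query`, `user_id`) is assumed from the description rather than taken from the code.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical transport signature: (path, json_payload) -> parsed JSON response.
PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def fetch_persona(post: PostFn, user_id: str) -> Tuple[bool, Optional[str]]:
    """Return (ok, persona); ok is False only when the call itself failed."""
    try:
        # The query is left empty because the persona is keyed on user_id.
        response = post("/recall", {"query": "", "user_id": user_id})
    except Exception:
        return False, None  # swallow the error; the caller decides what to fall back to
    # .get() tolerates older Gateways that do not send the field at all.
    persona = response.get("recalledL3_persona") if isinstance(response, dict) else None
    if isinstance(persona, str) and persona:
        return True, persona
    return True, None  # missing, null, or empty all mean "no persona available"
```

Returning the `ok` flag separately keeps "the Gateway answered but has no persona" distinct from "the Gateway could not be reached", which is the distinction the cache semantics rely on.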

On the 60s TTL: Arbitrary; the L3 pipeline regenerates every 50 new L1 memories by default, so 60s is a comfortable lower bound. Configurable via self._PERSONA_CACHE_TTL_SECS if you want a different default — left as a class attribute rather than a config-schema change to keep this PR narrow.

The Hermes provider's system_prompt_block() returned a static string describing the memory layers, ignoring the L3 persona that the Gateway had already generated and surfaced via the core's auto-recall pipeline. L3 compute was paid but the agent never saw the result, so every conversation started from cold on the persona dimension.

This change wires the existing data path end-to-end:

* The Gateway's /recall response now includes recalledL3_persona, populated from the auto-recall pipeline (already populated on the core side, just not exposed over the wire).
* The Hermes provider's system_prompt_block() fetches it via the existing /recall client, caches for 60s keyed on the provider instance (which Hermes re-creates per session), and appends it to the system prompt as a '## User Persona' block when present.
* On the very first call, a failed /recall leaves the cache empty and the next call retries, so a Gateway that is down at startup does not poison the cache. Once a fetch succeeds, subsequent failures within the TTL fall back to the cached value rather than retrying on every turn.

No new external dependencies. No new public API surface beyond the one new optional field on the Gateway /recall response. Older Gateways that predate the field are tolerated: clients read the field with .get() and treat its absence as 'no persona available'.

Refs TencentCloud#205 (partial: fixes the main bullet only; sub-bullets 1 and 2 are tracked separately and out of scope for this PR)

Signed-off-by: Fermin Quant <ferminquant@users.noreply.github.com>

Maxwell-Code07 · 2026-06-13T14:38:37Z

Thanks for the contribution! We'll review the L3 persona injection approach.

ferminquant force-pushed the fix/205-hermes-l3-persona-system-prompt branch from 3332df1 to e18283b Compare June 13, 2026 12:06

ferminquant mentioned this pull request Jun 13, 2026: fix(hermes): mirror builtin memory writes through on_memory_write #207 (Open, 6 tasks)

ferminquant wants to merge 1 commit into TencentCloud:main from ferminquant:fix/205-hermes-l3-persona-system-prompt
