fix(hermes): mirror builtin memory writes through on_memory_write#207

Open
ferminquant wants to merge 1 commit into TencentCloud:main from ferminquant:fix/205-sub1-hermes-on-memory-write

@ferminquant

Description

The provider's on_memory_write hook was a no-op with a TODO. The intent — feed Hermes's builtin MEMORY.md / USER.md writes into the L1 index so dedup and L3 persona building can see them — was scaffolded but never wired. The two memory systems ran in parallel with no cross-pollination.

This routes the builtin write through client.capture() as a synthetic turn. The L1 extractor sees the content, the dedup layer can match against it, and L3 picks it up on the next persona refresh.

Refs #205 (sub-bullet 1 only; sub-bullet 2 is out of scope)

Change Type

  • Bug fix
  • New feature
  • Documentation update
  • Code optimization

Root Cause

plugin.yaml advertises two hooks:

hooks:
  - on_memory_write
  - on_session_end

The provider's on_session_end is implemented (it calls client.end_session() to flush the L1/L2/L3 pipeline on session end). The provider's on_memory_write was:

def on_memory_write(self, action: str, target: str, content: str) -> None:
    """Mirror built-in memory writes to memory-tencentdb for indexing."""
    # TODO: Implement mirroring of Hermes builtin MEMORY.md/USER.md writes
    # to memory-tencentdb's recall index for conflict suppression and dedup.
    pass

The TODO comment is the giveaway — unfinished work, not a deliberate "do nothing." The hook was scaffolded and abandoned. Effect:

  • A user preference Hermes writes to USER.md is invisible to the TencentDB L1 index.
  • The L1 dedup layer cannot catch a duplicate ("did we already record this preference?").
  • The L3 persona generator never sees the preference, because L3 is built from L1.
  • A user who repeats the same preference the next day gets a full LLM call instead of a recall.

The net effect is that users pay for the L1/L2/L3 compute while the builtin memory layer is also writing, and the two systems run in parallel with no cross-pollination: the worst of both worlds. The issue reporter flagged this as sub-bullet 1 of #205.

Changes

  • hermes-plugin/memory/memory_tencentdb/__init__.py — replace the pass body with a fire-and-forget call to client.capture(). The write is encoded as a synthetic turn with a (memory_write action=… target=…) prefix on user_content, which gives the L1 extractor enough signal to treat it as a memory write and a future LLM-side filter a marker to recognize and skip these entries during scene extraction.
  • hermes-plugin/memory/memory_tencentdb/tests/test_on_memory_write_mirror.py — 5 new tests:
    • test_writes_are_forwarded_to_capture — happy path: action, target, and content all reach client.capture() in the right shape, with session/user context threaded through
    • test_no_capture_when_gateway_unavailable — Gateway down → no call, no exception
    • test_no_capture_when_client_is_none — supervisor never started → no call, no exception
    • test_capture_failure_is_swallowed — capture raises → caller never sees the exception
    • test_pre_fix_hook_did_nothing — negative control: the test would have failed on main because a pass body never calls client.capture()

No new public API surface. No client-side change (the existing client.capture() is the right primitive). No gateway-side change. No new dependencies.

Defensive shape

The hook is fire-and-forget for two reasons:

  1. It is an event sink, not a request handler. Hermes's builtin layer is the authority for these writes — it has already updated MEMORY.md / USER.md by the time the hook fires. We are mirroring, not replicating. A mirror failure is a soft-loss of cross-system visibility, not a write failure.
  2. Raising would mislead the caller. If on_memory_write raises, Hermes's host wiring may surface an error to the user, or retry, or roll back the builtin write. None of those are right — the user only ever asked Hermes to remember, and Hermes did remember.

The fallback is automatic: the same content will re-surface through normal sync_turn() capture the next time it comes up in conversation, and the L1 extractor will dedup or extend as appropriate.
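
The fire-and-forget shape described above, combined with the prefix encoding from the Changes section, could look roughly like the sketch below. This is not the code in the diff: `ProviderSketch`, the `gateway_available` flag, and the keyword arguments passed to `capture()` (`assistant_content`, `session_id`, `user_id`) are assumptions made for illustration. Only the `on_memory_write(action, target, content)` signature, the `user_content` field, and the `(memory_write action=… target=…)` prefix come from this PR.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


class ProviderSketch:
    """Hypothetical stand-in for the provider; not the real class."""

    def __init__(
        self,
        client: Optional[Any],
        gateway_available: bool = True,
        session_id: str = "",
        user_id: str = "",
    ) -> None:
        self._client = client
        self._gateway_available = gateway_available  # assumed health flag
        self._session_id = session_id
        self._user_id = user_id

    def on_memory_write(self, action: str, target: str, content: str) -> None:
        """Mirror a builtin memory write into L1 without ever raising."""
        client = self._client
        if client is None or not self._gateway_available:
            # Supervisor never started or Gateway down: skip silently.
            return
        # Marker prefix so downstream code can recognize synthetic turns.
        user_content = f"(memory_write action={action} target={target}) {content}"
        try:
            # Keyword names below are assumed, not the confirmed capture() API.
            client.capture(
                user_content=user_content,
                assistant_content="",
                session_id=self._session_id,
                user_id=self._user_id,
            )
        except Exception as exc:
            # The builtin write already succeeded; a mirror failure is a soft loss.
            logger.debug("on_memory_write mirror failed: %s", exc)
```

Catching the broad `Exception` is deliberate in this sketch: the hook is an event sink, so any failure in the mirror path is logged at debug level rather than propagated to Hermes.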

Self-test Checklist

  • Verified locally
  • No existing features affected
hermes-plugin/memory/memory_tencentdb/tests/test_on_memory_write_mirror.py .....  5 passed
hermes-plugin/memory/memory_tencentdb/tests/test_memory_tencentdb_recovery.py   16 passed

test_gateway_shutdown_leak.py was not run in this branch — pre-existing failures on main are unrelated to this hook (they concern supervisor teardown, not the L1 mirror path). Same scope discipline as #206.

Additional Notes

On the (memory_write …) prefix: The L1 extractor runs on the natural-turn framing by default. A synthetic entry with no marker could pollute the L1 store with what looks like a real conversation. The prefix gives downstream code a stable string to grep on if the natural-turn framing ever needs to be filtered out — e.g. if the L2 scene extractor starts grouping synthetic turns into scene blocks in unwanted ways. None of that filtering is implemented today; the marker is just future-proofing.
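
For illustration only, a downstream filter keyed on that marker might be sketched as follows. No such filter exists in this PR. The helper names `parse_mirrored_write` and `drop_mirrored_turns` are hypothetical, and the exact layout assumed here (single-token `action` and `target` values, then the content after the closing parenthesis) is a guess at the prefix format, not something the diff confirms.

```python
import re
from typing import List, NamedTuple, Optional


class MirroredWrite(NamedTuple):
    action: str
    target: str
    content: str


# Assumed layout: "(memory_write action=<token> target=<token>) <content>"
_MARKER = re.compile(
    r"^\(memory_write action=(?P<action>\S+) target=(?P<target>[^\s)]+)\)"
    r"\s*(?P<content>.*)$",
    re.DOTALL,
)


def parse_mirrored_write(user_content: str) -> Optional[MirroredWrite]:
    """Return the parsed marker fields, or None for a natural turn."""
    match = _MARKER.match(user_content)
    if match is None:
        return None
    return MirroredWrite(match["action"], match["target"], match["content"])


def drop_mirrored_turns(turns: List[str]) -> List[str]:
    """Keep only turns that do not carry the synthetic-write marker."""
    return [turn for turn in turns if parse_mirrored_write(turn) is None]
```

A scene extractor could call something like `drop_mirrored_turns()` before grouping turns, if synthetic entries ever start distorting scene blocks.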

On skipping the L3 inference cost: This PR does not call /recall from on_memory_write. That is intentional. The L1 dedup / extraction pipeline runs on a schedule (pipeline.everyNConversations, default every 5 turns), and L3 runs every 50 new L1 memories. The write goes into L1 immediately on capture; L3 picks it up on its own schedule. Calling /recall here would burn an LLM call per write with no benefit, since L3 has its own trigger.

On the test pattern: The test file mirrors the sys.path injection used by test_l3_persona_injection.py and test_memory_tencentdb_recovery.py — it works under a hermes-agent checkout and skips cleanly in a plugin-only repo. Five focused tests, all mock-based, no real network or process spawning.
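
A rough sketch of that "inject the path, skip if the host is missing" pattern is shown below, using only the standard library. It is not copied from the test files: `HOST_PACKAGE`, `inject_checkout_root`, `host_available`, and the marker-directory convention are all hypothetical, and the real tests may locate the hermes-agent checkout differently.

```python
import importlib.util
import sys
import unittest
from pathlib import Path
from typing import Optional

# Hypothetical name; the real host package may be called something else.
HOST_PACKAGE = "hypothetical_hermes_host"


def inject_checkout_root(start: Path, marker_dir: str) -> Optional[Path]:
    """Walk upward from `start`; put the first ancestor containing `marker_dir` on sys.path."""
    for candidate in [start, *start.parents]:
        if (candidate / marker_dir).is_dir():
            if str(candidate) not in sys.path:
                sys.path.insert(0, str(candidate))
            return candidate
    return None


def host_available(package: str = HOST_PACKAGE) -> bool:
    """True when the top-level host package can be imported from the current sys.path."""
    return importlib.util.find_spec(package) is not None


@unittest.skipUnless(host_available(), "host package not importable (plugin-only repo)")
class OnMemoryWriteMirrorTest(unittest.TestCase):
    def test_host_is_importable(self) -> None:
        # Real tests would build the provider with a mocked client here.
        self.assertTrue(host_available())
```

With this shape, a plugin-only checkout reports the class as skipped rather than failing at import time.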

The provider's on_memory_write hook was a no-op with a TODO. The
intent — feed Hermes's builtin MEMORY.md / USER.md writes into the
L1 index so dedup and L3 persona building can see them — was
scaffolded but never wired. The two memory systems ran in parallel
with no cross-pollination: a user preference the builtin layer
recorded was invisible to the L1 store, so a later conversation
about the same preference would not be deduped and would not show
up in the L3 persona the agent sees.

This change routes the builtin write through client.capture() as
a synthetic turn. The L1 extractor sees the content, the dedup
layer can match against it, and L3 picks it up on the next persona
refresh. The (memory_write action=... target=...) prefix lets a
future LLM-side filter recognize and skip these entries during
scene extraction if the natural-turn framing becomes a problem.

The hook is fire-and-forget: failures are swallowed at the debug
log level. The builtin write already succeeded in Hermes's own
store; a mirror failure is a soft-loss of cross-system visibility,
not a write failure, and the next sync_turn() in the same session
will eventually re-surface the same content through normal
conversation capture.

Refs TencentCloud#205 (sub-bullet 1 only; sub-bullet 2 is out of scope)

Signed-off-by: Fermin Quant <ferminquant@users.noreply.github.com>

@Maxwell-Code07

Collaborator

Thanks for the PR! We'll take a look at the on_memory_write hook implementation.
