fix(security): R89-167b — neutralize stored fields in MCP prompts + export formatters#48
Merged
Merged
Conversation
…xport formatters H R89-164h self-audit validated that _sanitize_inline (closed inject_claude_md in R89-132b) did NOT cover two other emit sinks. A threshold-promoted rule's pattern/explain were embedded RAW into agent-instruction context: - instinct_rules prompt (server.py) — MED-001 - instinct_suggestions prompt (server.py) — MED-002 - export_platform _fmt_* formatters — LOW-001 (disk-destined) A newline in a stored explain/pattern broke the bullet and injected an attacker-controlled instruction (indirect prompt injection, session-scoped, threshold-gated via observe 10/5). Fix = reuse the SAME sink defense (InstinctStore._sanitize_inline) at every emit point before embedding the field. Scope is neutralize-on-emit ONLY — detection/promotion logic and thresholds are untouched. category is also neutralized in the two formatters that render it (parity with inject_claude_md). 18 regression tests (red->green): newline/heading injection closed across both prompts and all 4 export formats; legit single-line explains render intact (FP guard). Full suite 214 passed, ruff clean, cursor-rules in sync.
3b7acc7 to
0e30940
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes a stored indirect prompt-injection gap surfaced by an R89-164h self-audit (supermemory-triggered) and validated against the hypothesis:
InstinctStore._sanitize_inline— the sink defense that closedinject_claude_mdin R89-132b (INSTINCT-M-001) — was not applied to two other emit sinks. A threshold-promoted rule'spattern/explainwere embedded raw into agent-instruction context.instinct_rulesMCP prompt (server.py)instinct_suggestionsMCP prompt (server.py)export_platform_fmt_claude_md/_fmt_cursorrules/_fmt_windsurfrules/_fmt_codex(store.py, disk-destined)Attack path (session-scoped, threshold-gated)
Fix
Reuse the same neutralization already proven for
inject_claude_md(
InstinctStore._sanitize_inline: collapse C0/C1 control chars incl. CR/LF/TAB → space,backtick →
', break<!--/-->fences) at every emit point, before embedding the field.categoryis also neutralized in the two formatters that render it (_fmt_claude_md,_fmt_windsurfrules) for parity with theinject_claude_mdsink (no-op for the validated/closed category enum; defense-in-depth vs. a directly-poisoned DB row).inject_claude_md(already fixed, INSTINCT-M-001) untouched. Pure tool-response queries (suggest/list/get/search/export_rules/export_skill/export_claude_md) are out of scope (bounded tool-response data; agent must explicitly act).Tests (red → green)
tests/test_prompt_injection_r89_167b.py— 18 regression tests:## SYSTEM:) injection closedVerify
pytest tests/→ 214 passed (was 201 + 13 newly-passing injection tests)ruff check src/ tests/→ cleanpython tools/sync_cursor_rules.py --check→ in sync (proves the formatter change is a no-op for legitimate data)Published-package security fix — version-bump / release are operator-gated.