Cross-cutting: chat-template backdoor classification (LLM03 ↔ LLM04) + horizontal runtime governance pattern (LLM03/04/06/08) #68
Whew! Thanks for the comprehensive write-up! Two quick reads, with the deeper conversation moving to Slack/our call on 5/20.

I agree with the chat-template cross-reference, in principle. Tom's threat-actor/entry-point lens looks right for vulnerabilities that span supply-chain origins and inference-time exploitation. @KeystoneSmartQuotes can you please attach the May 4 cross-reference document directly to this discussion so we're all reading from the same draft? One question worth raising... should the LLM03/LLM04 pointer name chat templates specifically, or generalize to bundled non-weight inference-time artifacts (templates plus ...)?

Now, for the horizontal runtime defense-in-depth pattern, I want to push back on anchoring four entries in LLM06 Mitigation 7. The more we anchor there, the more it looks like an Agentic Top 10 entry. Remember, our charter is "line in the sand is model-in-isolation vs. system-coordinating-components."

Async in Slack, be sure to tag me and Steve if you are looking for specific input from us (so many Slacks... so... so many Slacks...). If you want to sync up on a call before 5/20, happy to do so.
The cross-reference makes sense because chat-template backdoors sit at an awkward boundary: they are supply-chain artifacts, but their runtime effect is behavioral control. I would classify them with two linked labels rather than forcing a single home:
That lets LLM03 describe how the artifact enters the system, while LLM04 explains how the embedded instruction changes model behavior after loading. The same pattern also applies to poisoned tool descriptions, malicious skill files, and unsafe default system prompts.

For the runtime governance pattern, I would recommend one concrete control language across the cross-reference: separate trusted instructions, untrusted content, tool metadata, and generated outputs into explicit trust zones, then enforce action-boundary checks before side effects. That makes the mitigation actionable across LLM03, LLM04, LLM06, and LLM08 rather than only taxonomic.
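To make the trust-zone idea concrete, here is a minimal Python sketch of an action-boundary check. Everything in it is an assumption for illustration only: the names `TrustZone`, `ProposedAction`, and `check_action_boundary`, and the deliberately strict policy (block any side effect influenced by a zone other than trusted instructions), are hypothetical and are not taken from the LLM06 Mitigation 7 text or the May 4 document.

```python
from dataclasses import dataclass
from enum import Enum


class TrustZone(Enum):
    # Hypothetical zone names mirroring the four categories named above.
    TRUSTED_INSTRUCTION = "trusted_instruction"  # e.g. operator-authored system prompt
    UNTRUSTED_CONTENT = "untrusted_content"      # e.g. retrieved documents, user uploads
    TOOL_METADATA = "tool_metadata"              # e.g. tool descriptions, bundled templates
    GENERATED_OUTPUT = "generated_output"        # text produced by the model itself


@dataclass(frozen=True)
class ProposedAction:
    name: str
    has_side_effect: bool
    influenced_by: frozenset  # set of TrustZone values that shaped this action


def check_action_boundary(action):
    """Return (allowed, reason) for a proposed action.

    Assumed policy for this sketch: actions without side effects always
    pass; actions with side effects pass only if every influencing zone
    is TRUSTED_INSTRUCTION.
    """
    if not action.has_side_effect:
        return True, "no side effect"
    non_trusted = action.influenced_by - {TrustZone.TRUSTED_INSTRUCTION}
    if non_trusted:
        names = ", ".join(sorted(zone.value for zone in non_trusted))
        return False, f"side effect influenced by non-trusted zones: {names}"
    return True, "only trusted instructions involved"


if __name__ == "__main__":
    read_only = ProposedAction(
        "summarize", False, frozenset({TrustZone.UNTRUSTED_CONTENT})
    )
    send = ProposedAction(
        "send_email",
        True,
        frozenset({TrustZone.TRUSTED_INSTRUCTION, TrustZone.TOOL_METADATA}),
    )
    print(check_action_boundary(read_only))
    print(check_action_boundary(send))
```

A real deployment would likely route a blocked action to review or approval rather than refusing it outright. The point of the sketch is only that the check runs at the action boundary, after zones have been tracked and before the side effect happens.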
Following the April 29 thread discussion in #team-genai-top-10-llm-llm04 (with Mark Roxberry, Anitha Dakamarri, Ariel Fogel, and Tom Kompare on chat-template scope) and the May 4 horizontal cross-reference document I circulated to that channel ("Action-Boundary Policy Evaluation: A Cross-Cutting Cross-Reference for LLM 2026"), opening this thread for Sprint 2 cross-linking discussion.
Two cross-cutting items:
The LLM04 merged Description correctly captures chat-template backdoors as a bundled-non-weight-artifact concern under the persistence axis. The LLM03 entry covers the supply-chain origin (distributors, registries, model files). Per Tom Kompare's threat-actor / entry-point decomposition: chat-template backdoors are jointly classified — supply-chain origin (LLM03) plus inference-time persistence (LLM04). Readers arriving at either entry would benefit from a brief reciprocal pointer to the other.
The LLM04 substantive deep-dive should stay where it is. A short cross-reference language pair (LLM03 to LLM04 for "see LLM04 for inference-pipeline persistence detail" and LLM04 to LLM03 for "see LLM03 for supply-chain origin context") preserves entry navigation without duplication.
Several entries describe the same architectural defense pattern from different angles:
The horizontal pattern is consistent. Keeping the canonical control body in LLM06 Mitigation 7 and having LLM03/04/08 reference it via short cross-reference language (rather than each entry duplicating the control description) preserves consistency and avoids fragmentation.
Cross-reference language proposals for each direction are in the document circulated to the LLM04 channel May 4. Happy to share the draft document directly if helpful for Sprint 2 working group review.
Evidence and prior context: