Symptom
Text-only personas (e.g. CodeReview AI without a vision model) hallucinate "absence of images or attachments" when an image IS attached to the user's message. Empirical hit on PR #950 / Linux/CUDA at HEAD 056978c (Anvil flagged 2026-04-25 04:03Z).
Root cause (two-stage)
Stage 1 — TS side, PersonaResponseGenerator.ts:380-391
let description: string | undefined;
if (m.type === 'image') {
  try {
    const visionSvc = VisionDescriptionService.getInstance();
    if (visionSvc.descriptionStatus(base64) === 'cached') {
      const desc = await visionSvc.describeBase64(base64, m.mimeType ?? 'image/png', { maxLength: 200 });
      description = desc?.description;
    }
  } catch {
    // Best-effort; drop to undefined on any cache error
  }
}
description is populated only when VisionDescriptionService (VDS) reports 'cached'. On the first message with a fresh image (cache cold, pre-warm still in flight), the status is 'inflight', not 'cached', so description stays undefined. The signal payload then carries { itemType, base64, mimeType, description: undefined } to Rust.
Stage 2 — Rust side, cognition::respond / signal → ContentPart conversion
When the resolved persona model is text-only AND signal.media[i].description is undefined, the image is silently dropped from the user-role message. Result: the model sees the message as if no image existed, and CONFIDENTLY narrates "I don't see any attachment" — fail-silent fallback, the exact pattern memory_two_ironclad_rules calls out as illegal.
Proposed fix
Stage 1 (TS)
Replace the cached-only check with a bounded await — VDS already deduplicates in-flight requests, so the wait is short for already-pre-warmed images:
if (m.type === 'image') {
  let timer: ReturnType<typeof setTimeout> | undefined;
  try {
    const visionSvc = VisionDescriptionService.getInstance();
    const status = visionSvc.descriptionStatus(base64);
    if (status === 'cached' || status === 'inflight') {
      // Bounded wait: pre-warm started at chat-send, so it is usually ready by now.
      // 8s caps the worst case for a fresh first-image scenario.
      const desc = await Promise.race([
        visionSvc.describeBase64(base64, m.mimeType ?? 'image/png', { maxLength: 200 }),
        new Promise<null>((resolve) => { timer = setTimeout(() => resolve(null), 8000); }),
      ]);
      description = desc?.description;
    }
  } catch {
    // Best-effort: leave description undefined on any VDS error
  } finally {
    // Clear the timeout so it does not stay pending after the race settles
    if (timer !== undefined) clearTimeout(timer);
  }
}
Stage 2 (Rust)
When converting signal.media[i] to ContentPart for a text-only model:
- If description is Some(d) → inject [Attached image: {d}] as a text part on the user-role message.
- If description is None → inject [Attached image: vision description unavailable — {mime}, {len} bytes] (FAIL LOUD per memory_two_ironclad_rules; never silently drop).
This makes the persona either see the image (vision-capable), see the description (text-only with VDS), or know an image was attached (text-only without VDS) — three deterministic outcomes, zero silent drops.
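A minimal sketch of the three outcomes, for discussion only. MediaItem, ContentPart, media_to_content_part and the model_has_vision flag are hypothetical stand-ins, not the real types in continuum-core. Two further assumptions: {len} is taken as the length of the base64 payload, and an empty or whitespace-only description is treated the same as None.

```rust
/// Hypothetical stand-in for one entry of signal.media.
#[derive(Debug, Clone)]
struct MediaItem {
    mime_type: String,
    base64: String,
    description: Option<String>,
}

/// Hypothetical stand-in for the message content part type.
#[derive(Debug, Clone, PartialEq)]
enum ContentPart {
    Text(String),
    Image { mime_type: String, base64: String },
}

/// Convert one media item; every branch yields a part, so nothing is dropped.
fn media_to_content_part(item: &MediaItem, model_has_vision: bool) -> ContentPart {
    if model_has_vision {
        // Vision-capable model: pass the image through unchanged.
        return ContentPart::Image {
            mime_type: item.mime_type.clone(),
            base64: item.base64.clone(),
        };
    }
    // Assumption: a blank description is as useless as a missing one.
    let description = item
        .description
        .as_deref()
        .filter(|d| !d.trim().is_empty());
    match description {
        // Text-only model with a VDS description.
        Some(d) => ContentPart::Text(format!("[Attached image: {}]", d)),
        // Text-only model without a description: fail loud with a marker.
        None => ContentPart::Text(format!(
            "[Attached image: vision description unavailable — {}, {} bytes]",
            item.mime_type,
            item.base64.len() // assumption: length of the base64 payload
        )),
    }
}
```

Because the function returns a ContentPart rather than an Option, the silent-drop path cannot be expressed at all; a caller that wants to skip an image would have to do so explicitly.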
Acceptance
- Empirical: send an image to a text-only persona on the first message after a fresh start. Persona MUST acknowledge the image (either via description or via "image attached but I cannot describe it"). Persona MUST NOT say "I don't see any attachment."
- Telemetry: a counter for vds_description_unavailable_marker_emitted so we can see how often the fallback marker fires vs the real description.
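One possible shape for the telemetry item, sketched with plain process-wide atomics. This assumes no existing metrics layer; if the worker already has one, the counter should go through that instead. VDS_DESCRIPTION_INJECTED, record_text_only_image and fallback_ratio are hypothetical names added only so the fallback rate can be compared against real descriptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Counts how often the "description unavailable" marker was emitted.
static VDS_DESCRIPTION_UNAVAILABLE_MARKER_EMITTED: AtomicU64 = AtomicU64::new(0);
/// Hypothetical companion counter: how often a real description was injected.
static VDS_DESCRIPTION_INJECTED: AtomicU64 = AtomicU64::new(0);

/// Record one image handled for a text-only model.
fn record_text_only_image(description_present: bool) {
    if description_present {
        VDS_DESCRIPTION_INJECTED.fetch_add(1, Ordering::Relaxed);
    } else {
        VDS_DESCRIPTION_UNAVAILABLE_MARKER_EMITTED.fetch_add(1, Ordering::Relaxed);
    }
}

/// Fraction of text-only images that fell back to the marker,
/// or None if no image has been recorded yet.
fn fallback_ratio() -> Option<f64> {
    let fallback = VDS_DESCRIPTION_UNAVAILABLE_MARKER_EMITTED.load(Ordering::Relaxed);
    let described = VDS_DESCRIPTION_INJECTED.load(Ordering::Relaxed);
    let total = fallback + described;
    if total == 0 {
        None
    } else {
        Some(fallback as f64 / total as f64)
    }
}
```

Relaxed ordering is enough here because the counters are independent tallies and nothing else synchronizes on them. A fallback ratio that stays high after the Stage 1 fix would suggest the 8s bound is too tight or that pre-warm is not starting at chat-send.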
Files
src/system/user/server/modules/PersonaResponseGenerator.ts:380-391 — stage 1 TS fix.
workers/continuum-core/src/persona/respond.rs (or wherever signal.media → ContentPart conversion lives) — stage 2 Rust fix.
Severity
Persona-correctness regression. Not a #950-introduced regression — pre-existing per Anvil's triage. Filing as follow-up so it lands cleanly post-merge.