diff --git a/spec/conformance/scenarios/README.md b/spec/conformance/scenarios/README.md
index 47aefa3..7a7c886 100644
--- a/spec/conformance/scenarios/README.md
+++ b/spec/conformance/scenarios/README.md
@@ -36,6 +36,7 @@ Every server consumes the same engine (`smooth-operator-core`), which ships a de
 - **`server.tools`** — deterministic tools to register on the agent. Each is `{ name, description, parameters, result }`; the tool ignores its arguments and returns the fixed `result` string, so a tool-calling turn is fully reproducible. A `mockLlmScript` `toolCall` entry names one of these; the server dispatches it and streams a `stream_chunk` with `data.state.rawResponse.toolCall` then one with `data.state.rawResponse.toolResult` before the final text. Each server maps this onto its own tool-injection mechanism (a tools list for Python/TS/Go/C#; the `ToolProvider` seam for Rust) — the corpus is identical.
 - **`server.confirmTools`** — tool-name patterns gated by **write-confirmation HITL**. When the engine calls a matching tool, the server **parks** the turn and emits `write_confirmation_required` (with `data.data.{ toolId, actionDescription }`) instead of running it; the scenario then sends a `confirm_tool_action` frame (`sessionId` + `approved`), the server acks with `immediate_response`(200, `data.approved`), and the parked turn resumes (runs the tool on approve, rejects on deny). The gated tool's `toolCall` chunk is deferred until *after* the confirmation prompt. Canonical order verified against the Rust reference.
+- **`server.knowledge`** — docs `{ source, content }` seeded into the server's knowledge base before the turn, so a grounded answer surfaces **citations**. The server mirrors the engine's auto-retrieval (`query(message, 3)`) into `eventual_response`'s `data.data.citations[]` — each `{ id, title, url?, snippet, score }`, present only when non-empty. Assert the deterministic fields (`citations.N.id`/`title`/`snippet`) via array-index paths; **not** `score` (a computed float). Each server seeds its own KB the same way (the runner sets the doc id to its source so `id == title == source` is deterministic). Canonical fields verified against the Rust reference.
 **`steps[].send`** — one inbound protocol frame. `{{name}}` placeholders are substituted from values `capture`d earlier (e.g. `"sessionId": "{{sessionId}}"`).
diff --git a/spec/conformance/scenarios/citations-grounded-turn.json b/spec/conformance/scenarios/citations-grounded-turn.json
new file mode 100644
index 0000000..788edad
--- /dev/null
+++ b/spec/conformance/scenarios/citations-grounded-turn.json
@@ -0,0 +1,25 @@
+{
+  "name": "citations-grounded-turn",
+  "description": "When the server's knowledge base is seeded and a turn is grounded on it, the eventual_response must carry data.data.citations with the source(s) that grounded the answer (the engine's retrieval mirror). Rust/TS/C# do this; Go + Python must reach parity. Asserts the deterministic citation fields (id/title/snippet), not the computed score. Canonical order verified against the Rust reference.",
+  "server": {
+    "knowledge": [
+      { "source": "returns.md", "content": "SmooAI returns are accepted within 30 days of delivery for a full refund." }
+    ]
+  },
+  "mockLlmScript": [
+    { "kind": "text", "text": "Our return window is 30 days." }
+  ],
+  "steps": [
+    { "send": { "action": "create_conversation_session", "requestId": "r-create", "agentId": "11111111-1111-1111-1111-111111111111", "userName": "Alice", "userEmail": "alice@example.com" }, "expect": [ { "type": "immediate_response", "status": 200, "capture": { "sessionId": "data.sessionId" } } ] },
+    { "send": { "action": "send_message", "requestId": "r-msg", "sessionId": "{{sessionId}}", "message": "what is the return policy?", "stream": true }, "expect": [
+      { "type": "immediate_response", "status": 202 },
+      { "type": "stream_token", "repeat": true, "accumulate": "token", "assertAccumulated": "Our return window is 30 days." },
+      { "type": "eventual_response", "status": 200, "assert": {
+        "data.data.citations.0.id": "returns.md",
+        "data.data.citations.0.title": "returns.md",
+        "data.data.citations.0.snippet": "SmooAI returns are accepted within 30 days of delivery for a full refund.",
+        "data.data.response.responseParts": ["Our return window is 30 days."]
+      } }
+    ] }
+  ]
+}
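
Reviewer note: the `assert` block in the new scenario relies on array-index paths such as `data.data.citations.0.id`. The runner's actual resolver is not part of this diff, so the following is only a minimal sketch of how such paths could be evaluated. The names `resolve_path` and `failed_assertions`, the shape of the sample frame beyond the asserted fields, and the `score` value are all hypothetical and chosen for illustration.

```python
from typing import Any


def resolve_path(payload: Any, path: str) -> Any:
    """Walk a dotted path; a segment applied to a list is read as an index."""
    current = payload
    for segment in path.split("."):
        if isinstance(current, list):
            current = current[int(segment)]  # e.g. "0" in "citations.0.id"
        else:
            current = current[segment]
    return current


def failed_assertions(frame: dict, expected: dict) -> list:
    """Return one message per path that is missing or has the wrong value."""
    failures = []
    for path, want in expected.items():
        try:
            got = resolve_path(frame, path)
        except (KeyError, IndexError, ValueError, TypeError) as exc:
            failures.append(f"{path}: unresolved ({exc!r})")
            continue
        if got != want:
            failures.append(f"{path}: expected {want!r}, got {got!r}")
    return failures


# Hypothetical eventual_response frame; the score value is arbitrary.
frame = {
    "type": "eventual_response",
    "status": 200,
    "data": {"data": {
        "citations": [{
            "id": "returns.md",
            "title": "returns.md",
            "snippet": "SmooAI returns are accepted within 30 days of delivery for a full refund.",
            "score": 0.5,
        }],
        "response": {"responseParts": ["Our return window is 30 days."]},
    }},
}

# Same paths as the scenario's assert block; score is deliberately absent.
expected = {
    "data.data.citations.0.id": "returns.md",
    "data.data.citations.0.title": "returns.md",
    "data.data.citations.0.snippet": "SmooAI returns are accepted within 30 days of delivery for a full refund.",
    "data.data.response.responseParts": ["Our return window is 30 days."],
}

print(failed_assertions(frame, expected))
```

Because `score` never appears in `expected`, the check stays stable across servers whose retrieval produces different float values, which is the reason the README text gives for asserting only `id`, `title`, and `snippet`.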