diff --git a/experiments/rule-engine-poc/docs/README.md b/experiments/rule-engine-poc/docs/README.md index a7c1b6e24..168dfd941 100644 --- a/experiments/rule-engine-poc/docs/README.md +++ b/experiments/rule-engine-poc/docs/README.md @@ -13,6 +13,7 @@ Detailed documentation for the POC. Start with the project [README](../README.md |---|---| | [`architecture.md`](architecture.md) | You want the full architecture picture — system map, user flow, data flow, engine internals, module graph, with Mermaid diagrams. **Start here.** | | [`workflow.md`](workflow.md) | You want to run the `plan` → paste → `validate` → `report` loop end-to-end. | +| [`report-reference.md`](report-reference.md) | You want a single-page overview of the HTML report — section-by-section walkthrough, the five perspectives that shaped it, what's still open. | | [`dsl-reference.md`](dsl-reference.md) | You're writing or reading a rule file and need the full YAML grammar — every operator, every grouping construct. | | [`audit-trail.md`](audit-trail.md) | You need to replay a verdict, diff two verdicts, or map the audit trail to EU AI Act / ISO 42001 requirements. | | [`compliance.md`](compliance.md) | You're scoping the pattern against a regulation or standard — what the POC ticks natively, what the adopter must still provide, what's out of scope. | diff --git a/experiments/rule-engine-poc/docs/report-reference.md b/experiments/rule-engine-poc/docs/report-reference.md new file mode 100644 index 000000000..f3e2919cd --- /dev/null +++ b/experiments/rule-engine-poc/docs/report-reference.md @@ -0,0 +1,170 @@ +--- +title: Report reference +folder: experiments/rule-engine-poc/docs +description: A single-page overview of the HTML report — what each section is for, what the five product perspectives that shaped it found, what's implemented, and what's still open. +entry_point: false +--- + +# Report reference + +This is the meta-doc for the HTML report itself — the user-facing artifact of the POC. It consolidates what `architecture.md`, `workflow.md`, `audit-trail.md`, and the five wave-4 research artifacts (`research/17`–`21`) say about the report into one place. + +## Contents + +1. [What the report is](#1-what-the-report-is) +2. [Section-by-section walkthrough](#2-section-by-section-walkthrough) +3. [The five product perspectives that shaped it](#3-the-five-product-perspectives-that-shaped-it) +4. [What got implemented (wave-4 delta)](#4-what-got-implemented-wave-4-delta) +5. [What's still open](#5-whats-still-open) +6. [Generating one](#6-generating-one) +7. [Reading one](#7-reading-one) + +--- + +## 1. What the report is + +A self-contained HTML file rendered by `src/html-report.ts` from a `VerdictResult`. One report per target; one file per `npm run report` invocation. Inline CSS, no JavaScript, no external assets — it survives email forwarding, Slack attachment, S3 retention, and offline viewing. + +The report is the *only* document most readers will see. The terminal output and the JSON `--json` mode are for CI and operators; the HTML is for everyone else (PR reviewer, PM, EM, QA, compliance officer, auditor, the author a week later). + +Three committed samples under [`research/sample-reports/`](../research/sample-reports/) show the three primary verdict shapes — `blocked`, `needs-attention`, `ready-to-progress`. + +## 2. Section-by-section walkthrough + +The report renders top-to-bottom in this order. The order matters: it follows the reader's actual scan path established by `research/17` (UX audit) and `research/20` (auditor reading path). + +| Section | Job | Section header at render | Source | +|---|---|---|---| +| **System-identity header** | Tell a cold reader *what this is* — engine version + prominent timestamp | (no header — runs above the verdict card) | `research/20` Art. 13 "provider identity" gap | +| **Verdict tile** | Categorical tier in 2-second-scan colour: blocked / needs-attention / ready-to-progress / unknown | (the headline) | engine `verdict` | +| **Stats line** | "N rule(s) fired · M action(s) to take" — quantifies how contested the decision is | (under the tile) | `result.evaluations`, `result.actions` | +| **Blocker-by-absence banner** | "X rules could not be evaluated because the LLM did not supply Y, Z, W" — yellow, adjacent to verdict | (conditional banner) | `research/21` skim-trap finding | +| **Skip-validate banner** | "WARNING: validation gate was skipped" — when `--skip-validate` was set | (conditional banner) | `research/18` + `research/21` trust calibration | +| **Verdict-tier + glyph legend** | Collapsed `
` explaining blocked / needs-attention / ready-to-progress / unknown + `[+] / [-] / [?]` glyph meanings | "Glossary." | `research/17` + `research/20` | +| **Weighted tally** | Per-tier weight totals, side-by-side with actions | "Weighted tally." | engine `weightedTally` | +| **Suggested actions** | Imperative-voice action sentences in priority-of-cause order (not alphabetical) | "Take these actions." | engine `evaluations` walked in priority desc; `rules/action-glossary.yaml` if present | +| **Extraction flags** | The LLM's structured output as a table | "Extraction flags." | `ctx.flags` | +| **What fired** | Matched rules only, in priority order, each with rule id + description + flags it matched on + actions it contributed | "What fired." | `result.evaluations` filtered to `matched === true` | +| **Audit trail** | Every rule evaluation, matched + skipped. Skipped rules collapse to `
` summary by default | "Audit trail." | `result.evaluations` | +| **Reproduce block** | Shell-quoted command + the three replay anchors (engine version + ruleset hash + flags hash) | "How to reproduce." | `research/20` Art. 12 replay manifest | +| **Provenance** | Hash preamble + 12-char truncated hashes + file paths | "Provenance." | `result.rulesetHash`, `result.flagsHash`, `result.engineVersion` | + +The CSS is inline and uses a 3-tier severity palette (red / amber / green) with non-colour-only signals (glyphs, row washes, section headings). A `@media (max-width: 540px)` rule collapses the summary grid to a single column on phones. + +## 3. The five product perspectives that shaped it + +Wave 4 dispatched five subagents in parallel against three committed sample renders. Each reviewed the report through a different lens. The convergent findings drove the v3 rebuild in the wave-4 implementer pass. + +### UX (`research/17-report-ux-audit.md`) + +> "The audit trail buries what matters. With ~21 rules and 1–2 matches per sample, the page is ~95% 'did not match' content." + +Top recommendations: a **What fired** section above the full audit trail; collapse-by-default for skipped rules; replace "X of 21 rules matched" coverage text with per-tier-appropriate phrasing; sort actions by priority-of-cause not alphabetical; a `cond--miss` row-wash to match `cond--missing`'s amber; a `@media (max-width: 540px)` single-column fallback. + +### Stakeholder strategy (`research/18-report-stakeholders.md`) + +> "The report is one artifact serving six different first-fields — PR-reviewers want the verdict pill, compliance wants the provenance block, PMs want suggested actions, auditors want the hashes, authors want the flags." + +Top recommendations: expand action slugs (`kick-ci`) to human sentences (`"Re-run the failing CI job."`) via an `actions[].human` field — implemented as a sidecar glossary in `rules/action-glossary.yaml`; introduce a `label_set` config (default `dev`; `pm`, `qa`, `compliance` as presentational overrides) so headline labels match the reader; flag a portfolio dashboard as the first hosted-SaaS gravity seam — *do not build it yet*. + +### Brand (`research/19-report-brand-review.md`) + +Verdict: **pass-with-findings** (not S1-blocking under the sandbox scope). On-temperament (no emoji, no gradients, no icons, restrained density, ASCII `[+]/[-]/[?]` glyphs correctly used as monospace-iconography). Off-token: 18 distinct literal hex values, hand-picked font stacks, near-white page background where Specorator calls for cream. Section headers should be sentence-case with periods — implemented. Open ADR-shaped decision: Specorator has no red token; the `blocked` tier currently uses literal `#fdecea / #d8281b / #7a160d` and stays that way until graduation. + +### Auditor readability (`research/20-report-auditor-readability.md`) + +> "30-second test passes... what the 30-second test fails on: the report does not name what kind of system this is, who built it, what version of the workflow it governs, or what the verdict is binding against." + +Implemented: system-identity header (engine version + prominent timestamp), reproduce-block with the three replay anchors, verdict-tier glossary, glyph legend. Still open: provider identity / contact, model-card link, capabilities-and-limitations sentence, expected-lifetime statement, reviewer-of-record field. Closes `research/02`'s "human-readable rationale presentation" open item; the remainder is governance, not engineering. + +### Misread risks (`research/21-report-misread-risks.md`) + +Three flagged misread paths: + +1. **The skim trap** — a busy reader looks only at the verdict tile and action list, missing context. Implemented mitigation: blocker-by-absence banner is rendered at the same visual weight as the verdict card. +2. **`verified` badge as trust trap** — a green pill saying `verified` will read as "extraction verified" when it only means "extraction is bound to current inputs". Implemented mitigation: tooltip on the badge explicitly says *bound to current inputs, not flag-correctness*. Plus the `--skip-validate` banner is now visually equal-weight to the verdict so a skipped-validation run cannot pass undetected. +3. **Blocker-by-absence is the most dangerous skim path** — a high-priority blocker whose input flag was never extracted simply doesn't fire. Implemented mitigation: dedicated banner naming the missing flags AND the count of un-evaluable rules (counted using `matched === false`, fixed in Codex round 12 to exclude rules that fired via `when.any` despite one missing branch). + +## 4. What got implemented (wave-4 delta) + +The 12 changes that landed across the wave-4 implementer pass (Agent B's RALPH loop) and the subsequent Codex round 11–14 hardening: + +| # | Change | Research source | +|---|---|---| +| 1 | "What fired" section above audit trail | `17`, `20`, `21` | +| 2 | Non-matched rules collapsed via `
` | `17` | +| 3 | Blocker-by-absence banner naming missing flags | `21`, `17` | +| 4 | Suggested actions in priority-of-cause order | `17` | +| 5 | Action human-sentences from glossary | `18` | +| 6 | Provenance preamble + reproduce block + 12-char hash truncation | `17`, `20`, `18` | +| 7 | System-identity header + prominent timestamp | `20` | +| 8 | Verdict-tier + glyph legend in collapsed `
` | `20`, `17` | +| 9 | `cond--miss` row wash matching `cond--missing` amber | `17` | +| 10 | `@media (max-width: 540px)` single-column fallback | `17` | +| 11 | `--skip-validate` banner + `verified` badge tooltip | `18`, `21` | +| 12 | Sentence-case headers + imperative voice ("Take these actions.") | `19` | + +Plus three follow-up Codex hardenings that were caught after the wave-4 push: + +- **Round 11** (`90f3fe1`) — `openInBrowser` waits for the spawned process to exit cleanly (not just `spawn`); `takeOpt` rejects missing values for `--config` / `--target`. +- **Round 12** (`eb01077`) — `missingFlagNames` only counts rules whose final outcome was determined by absence (excludes `when.any` rules that matched another branch); reproduce-command paths are shell-quoted. +- **Round 13** (`003a05e`) — single-shot `cli.ts::takeOption` rejects missing `--html` values. + +Test surface for the report: 28 tests in `test/html-report.test.ts` plus the report-flow integration tests in `test/report-flow.test.ts`. + +## 5. What's still open + +Deferred, with the bucket each lives in: + +| Item | Source | Bucket | +|---|---|---| +| `label_set` config (dev / pm / qa / compliance presets) | `research/18` | Strategy slice 14–18 | +| Reader-specific export modes (PDF, markdown, Slack-friendly text) | `research/18` | Strategy slice | +| Portfolio dashboard (one HTML across N targets) | `research/18` | Hosted-SaaS gravity seam — explicitly *do not build* per strategist | +| Provider identity / model card / capabilities-and-limitations sentence | `research/20` | Governance (compliance.md "what's not in this POC") | +| Reviewer-of-record field, override workflow | `research/20` | Governance | +| Brand-token migration (replace 18 literal hex values with vars) | `research/19` | ADR at graduation | +| Diff-against-previous-run | `research/21` | Production prep | +| Confidence / uncertainty surface | `research/20` + `research/21` | Calibration study first | +| RAT-A / RAT-B / RAT-C / RAT-D / RAT-E / RAT-F | `research/07`, `research/14` | Discovery activity — needs users, not more engineering | + +## 6. Generating one + +```bash +# Plan (writes the prompt + sidecar) +npm run plan -- --target + +# User pastes the prompt into Claude / ChatGPT / Gemini, saves JSON to +# extractions/.json. + +# Validate (optional sanity check) +npm run validate -- --target + +# Report (renders HTML, opens browser best-effort) +npm run report -- --target +# Or without opening a browser: +npm run report -- --target --no-open +``` + +Exit codes: `0` no blockers, `1` at least one `blocked` verdict, `2` missing / malformed extraction. + +For testing without the AI loop (single-shot fixture flow): + +```bash +npx tsx src/cli.ts rules/quality-gates.yaml fixtures/blocked-missing-ears.json --html /tmp/preview.html +``` + +## 7. Reading one + +For the **report consumer** (not the POC operator): + +1. **Verdict tile** — colour and label. That's the answer. +2. **Stats line** — how many rules fired? How many actions to take? +3. **Banners** (if present) — blocker-by-absence flags missing inputs; skip-validate flag means the validation gate was bypassed (treat the verdict as advisory). +4. **Take these actions** — the imperative-voice human sentences; if `[code](#)` slug is shown, hover for the technical name. +5. **What fired** — the rules that drove the verdict, in priority-of-cause order. Each is a 1-paragraph card; expand the audit trail for everything that *didn't* fire. +6. **Provenance** — the three hashes (engine version, ruleset, flags) let you verify the report came from a specific tuple. The reproduce-command block lets you re-run it locally. + +A reader who reads only steps 1–4 should still get the answer correct. The rest is depth on demand. + +See [`docs/audit-trail.md`](audit-trail.md) for replay mechanics and [`docs/compliance.md`](compliance.md) for which sections speak to which regulation. diff --git a/experiments/rule-engine-poc/scripts/run-all-fixtures.mjs b/experiments/rule-engine-poc/scripts/run-all-fixtures.mjs index 50412f68d..11e09fe0c 100644 --- a/experiments/rule-engine-poc/scripts/run-all-fixtures.mjs +++ b/experiments/rule-engine-poc/scripts/run-all-fixtures.mjs @@ -1,8 +1,10 @@ import { readdirSync } from "node:fs"; import { join } from "node:path"; +import { fileURLToPath } from "node:url"; import { spawnSync } from "node:child_process"; -const fixturesDir = new URL("../fixtures/", import.meta.url).pathname; +// fileURLToPath handles percent-decoding and Windows drive paths. +const fixturesDir = fileURLToPath(new URL("../fixtures/", import.meta.url)); const rules = "rules/quality-gates.yaml"; const files = readdirSync(fixturesDir) diff --git a/experiments/rule-engine-poc/scripts/run-all-html.mjs b/experiments/rule-engine-poc/scripts/run-all-html.mjs index 551bd3918..be2dd412f 100644 --- a/experiments/rule-engine-poc/scripts/run-all-html.mjs +++ b/experiments/rule-engine-poc/scripts/run-all-html.mjs @@ -1,8 +1,11 @@ import { readdirSync, mkdirSync } from "node:fs"; import { join, basename } from "node:path"; +import { fileURLToPath } from "node:url"; import { spawnSync } from "node:child_process"; -const fixturesDir = new URL("../fixtures/", import.meta.url).pathname; +// fileURLToPath handles percent-decoding and Windows drive paths; +// `.pathname` alone breaks for both (Codex round 15 P2). +const fixturesDir = fileURLToPath(new URL("../fixtures/", import.meta.url)); const rules = "rules/quality-gates.yaml"; const reportsDir = "reports"; diff --git a/experiments/rule-engine-poc/src/html-report.ts b/experiments/rule-engine-poc/src/html-report.ts index 73346291b..8a7ad95d1 100644 --- a/experiments/rule-engine-poc/src/html-report.ts +++ b/experiments/rule-engine-poc/src/html-report.ts @@ -268,13 +268,24 @@ export function renderHtmlReport( ? `` : ""; - // Reproduce command: assembled from the same fields plan/report use. - // Codex round 12 P2: quote paths so paths with spaces or shell - // metacharacters (e.g., "My Projects/rules.yaml") don't break the - // command. Single-quote shell-escape: replace any ' inside the - // path with the four-char sequence '\'' . - const shellQuote = (s: string): string => `'${s.replace(/'/g, "'\\''")}'`; - const reproCmd = `npx tsx src/cli.ts ${shellQuote(ctx.rulesPath)} ${shellQuote(ctx.flagsPath)} --html --quiet`; + // Reproduce command: render three flavours because the supported + // shells disagree on quoting AND on variable expansion: + // - POSIX (bash / zsh / sh): single-quote escape (' becomes '\''). + // Single quotes suppress $var expansion. + // - cmd.exe: double-quote escape (" becomes ""). cmd doesn't + // expand $var; it expands %VAR%, but our paths don't carry %. + // - PowerShell: single-quote escape (' becomes ''). Double quotes + // in PowerShell EXPAND $var and $(), which can mutate the path + // (Codex round 19 P2). + const posixQuote = (s: string): string => `'${s.replace(/'/g, "'\\''")}'`; + const cmdQuote = (s: string): string => `"${s.replace(/"/g, '""')}"`; + const psQuote = (s: string): string => `'${s.replace(/'/g, "''")}'`; + // Use a literal filename, NOT ``: angle brackets are shell + // I/O redirection on all three, so a copy-paste would silently send + // --html no value (Codex round 18 P2). + const reproCmdPosix = `npx tsx src/cli.ts ${posixQuote(ctx.rulesPath)} ${posixQuote(ctx.flagsPath)} --html out.html --quiet`; + const reproCmdCmd = `npx tsx src/cli.ts ${cmdQuote(ctx.rulesPath)} ${cmdQuote(ctx.flagsPath)} --html out.html --quiet`; + const reproCmdPwsh = `npx tsx src/cli.ts ${psQuote(ctx.rulesPath)} ${psQuote(ctx.flagsPath)} --html out.html --quiet`; return ` @@ -491,7 +502,12 @@ export function renderHtmlReport(

How to reproduce — run from experiments/rule-engine-poc/:

-
${esc(reproCmd)}
+

POSIX (macOS, Linux, WSL, Git Bash):

+
${esc(reproCmdPosix)}
+

Windows cmd.exe:

+
${esc(reproCmdCmd)}
+

PowerShell (Windows / cross-platform):

+
${esc(reproCmdPwsh)}

Then verify the three hashes above match the values in the regenerated report.

diff --git a/experiments/rule-engine-poc/src/loader.ts b/experiments/rule-engine-poc/src/loader.ts index 7de40ced2..a2454469f 100644 --- a/experiments/rule-engine-poc/src/loader.ts +++ b/experiments/rule-engine-poc/src/loader.ts @@ -137,6 +137,15 @@ function validate( if (typeof rule.priority !== "number") { throw new Error(`Rule '${rule.id}' missing numeric 'priority'`); } + // Codex round 15 P2: NaN/Infinity priorities silently break the + // documented sort order (b.priority - a.priority returns NaN, treated + // as 0), reordering the audit trail unpredictably. Same fail-fast + // discipline as weight + gt + lt. + if (!Number.isFinite(rule.priority)) { + throw new Error( + `Rule '${rule.id}' has non-finite 'priority' (got ${String(rule.priority)})`, + ); + } } const CONDITION_OPS = [ diff --git a/experiments/rule-engine-poc/src/validate.ts b/experiments/rule-engine-poc/src/validate.ts index 291fee8fb..85aee8fbd 100644 --- a/experiments/rule-engine-poc/src/validate.ts +++ b/experiments/rule-engine-poc/src/validate.ts @@ -104,13 +104,20 @@ export function validateExtraction( continue; } if (value === null) { - warnings.push({ - severity: "warning", - code: "null-value-omit-instead", + // Codex round 16 P1: previously a warning, but the engine's + // `hasOwnProperty` presence check treats null as PRESENT — so + // `exists`/`ne` rules behave differently against {flag: null} + // than against {} despite the validator's old "null ≈ missing" + // claim. The right fix is to refuse null at the gate so the + // engine never sees it; LLMs are instructed to omit unknowns. + errors.push({ + severity: "error", + code: "null-value-not-allowed", path: key, message: - `Flag '${key}' is null; prefer omitting unknowns over emitting null. ` + - `The engine will treat null and missing identically.`, + `Flag '${key}' is null; omit the field instead. ` + + `The engine's presence check treats null as PRESENT, which can ` + + `silently change verdicts for rules using 'exists' or 'ne'.`, }); continue; } diff --git a/experiments/rule-engine-poc/test/html-report.test.ts b/experiments/rule-engine-poc/test/html-report.test.ts index 66cad50ac..22c33b2c3 100644 --- a/experiments/rule-engine-poc/test/html-report.test.ts +++ b/experiments/rule-engine-poc/test/html-report.test.ts @@ -287,8 +287,8 @@ describe("renderHtmlReport: provenance reframing", () => { }); it("shell-quotes paths in the reproduce command so spaces don't break it", () => { - // Codex round 12 P2: unquoted paths break copy-pasted reproduce - // commands on user machines (e.g., "My Projects/..."). + // Codex rounds 12/17/19: render three forms — POSIX, cmd.exe, + // PowerShell — because each uses different quoting + expansion. const flags: ExtractionFlags = { ci_failing: true }; const result = evaluate(rules, flags); const html = renderHtmlReport( @@ -299,10 +299,38 @@ describe("renderHtmlReport: provenance reframing", () => { flagsPath: "extractions with spaces/x.json", }), ); - // Single quotes are HTML-escaped (') in the rendered output, - // but they decode back to ' when the user pastes the command. + // POSIX + PowerShell: single quotes (HTML-escaped to '). expect(html).toContain("'My Projects/rules.yaml'"); expect(html).toContain("'extractions with spaces/x.json'"); + // cmd.exe: double quotes (HTML-escaped to "). + expect(html).toContain(""My Projects/rules.yaml""); + expect(html).toContain(""extractions with spaces/x.json""); + // All three labels appear. + expect(html).toContain("POSIX"); + expect(html).toContain("cmd.exe"); + expect(html).toContain("PowerShell"); + }); + + it("PowerShell repro form uses single quotes so $var stays literal", () => { + // Codex round 19 P2: PowerShell double-quoted strings expand + // $var / $(...) — single quotes suppress that. The PowerShell + // labelled block must use single quotes so a path like + // 'src/$something/x.json' is reproduced literally. + const flags: ExtractionFlags = { ci_failing: true }; + const result = evaluate(rules, flags); + const html = renderHtmlReport( + result, + baseCtx({ + flags, + rulesPath: "My$Projects/rules.yaml", + flagsPath: "src/$something/x.json", + }), + ); + const psStart = html.indexOf("PowerShell"); + expect(psStart).toBeGreaterThan(-1); + const psBlock = html.slice(psStart, psStart + 800); + expect(psBlock).toContain("'My$Projects/rules.yaml'"); + expect(psBlock).toContain("'src/$something/x.json'"); }); it("truncates ruleset and flags hashes to 12-char prefixes", () => { diff --git a/experiments/rule-engine-poc/test/loader.test.ts b/experiments/rule-engine-poc/test/loader.test.ts index 54bcc78c2..445392e94 100644 --- a/experiments/rule-engine-poc/test/loader.test.ts +++ b/experiments/rule-engine-poc/test/loader.test.ts @@ -463,6 +463,48 @@ describe("loader", () => { ).toThrow(/non-finite 'lt'/); }); + it("rejects rules with non-finite priority", () => { + expect(() => + loadRulesFromString( + ` +- id: r1 + priority: .nan + description: x + when: + all: + - flag: a + eq: true + then: + verdict: blocked + weight: 1 + actions: [a] +`, + "priority-nan", + ), + ).toThrow(/non-finite 'priority'/); + }); + + it("rejects rules with infinite priority", () => { + expect(() => + loadRulesFromString( + ` +- id: r1 + priority: .inf + description: x + when: + all: + - flag: a + eq: true + then: + verdict: blocked + weight: 1 + actions: [a] +`, + "priority-inf", + ), + ).toThrow(/non-finite 'priority'/); + }); + it("assigns a stable content hash to each rule", () => { const ruleA = loadRulesFromString( ` diff --git a/experiments/rule-engine-poc/test/validate.test.ts b/experiments/rule-engine-poc/test/validate.test.ts index 824f3593a..f4a7445fa 100644 --- a/experiments/rule-engine-poc/test/validate.test.ts +++ b/experiments/rule-engine-poc/test/validate.test.ts @@ -105,12 +105,11 @@ describe("validateExtraction", () => { expect(r.errors[0]!.code).toBe("disallowed-value"); }); - it("warns (but does not error) when a flag value is null", () => { + it("errors when a flag value is null (engine treats null as present)", () => { const r = validateExtraction({ ci_passing: null }, schema); - expect(r.ok).toBe(true); - expect(r.warnings).toHaveLength(1); - expect(r.warnings[0]!.code).toBe("null-value-omit-instead"); - expect(r.warnings[0]!.path).toBe("ci_passing"); + expect(r.ok).toBe(false); + expect(r.errors[0]!.code).toBe("null-value-not-allowed"); + expect(r.errors[0]!.path).toBe("ci_passing"); }); describe("with expectedPromptHash", () => {