test: add comprehensive test suite with 1000 test cases #3
Open
seedquan wants to merge 2 commits into iamtouchskyer:main from
Add 8 test files covering all modules with node:test + node:assert:

- eval-parser (300 tests): severity detection, file refs, verdicts, fix lines, reasoning, hedging, findings count, edge cases
- flow-commands (300 tests): cmdRoute, cmdInit, cmdValidate, cmdTransition, cmdValidateChain with full state machine coverage
- eval-commands (150 tests): cmdVerify, cmdSynthesize, cmdReport, cmdDiff with oscillation detection
- viz-commands (50 tests): getMarker, cmdViz, cmdReplayData
- flow-templates (50 tests): structure validation, edge completeness
- opc-cli (50 tests): version, help, install, uninstall via child process
- verify-devil-advocate (50 tests): challenge/verdict parsing, quality checks
- integration (50 tests): end-to-end flows, error recovery

All 1000 tests pass in ~4 seconds. Zero external dependencies.

Run with: node --test tests/*.test.mjs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 7 verification test files with deep boundary, property, and integration testing:

- verify-parser-boundaries (200): regex edge cases, encoding stress, fuzzy inputs, large-scale, regression patterns
- verify-parser-properties (150): idempotency, count consistency, ordering, verdict/file-ref/severity/hedging invariants
- verify-flow-state-machine (200): exhaustive route table, state invariants, limit exhaustion, concurrent state, full traversals
- verify-handshake-schema (150): field types, enum boundaries, artifact paths, evidence rules, cross-field, malformed JSON
- verify-synthesis-diff (150): verdict logic, role extraction, diff normalization, oscillation thresholds, report generation
- verify-viz-replay (50): marker transitions, viz consistency, replay data completeness
- verify-e2e-scenarios (100): happy paths, fail loops, oscillation, max limits, devil's advocate, report-replay round-trips

Bug found: cmdValidate crashes on null JSON input (V685)

All 2000 tests pass in ~4 seconds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
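The null-input crash reported above is a common pattern: `JSON.parse("null")` succeeds and returns `null`, so any later property access throws. The sketch below shows one way such a guard could look. The function name and return shape are hypothetical, and the actual cause inside cmdValidate is not shown in this PR.

```javascript
// Sketch of a null-safe JSON validation step.
// ASSUMPTION: validateHandshakeText and its { ok, error, data } result shape
// are hypothetical and are not the project's cmdValidate.
function validateHandshakeText(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${err.message}` };
  }
  // JSON.parse("null") is valid JSON and yields null, so reading a field
  // such as data.verdict would throw a TypeError without this check.
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return { ok: false, error: "expected a JSON object" };
  }
  return { ok: true, data };
}
```

With this guard, the inputs `"null"`, `"[]"`, and malformed text all produce an error result rather than an exception.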
iamtouchskyer added a commit that referenced this pull request Apr 19, 2026
Addresses 3 ITERATE findings from U1.6r contract + semantics reviewers:

1. fireArtifactEmit: recordSuccess was unconditionally resetting _failStreak after per-item write failures, so the circuit breaker would never trip on persistent write failures. Track anyItemFailed and only call recordSuccess when every item in the call succeeded. (semantics F1, contract #2)
2. fireArtifactEmit: accept ArrayBufferView (Uint8Array, DataView) in addition to string / Buffer. Modern APIs (crypto.subtle, TextEncoder, Playwright) commonly return Uint8Array; the tight Buffer.isBuffer check was silently dropping them with a misleading WARN. (semantics F2)
3. cmdExtensionArtifact: add nodeCapabilities to stdout JSON for consistency with cmdExtensionVerdict. (contract #1)
4. CONTRIBUTING.md: document executeRun + artifactEmit hooks with sample skeleton + hook surface summary table. (contract #3)

Regression tests: 4 new tests (Uint8Array accepted, _failStreak persists across calls, success reset is all-or-nothing, CLI JSON includes nodeCapabilities). Total 118/118 extension tests, 22/22 suite files green.
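The first two fixes can be illustrated with a small sketch. Only `recordSuccess`, `_failStreak`, and `anyItemFailed` are names from the commit message; the `Breaker` class, `toBytes`, `emitItems`, the per-item failure counting, and the decision to treat an unsupported content type as a failure are all assumptions made for illustration.

```javascript
// Sketch of an all-or-nothing success reset plus ArrayBufferView acceptance.
// ASSUMPTION: Breaker, toBytes, emitItems, and writeItem are hypothetical names.
class Breaker {
  constructor(threshold) {
    this.threshold = threshold;
    this._failStreak = 0;
  }
  recordFailure() { this._failStreak += 1; }
  recordSuccess() { this._failStreak = 0; }
  get tripped() { return this._failStreak >= this.threshold; }
}

// Accepts strings, Buffers, and any ArrayBufferView (Uint8Array, DataView, ...).
// A Buffer is itself a Uint8Array, so ArrayBuffer.isView covers it as well.
function toBytes(content) {
  if (typeof content === "string") return Buffer.from(content, "utf8");
  if (ArrayBuffer.isView(content)) {
    return Buffer.from(content.buffer, content.byteOffset, content.byteLength);
  }
  return null; // unsupported type
}

function emitItems(breaker, items, writeItem) {
  let anyItemFailed = false;
  for (const item of items) {
    try {
      const bytes = toBytes(item.content);
      // Assumption: an unsupported type counts as a failure in this sketch.
      if (bytes === null) throw new TypeError("unsupported content type");
      writeItem(item.name, bytes);
    } catch {
      anyItemFailed = true;
      breaker.recordFailure();
    }
  }
  // Reset the streak only when every item in this call succeeded.
  if (!anyItemFailed) breaker.recordSuccess();
  return !anyItemFailed;
}
```

Because the reset is conditional, a call in which one item succeeds and another fails leaves `_failStreak` incremented, so repeated partial failures can still trip the breaker.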
iamtouchskyer added a commit that referenced this pull request Apr 19, 2026
…llow-up)

Reviewer B (U2.8d) found two real bugs in the U2.8c JSON sidecar fix:

🔴 #2 dedup key collision: `${ext}|${hook}|${kind}|${message}` doesn't escape `|`. Two genuinely different failures collide silently:
  A: ext="a|b", hook="c" → "a|b|c|error|msg"
  B: ext="a", hook="b|c" → "a|b|c|error|msg"
Fix: use JSON.stringify on a tuple `[ext,hook,kind,message]`; keys are unambiguous regardless of field contents.

🔴 #5 droppedTotal overwrite: the field name promises accumulation ("droppedTotal") but the code wrote `dropped` from the current call, silently resetting the prior cap-overflow signal across CLI invocations.
Fix: read priorDropped from sidecar, write `priorDropped + dropped`. The Markdown view's "N earlier failure record(s) dropped" message now reflects the lifetime total, not just the last call.

Verification:
- New unit tests:
  - 6.1 pipe-collision: A and B above both preserved (length=2)
  - 7.1 droppedTotal accumulates 5+3+0 = 8 across three CLI invocations
- test-run2-failure-merge.sh: 11/11 pass (was 9/9)
- Full suite: 27/27 still pass, no regression

Out-of-scope (acknowledged, deferred):
- #3 R-M-W race under concurrent CLI lanes sharing runDir: documented single-writer invariant assumption; future work if multi-lane CI lands.
- #4 schema drift on top-level unknown fields: per-entry fields already preserved (we spread the whole entry); top-level only carries failures + droppedTotal, so the drift surface is bounded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
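Both fixes are easy to reproduce in isolation. The sketch below contrasts the pipe-joined key with the `JSON.stringify` tuple key, and accumulates `droppedTotal` across merges. The function names, the sidecar shape beyond `failures` and `droppedTotal`, and the choice to keep the newest entries when the cap overflows are assumptions, not the project's code.

```javascript
// Sketch of the dedup-key and droppedTotal fixes.
// ASSUMPTION: pipeKey, tupleKey, mergeFailures, and the `cap` parameter are
// hypothetical; only the key formats, priorDropped, and droppedTotal come
// from the commit message.
function pipeKey({ ext, hook, kind, message }) {
  // Ambiguous: a "|" inside any field shifts the field boundaries.
  return `${ext}|${hook}|${kind}|${message}`;
}

function tupleKey({ ext, hook, kind, message }) {
  // JSON string quoting and escaping keep field boundaries unambiguous.
  return JSON.stringify([ext, hook, kind, message]);
}

function mergeFailures(sidecar, incoming, cap) {
  const seen = new Set();
  const merged = [];
  for (const entry of [...(sidecar.failures ?? []), ...incoming]) {
    const key = tupleKey(entry);
    if (seen.has(key)) continue;
    seen.add(key);
    merged.push(entry);
  }
  // Assumption: when over the cap, the oldest entries are the ones dropped.
  const dropped = Math.max(0, merged.length - cap);
  const priorDropped = sidecar.droppedTotal ?? 0;
  return {
    failures: merged.slice(dropped),
    droppedTotal: priorDropped + dropped, // accumulate, never overwrite
  };
}
```

Feeding the result of one `mergeFailures` call in as the `sidecar` argument of the next mimics successive CLI invocations that read and rewrite the same sidecar file.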
Summary
node:test + node:assert, zero external dependencies
Test Coverage
Run
node --test tests/*.test.mjs
Test plan
🤖 Generated with Claude Code