Add Holmes law evidence validation gate #536

Merged
flyingrobots merged 2 commits into main from holmes-validation-gate-slices-11-15
May 27, 2026

@flyingrobots
Owner

Summary

  • Adds the first Rust Holmes law evidence validation gate for HIMP-011 through HIMP-015.
  • Validates bundle structure, required versus optional artifact refs, canonical provenance hashes, artifact-local schema versions, artifact sha256 anchors, and normalized duplicate artifact paths.
  • Adds deterministic artifact availability, unreadable, and oversized diagnostics through application-layer ports while keeping domain validation pure.
  • Updates BEARING, README, and CHANGELOG for the 15/90 implementation checkpoint.
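
As an illustration of two of the checks listed above, here is a minimal sketch of normalized duplicate-path detection as a pure domain function, next to an application-layer port that reports unavailable, unreadable, and oversized artifacts. Every name in it (`Diagnostic`, `normalize`, `duplicate_paths`, `Probe`, `ArtifactStore`, `check_availability`, `max_bytes`) is hypothetical and chosen for the example; the actual types in `wesley-holmes` may look quite different.

```rust
use std::collections::BTreeSet;

/// Hypothetical diagnostics; the crate's real diagnostic names may differ.
#[derive(Debug, PartialEq, Eq)]
enum Diagnostic {
    DuplicateArtifactPath(String),
    ArtifactUnavailable(String),
    ArtifactUnreadable(String),
    ArtifactOversized { path: String, bytes: u64 },
}

/// Lexically normalize a relative artifact path: unify separators,
/// drop empty and "." segments, and resolve ".." against earlier segments.
fn normalize(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for seg in path.split(|c: char| c == '/' || c == '\\') {
        match seg {
            "" | "." => {}
            ".." => {
                parts.pop();
            }
            other => parts.push(other),
        }
    }
    parts.join("/")
}

/// Pure domain check: no I/O, and diagnostics follow input order,
/// so the same bundle always yields the same output.
fn duplicate_paths(paths: &[&str]) -> Vec<Diagnostic> {
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for p in paths {
        let n = normalize(p);
        if !seen.insert(n.clone()) {
            out.push(Diagnostic::DuplicateArtifactPath(n));
        }
    }
    out
}

/// What a storage adapter reports about one artifact (assumed shape).
enum Probe {
    Missing,
    Unreadable,
    Readable { bytes: u64 },
}

/// Application-layer port: the only place that touches storage.
trait ArtifactStore {
    fn probe(&self, path: &str) -> Probe;
}

/// Application-layer check built on the port; `max_bytes` is an assumed limit.
fn check_availability(
    store: &dyn ArtifactStore,
    paths: &[&str],
    max_bytes: u64,
) -> Vec<Diagnostic> {
    let mut out = Vec::new();
    for p in paths {
        match store.probe(p) {
            Probe::Missing => out.push(Diagnostic::ArtifactUnavailable(p.to_string())),
            Probe::Unreadable => out.push(Diagnostic::ArtifactUnreadable(p.to_string())),
            Probe::Readable { bytes } if bytes > max_bytes => {
                out.push(Diagnostic::ArtifactOversized { path: p.to_string(), bytes })
            }
            Probe::Readable { .. } => {}
        }
    }
    out
}
```

Keeping `duplicate_paths` free of I/O means it can be tested with plain string slices, while `check_availability` can be exercised with an in-memory fake of the port.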

Tests

  • cargo test -p wesley-holmes
  • git diff --check
  • pnpm run preflight
  • pre-push Rust product preflight

@coderabbitai
coderabbitai Bot commented May 27, 2026

Warning

Review limit reached

@flyingrobots, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 6 minutes and 16 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 27d18db4-4aef-4f6a-acf9-4e8d81f3dbce

📥 Commits

Reviewing files that changed from the base of the PR and between c0e3b16 and 69391b2.

📒 Files selected for processing (12)
  • CHANGELOG.md
  • crates/wesley-holmes/README.md
  • crates/wesley-holmes/src/application/evidence_validation.rs
  • crates/wesley-holmes/src/application/mod.rs
  • crates/wesley-holmes/src/domain/diagnostic.rs
  • crates/wesley-holmes/src/domain/evidence.rs
  • crates/wesley-holmes/src/domain/mod.rs
  • crates/wesley-holmes/src/domain/versioning.rs
  • crates/wesley-holmes/src/lib.rs
  • crates/wesley-holmes/src/ports/mod.rs
  • crates/wesley-holmes/tests/foundation.rs
  • docs/BEARING.md

@github-actions

🔍 The Case of Pull Request #536

Plain-English Readout

  • Holmes (evidence investigation): Holmes says this change looks ready to ship.
  • Watson (independent verification): Watson found verification concerns. Most important concern: No evidence citations were available for trust analysis.
  • Moriarty (trend forecast): Moriarty does not have enough historical data yet to forecast readiness.

Suggested next actions

  1. Resolve Watson’s verification concerns before trusting the Holmes verdict as final.
📚 Glossary (what the Holmes terms mean)
  • HOLMES: Wesley’s main evidence investigation. It decides whether the cited proof is strong enough to justify shipping this commit.
  • WATSON: An independent verification pass. It checks Holmes’s citations and score math instead of trusting them blindly.
  • MORIARTY: A readiness forecast over time. It is advisory trend analysis, not the release gate itself.
  • Schema coverage score (SCS): How much of the schema has direct supporting evidence across generated artifacts and cited proof.
  • Test confidence index (TCI): How much test evidence exists for constraints, policies, relationships, and operations.
  • Migration risk index (MRI): How risky the schema change is to roll out. Lower is better.
  • Evidence trust: Whether the report is backed by exact citations, whole-file citations, or coarse references. Weak trust means the claim may be directionally right but not specific enough to trust blindly.
  • Citation quality: A count of exact line-span citations versus whole-file or coarse references.
  • ELEMENTARY: Ready to ship based on the current evidence.
  • REQUIRES INVESTIGATION: More work or review is needed before shipping.
  • YOU SHALL NOT PASS: Do not ship this change in its current state.

🕵️ SHA-lock HOLMES full report

🕵️ SHA-lock HOLMES Investigation

  • Generated: 2026-01-01T00:00:00.000Z
  • Commit SHA: ff8b73b
  • Bundle Version: 2.0.0

⚠️ Evidence valid only for commit ff8b73b

🔍 Executive Deduction

"Watson, after careful examination of the evidence, I deduce..."

Weighted Completion: ██████████ 95.0%
Scores: SCS 95.0% · TCI 90.0% · MRI 10.0%
Verification Status: 2 claims verified
Citation Quality: 2 exact · 0 whole-file · 0 coarse
Evidence Trust: strong
Ship Verdict: ELEMENTARY

🧩 SCS Breakdown

| Component | Score | Coverage |
| --- | --- | --- |
| Sql | 100.0% | 1.00/1.00 |
| Types | 100.0% | 1.00/1.00 |
| Validation | 100.0% | 1.00/1.00 |
| Tests | 100.0% | 1.00/1.00 |

🧪 TCI Breakdown

| Component | Score | Coverage | Note |
| --- | --- | --- | --- |
| Unit Constraints | 100.0% | 1/1 | N/A |
| Unit Rls | 100.0% | 1/1 | N/A |
| Integration Relations | 100.0% | 1/1 | N/A |
| E2e Ops | 90.0% | 9/10 | fixture |

⚠️ MRI Breakdown

| Component | Risk Share | Points | Count |
| --- | --- | --- | --- |
| Drops | 0.0% | 0 | 0 |
| Renames Without Uid | 0.0% | 0 | 0 |
| Add Not Null Without Default | 100.0% | 1 | 1 |
| Non Concurrent Indexes | 0.0% | 0 | 0 |

📊 The Weight of Evidence

"Observe, Watson, how not all features carry equal importance..."

| Element | Weight | Status | Evidence | Strength | Deduction |
| --- | --- | --- | --- | --- | --- |
| schema | 5 | ✅ Exact SQL & tests | test/fixtures/examples/.wesley-cache/shipme-fixture/tests.sql:1-1@ff8b73b | exact | Elementary! |

🚪 Security & Performance Gates

"Elementary security measures, Watson..."

| Gate | Status | Evidence | Holmes's Ruling |
| --- | --- | --- | --- |
| Migration Risk | | MRI: 10.0% | "Trivial risk" |
| Test Coverage | | TCI: 90.0% | "Excellent coverage" |
| Sensitive Fields | | 0 fields | "All secured" |
| Evidence Quality | | 2 exact · 0 whole-file · 0 coarse | "All 2 citations resolve to exact line spans." |

📋 The Verdict

ELEMENTARY - Ship immediately!
"The evidence is conclusive. No mysteries remain."

Signed and sealed,

  • S. Holmes, Consulting Detective

[END OF INVESTIGATION FOR COMMIT ff8b73b]

🧵 Command Run

  • Run ID: run-b4b762b2-8de2-4e88-a746-25549c6b6022
  • Transmutation: holmes-investigate
  • Command: investigate
  • Status: completed
  • Ledger: /home/runner/work/wesley/wesley/test/fixtures/examples/.wesley-cache/ledger

🩺 Dr. WATSON full report

🩺 Dr. Watson's Independent Verification Report

Medical Examination of Evidence

  • Examination Date: 2026-05-27T02:17:54.983Z
  • Patient SHA: ff8b73b

🔬 Citation Verification

"Let me examine each piece of evidence independently..."

  • Citations Examined: 2
  • Verified: 0 ✅
  • Failed: 0 ❌
  • Unable to Verify: 2
  • Exact Subrange Citations: 0
  • Whole-file Citations: 0
  • Coarse Citations: 0
  • Evidence Trust: missing
  • Trust Note: No evidence citations were available for trust analysis.

Verification Rate: 0.0%

📊 Mathematical Verification

"I shall recalculate Holmes's arithmetic..."

Holmes claimed SCS: 95.0%
Watson calculates: 100.0%
Difference: ⚠️ Significant

🔍 Consistency Analysis

"Checking for contradictions in Holmes's deductions..."

✅ No logical inconsistencies detected

🩺 Dr. Watson's Medical Opinion

VERIFICATION: CONCERNS NOTED ⚠️

"While Holmes's methods are generally sound, I have noted some"
"discrepancies that warrant further investigation. No evidence citations were available for trust analysis."

Respectfully submitted,

  • Dr. J. Watson, M.D.
    Medical Examiner & Verification Specialist

🧵 Command Run

  • Run ID: run-0072d96f-6622-423e-856e-ace2bc3250c3
  • Transmutation: watson-verify
  • Command: verify
  • Status: completed
  • Ledger: /home/runner/work/wesley/wesley/test/fixtures/examples/.wesley-cache/ledger

🔮 Professor MORIARTY full report

🧠 Professor Moriarty's Temporal Predictions

The Mathematics of Inevitability

  • Analysis Date: 2026-05-27T02:18:28.220Z

INSUFFICIENT DATA

"I require at least two data points to predict the future."
"Run Wesley generate multiple times to build history."

🧵 Command Run

  • Run ID: run-f8364f59-6010-465e-8720-dd107f73dc25
  • Transmutation: moriarty-predict
  • Command: predict
  • Status: completed
  • Ledger: /home/runner/work/wesley/wesley/test/fixtures/examples/.wesley-cache/ledger

Machine-readable reports: holmes-report.json · watson-report.json · moriarty-report.json (see workflow artifacts).


Filed at 221B Repository Street

@flyingrobots merged commit 8feb49d into main on May 27, 2026
20 checks passed
@flyingrobots deleted the holmes-validation-gate-slices-11-15 branch on May 27, 2026 02:19

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 69391b2a5b

);
}

if let Some(schema_version) = artifact.schema_version.as_deref() {

[P2] Validate missing artifact schema versions

When a bundle omits schemaVersion on a referenced artifact, this branch skips the registry instead of emitting HlawSchemaVersionMissing. The bundled ArtifactRef::new(...) path used by the clean fixture therefore validates as Valid, even though HLAW-046/HIMP-014 require every artifact family to fail closed on missing or unsupported versions. As a result, the first validation gate accepts law diff, coverage, capability, and manifest artifacts whose format is unknown, so later ingest can parse incompatible evidence instead of returning a deterministic validation diagnostic.
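
One way to make this branch fail closed is to match on all three cases rather than using `if let Some(...)`. The following is a rough sketch only: `ArtifactRef`, the `SUPPORTED` list, `check_schema_version`, and the `HlawSchemaVersionUnsupported` variant are stand-ins assumed for illustration; only `schema_version` and `HlawSchemaVersionMissing` are named in this thread.

```rust
/// Hypothetical stand-ins for the crate's types.
#[derive(Debug, PartialEq, Eq)]
enum Diagnostic {
    HlawSchemaVersionMissing { path: String },
    // Assumed variant name for the "unsupported" case.
    HlawSchemaVersionUnsupported { path: String, found: String },
}

struct ArtifactRef {
    path: String,
    schema_version: Option<String>,
}

/// Assumed registry of supported artifact schema versions.
const SUPPORTED: &[&str] = &["1.0.0"];

/// Fail closed: a missing version is an error, not a skipped check.
fn check_schema_version(artifact: &ArtifactRef) -> Result<(), Diagnostic> {
    match artifact.schema_version.as_deref() {
        None => Err(Diagnostic::HlawSchemaVersionMissing {
            path: artifact.path.clone(),
        }),
        Some(v) if SUPPORTED.iter().any(|s| *s == v) => Ok(()),
        Some(v) => Err(Diagnostic::HlawSchemaVersionUnsupported {
            path: artifact.path.clone(),
            found: v.to_string(),
        }),
    }
}
```

With an exhaustive `match`, the compiler forces the `None` case to be handled explicitly, so an artifact without a version can no longer pass silently.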
