Summary
5 unsound reductions were merged in PRs #779 and #972 (see #1006). All would have been caught by the /verify-reduction skill (PR #979), which produces Python constructor + adversary scripts with YES/NO instance testing. But the skill wasn't required — the rules went straight from issue description to implementation.
Meanwhile, PR #992 applied /verify-reduction to 34 proposed reductions and caught 8 refuted ones before they were ever implemented. The methodology works; it just wasn't mandatory.
Root Cause
The add-rule and issue-to-pr skills don't require verification before implementation. The implementation pipeline is:
[Rule] issue → issue-to-pr → add-rule → implement → review-structural → merge
The verification pipeline exists separately:
[Rule] issue → verify-reduction → constructor + adversary (YES+NO) → verified artifacts
These two pipelines are disconnected. A rule can be implemented and merged without ever running the verification.
Proposal
Wire /verify-reduction into the implementation pipeline as a mandatory gate:
1. add-rule skill — require NO-instance test
Add to the testing requirements: a test that constructs an infeasible source instance, reduces it, and brute-force verifies the target is also infeasible. This is the minimum bar — 5-10 lines of test code that would have caught all 5 unsound reductions.
#[test]
fn test_source_to_target_no_instance() {
let source = Source::new(/* infeasible instance */);
let reduction = source.reduce_to();
let result = BruteForce::new().find_witness(&reduction.target_problem());
assert!(result.is_none(), "NO instance should have no feasible target solution");
}
2. issue-to-pr skill — run verify-reduction first
When processing a [Rule] issue, check if /verify-reduction artifacts exist. If not, run it before generating the implementation plan. This ensures the mathematical correctness is established before any Rust code is written.
3. review-structural skill — check for NO-instance test
Add to the structural checklist: "Does a test_*_no_instance (or equivalent adversarial test) exist in the test file?"
Evidence
| Approach |
Reductions checked |
Unsound found |
False negative rate |
| Current pipeline (no verification) |
43 (PRs #777-#972) |
5 found post-merge |
12% |
| verify-reduction pipeline |
34 (PR #992) |
8 caught pre-implementation |
0% (none merged) |
Related
Summary
5 unsound reductions were merged in PRs #779 and #972 (see #1006). All would have been caught by the
/verify-reductionskill (PR #979), which produces Python constructor + adversary scripts with YES/NO instance testing. But the skill wasn't required — the rules went straight from issue description to implementation.Meanwhile, PR #992 applied
/verify-reductionto 34 proposed reductions and caught 8 refuted ones before they were ever implemented. The methodology works; it just wasn't mandatory.Root Cause
The
add-ruleandissue-to-prskills don't require verification before implementation. The implementation pipeline is:The verification pipeline exists separately:
These two pipelines are disconnected. A rule can be implemented and merged without ever running the verification.
Proposal
Wire
/verify-reductioninto the implementation pipeline as a mandatory gate:1.
add-ruleskill — require NO-instance testAdd to the testing requirements: a test that constructs an infeasible source instance, reduces it, and brute-force verifies the target is also infeasible. This is the minimum bar — 5-10 lines of test code that would have caught all 5 unsound reductions.
2.
issue-to-prskill — run verify-reduction firstWhen processing a
[Rule]issue, check if/verify-reductionartifacts exist. If not, run it before generating the implementation plan. This ensures the mathematical correctness is established before any Rust code is written.3.
review-structuralskill — check for NO-instance testAdd to the structural checklist: "Does a
test_*_no_instance(or equivalent adversarial test) exist in the test file?"Evidence
Related