Add tested examples, improvement plan, and updated docs for all Python use cases by Copilot · Pull Request #2 · sorunokoe/PureReason

Copilot · 2026-06-02T07:13:16Z

Summary

Add 6 tested example files covering every Python API surface, rewrite examples/README.md, and create a phased improvement plan based on hands-on exploration.

Type

Description

The existing examples were generic and untested — simple_verification.py showed 3 hardcoded claims, and the README referenced verify_chain(llm_output) with the wrong call signature (missing steps param). New consumers had no way to understand the full API surface or validate their setup.

New examples (`examples/`)

guard_verification.py — ReasoningGuard ECS scoring, threshold tuning, arithmetic repair, degradation tracking
chain_of_thought.py — verify_chain multi-step reasoning with contradiction detection
arithmetic_solver.py — solve_arithmetic, _safe_eval, _extract_numbers, _detect_operation
syllogism_verification.py — verify_syllogism with Z3 + heuristic fallacy detection
mcq_picker.py — pick_best_answer with strict/lenient tie handling
arithmetic_repair.py — _repair_arithmetic_in_step, answer extraction, majority vote
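The helpers listed for arithmetic_solver.py are not reproduced in this description. As a rough illustration of what safe expression evaluation can look like, the sketch below walks a parsed AST and accepts only numeric literals and basic operators. The name `safe_eval_sketch` is hypothetical; this is not the project's `_safe_eval`, whose behavior may differ.

```python
import ast
import operator

# Whitelisted operators; any other node type is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval_sketch(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_walk(node.left), _walk(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_walk(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _walk(ast.parse(expression, mode="eval").body)
```

Function calls, names, and attribute access all fall through to the `ValueError`, so an input such as `__import__('os')` is rejected rather than executed.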

Tests

tests/test_examples.py — 36 tests covering all example use cases (mocked where Rust binary needed, direct for pure-Python)
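One common way to mock a binary-dependent code path, shown here as a sketch rather than as the contents of tests/test_examples.py: patch `subprocess.run` so the test never launches the compiled program. The wrapper `run_external_verifier` and the command name `hypothetical-verifier` are placeholders invented for illustration, and it is an assumption that the project reaches its Rust binary through `subprocess` at all.

```python
import subprocess
import unittest
from unittest import mock


def run_external_verifier(claim: str) -> str:
    """Hypothetical wrapper: shells out to a binary and returns its stdout."""
    completed = subprocess.run(
        ["hypothetical-verifier", claim],  # placeholder command, not a real CLI
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


class ExternalVerifierTest(unittest.TestCase):
    def test_binary_call_is_mocked(self):
        fake = subprocess.CompletedProcess(
            args=[], returncode=0, stdout="ok\n", stderr=""
        )
        # Patch subprocess.run so the test does not need the compiled binary.
        with mock.patch("subprocess.run", return_value=fake) as patched:
            self.assertEqual(run_external_verifier("2 + 2 = 4"), "ok")
        patched.assert_called_once()
```

Pure-Python helpers need no patching and can be called directly in the test body.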

Docs

examples/README.md — Full rewrite: per-use-case sections with code, expected output, edge cases, install prerequisites per example group
README.md — Fixed wrong verify_chain signature, added optional dependency install docs, linked new examples table
docs/IMPROVEMENT-PLAN.md — Findings from exploration + 5-phase roadmap (robustness → test coverage → API ergonomics → performance)

Bugs fixed

README showed verify_chain(llm_output) — correct signature is verify_chain(problem, steps)
README referenced result.ecs on EpistemicChainReport (doesn't exist; it's result.chain_confidence)
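To make the two corrections concrete, here is a minimal sketch of the corrected call shape. Both definitions are stand-ins written for this illustration, not the library's code: the only details taken from this PR are the parameter names `problem` and `steps`, the class name `EpistemicChainReport`, and the attribute `chain_confidence`. Passing `steps` as a list of strings and the placeholder scoring are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EpistemicChainReport:
    """Stand-in carrying only the attribute this PR names."""

    chain_confidence: float


def verify_chain(problem: str, steps: list[str]) -> EpistemicChainReport:
    """Stand-in with the corrected two-argument signature; scoring is a placeholder."""
    return EpistemicChainReport(chain_confidence=1.0 if steps else 0.0)


result = verify_chain(
    "What is 12 * 3 + 4?",
    ["12 * 3 = 36", "36 + 4 = 40"],
)
confidence = result.chain_confidence  # not result.ecs

# The old README call shape passes a single argument and fails.
try:
    verify_chain("12 * 3 = 36, so the answer is 40")
    old_call_error = None
except TypeError as exc:
    old_call_error = exc
```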

CI fixes

Fixed ruff F541 lint error: removed extraneous f prefix from f" ---" in examples/syllogism_verification.py
Fixed ruff RUF059 lint error: renamed unused unpacked variable report → _report in tests/test_examples.py
Fixed ruff formatting issues in examples/arithmetic_solver.py, examples/guard_verification.py, and tests/test_examples.py
Added missing click dependency to CI install step in .github/workflows/lint.yml (required by spacy but not resolved transitively)
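For readers unfamiliar with the two Ruff rules, the sketch below shows the shape of each fix. The function names are hypothetical and the code is not copied from the repository; only the rule codes and the `f" ---"` and `report` → `_report` changes come from this PR.

```python
def section_divider() -> str:
    # Before: return f" ---"   (F541: f-string without any placeholders)
    return " ---"


def run_check() -> tuple[bool, dict]:
    """Hypothetical helper returning a pass flag and a report."""
    return True, {"issues": []}


def check_passed() -> bool:
    # Before: passed, report = run_check()   (RUF059: `report` is never used)
    passed, _report = run_check()
    return passed
```

The leading underscore tells Ruff the unpacked value is unused on purpose.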

Testing

Python unit tests pass: python3 -m unittest discover -s tests -p "test_*.py" -v
Ruff clean: ruff check . && ruff format --check .
Rust tests pass (if Rust code changed): cargo test
Benchmark regression checked (if detection logic changed): python3 benchmarks/run_benchmarks.py

Benchmark impact (detection PRs only)

N/A — no detection logic changed.

Checklist

No LLM, API key, or external model dependency introduced
All Python files ≤ 400 lines (exceptions need justification)
Public API documented (docstrings / /// for Rust)
No commented-out code left in
.github/CONTRIBUTING.md consulted for code style guidelines

Related issues

- Add 6 new example files covering all Python use cases: guard_verification.py, chain_of_thought.py, arithmetic_solver.py, syllogism_verification.py, mcq_picker.py, arithmetic_repair.py
- Add tests/test_examples.py with 36 tests validating all examples
- Rewrite examples/README.md with per-use-case documentation
- Create docs/IMPROVEMENT-PLAN.md with findings and roadmap
- Update README.md with accurate API references, install docs, and links to new examples and improvement plan

sorunokoe · 2026-06-02T07:40:07Z

@copilot Fix issues in CI.


Copilot · 2026-06-02T07:43:26Z

> @copilot Fix issues in CI.

Fixed all CI issues:

Ruff F541 — removed extraneous f prefix from f" ---" in examples/syllogism_verification.py:109
Ruff RUF059 — renamed unused report to _report in tests/test_examples.py:311
Ruff format — auto-formatted 3 files that had formatting drift
Missing click — added click to the CI install step in .github/workflows/lint.yml (spacy requires it but wasn't resolving it transitively)

Initial plan

929cce9

Copilot AI assigned Copilot and sorunokoe Jun 2, 2026

Copilot started work on behalf of sorunokoe June 2, 2026 07:13

Copilot AI linked an issue Jun 2, 2026 that may be closed by this pull request

Explore use cases #1

Open

Copilot AI added 3 commits June 2, 2026 07:18

Initial exploration - no changes yet

8209aec

Address review feedback: clarify comments and install docs

5eb5ae4

Copilot AI changed the title ~~[WIP] Update use cases with detailed report and examples~~ Add tested examples, improvement plan, and updated docs for all Python use cases Jun 2, 2026

Copilot finished work on behalf of sorunokoe June 2, 2026 07:27

Copilot AI requested a review from sorunokoe June 2, 2026 07:27

Copilot started work on behalf of sorunokoe June 2, 2026 07:40

Fix CI: ruff lint errors (F541, RUF059), format issues, and missing click dependency

d96e81c

Copilot finished work on behalf of sorunokoe June 2, 2026 07:44

Copilot wants to merge 5 commits into main from copilot/explore-use-cases
