Autolean is a command-line tool that converts a directory of JSON-encoded math problems into Lean 4 files via an LLM API, then compiles, repairs, and evaluates them for semantic fidelity against the original problems. It is designed for repeatability, logging, and easy integration with an existing Lean + Mathlib environment.
By default, model calls use the OpenRouter API. You can switch model calls to local `codex exec`
with `--use-codex-exec` while keeping the rest of the pipeline unchanged.
The core autolean run command converts JSON math problems into compilable Lean 4 theorem statements.
```text
*.json files
     |
     v
+---------------------+
| Validate JSON       |
| (uuid + problem)    |
+---------------------+
     |
     v
     +---------------------+
+--->| Phase 5.2: Think    | (iteration 1 only)
|    | proof strategy,     |
|    | candidate lemmas    |
|    +---------------------+
|         |
|         v
|    +---------------------+
|    | Phase 5.3: Code     |
|    | generate Lean 4     |
|    | {"lean": "..."}     |
|    +---------------------+
|         |
|         v
|    +---------------------+
|    | Anti-trivialization |
|    | guard (reject       |
|    | `: True`/`: False`) |
|    +---------------------+
|         |
|         v
|    +---------------------+
|    | Compile             |
|    | lake env lean ...   |
|    +---------------------+
|         |         |
|       fail       pass
|         |         |
|         v         v
|    +-----------+  +---------------------+
|    | iter < N? |  | Semantic eval       |
|    +-----------+  | grade A-D vs JSON   |
|    yes |  | no    +---------------------+
|        |  |            |           |
+--------+  |       grade >= B   grade < B
            |            |           |
            v            v           |
          FAIL        *.lean     iter < N?
                      written   yes |  | no
                                    |  |
                       +------------+  |
                       |               v
                       |             FAIL
                       |
                       +---> retry phase 5.3
                             with eval feedback
```
The proving scripts take existing Lean files with placeholder proofs (`sorry`) and attempt to fill in complete proofs.
```text
*.lean with sorry
        |
        v
+-----------------------------+
| Launch K independent        |
| proof attempts in parallel  |
+-----------------------------+
    |       |        |
    v       v        v
[Attempt 1] [Attempt 2] ... [Attempt K]
    |
    v
+-------------------+
| LLM: generate     |
| proof body        |
+-------------------+
    |
    v
+-------------------+
| Reject if:        |
| - still sorry     |
| - frozen changed  |
| - new axiom       |
+-------------------+
    |
    v
+-------------------+
| Compile           |
+-------------------+
    |       |
  fail     pass -------> record pass
    |
    v
iter < max-iters?
yes |  | no
    |  v
    | record fail
    v
retry with compiler
feedback (same attempt)
```
```text
*.lean with sorry
        |
        v
+-----------------------------+
| Plan round 1                |
| Planner LLM: proof plan     |
+-----------------------------+
        |
        v
+-----------------------------+
| Prove (up to N attempts)    |
| Prover LLM + compile loop   |
+-----------------------------+
    |            |
  pass       all fail
    |            |
    v            v
 *.lean   +---------------------------+
  done    | rounds < max-plan-rounds? |
          +---------------------------+
          yes |  | no
              |  v
              | FAIL
              v
+-----------------------------+
| Replan                      |
| Planner LLM: new plan       |
| using failure report        |
+-----------------------------+
        |
        v
+-----------------------------+
| Prove (up to N attempts)    |
| with new plan               |
+-----------------------------+
    |            |
  pass       all fail
    |            |
    v            v
 *.lean   repeat or FAIL
  done
```
Each input file is a *.json file placed in the input directory (default: problems/).
| Field | Type | Description |
|---|---|---|
| `uuid` | string | Identifier for the problem (kept in the prompt as metadata; not used for file naming). |
| `problem` | string[] | Array of natural-language statements describing the problem. Lines are concatenated in order. May include LaTeX. |
Any other fields (`solution`, `remark`, `reference`, `figures`, etc.) are preserved in the prompt but ignored by the parser.
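For reference, the required-field check can be sketched in a few lines of Python. This is illustrative only: the field names `uuid` and `problem` come from the table above, but the function name and exact error messages are made up.

```python
import json

def validate_problem(payload: dict) -> list[str]:
    """Return schema errors for one Autolean problem JSON (illustrative check only)."""
    errors = []
    uuid = payload.get("uuid")
    if not isinstance(uuid, str) or not uuid:
        errors.append("uuid must be a non-empty string")
    problem = payload.get("problem")
    if not isinstance(problem, list) or not problem or not all(isinstance(p, str) for p in problem):
        errors.append("problem must be a non-empty array of strings")
    return errors

doc = json.loads('{"uuid": "chapter1/problem-1", "problem": ["Prove that A = A."]}')
print(validate_problem(doc))  # []
```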
File: `problems/set_union_comm.json`

```json
{
  "uuid": "chapter1/problem-1",
  "problem": [
    "Prove that A ∪ B = B ∪ A.",
    "Assume A, B are sets."
  ],
  "solution": ["ignored by Autolean"],
  "remark": []
}
```

A single JSON file can contain multiple sub-problems when `problem` has more than one entry. Use the split script to break it into individual files before running Autolean:
File: `problems/14.json` (before splitting)

```json
{
  "uuid": "chapter4/problem-14",
  "problem": [
    "(1) Prove that every closed subset of a compact space is compact.",
    "(2) Prove that a compact subset of a Hausdorff space is closed.",
    "(3) Prove that the continuous image of a compact set is compact.",
    "(4) Prove that a continuous bijection from a compact space to a Hausdorff space is a homeomorphism."
  ]
}
```

After running `python scripts/split_subproblems.py --input-dir problems`, this produces:

- `problems/14_1.json` — sub-problem (1)
- `problems/14_2.json` — sub-problem (2)
- `problems/14_3.json` — sub-problem (3)
- `problems/14_4.json` — sub-problem (4)
Autolean recognizes multipart chains by the `<name>_<index>.json` naming pattern (e.g., `14_1.json`, `14_2.json`). These are processed sequentially, with prior sub-question context injected into later parts.
Rules:

- Indices must start at `1` and be contiguous (no gaps).
- A single `<name>_1.json` without further parts is treated as a standalone problem.
- If a chain has index gaps, the entire main question is marked failed and skipped.
- If part `k` fails, parts `k+1...` are skipped and Autolean continues with the next main question.
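The chain-grouping and contiguity rules above can be sketched as follows. This is an illustrative approximation, not Autolean's actual code; the function names are made up.

```python
import re
from collections import defaultdict

CHAIN_RE = re.compile(r"^(?P<name>.+)_(?P<index>\d+)$")

def group_chains(stems: list[str]) -> dict[str, list[int]]:
    """Group JSON filename stems into multipart chains by the <name>_<index> pattern."""
    chains: defaultdict[str, list[int]] = defaultdict(list)
    for stem in stems:
        m = CHAIN_RE.match(stem)
        if m:
            chains[m.group("name")].append(int(m.group("index")))
    return {name: sorted(ix) for name, ix in chains.items()}

def is_valid_chain(indices: list[int]) -> bool:
    """Indices must start at 1 and be contiguous (no gaps)."""
    return indices == list(range(1, len(indices) + 1))

print(group_chains(["14_1", "14_2", "15_1"]))  # {'14': [1, 2], '15': [1]}
```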
For each *.json file in the input directory:
- Validates required fields (`uuid`, `problem`). For multipart chains, Autolean processes them in order and passes prior sub-question context forward.
- Builds a prompt that embeds the full JSON (authoritative).
- Runs phase 5.2 (Thinking) only on iteration 1: derive proof idea and likely lemma calls.
- Runs phase 5.3 (Codex implementation): writes Lean using the phase 5.2 notes. By default this is formalization-only mode (statement + placeholder `by sorry`, not a full solved proof). Autolean rejects trivialized main theorem statements like `: True` or `: False`.
- Compiles the Lean file (configurable command).
- If compilation fails, retries phase 5.3 with compiler feedback up to `--max-iters` (reusing iteration-1 phase 5.2 notes).
- If compilation succeeds, runs a semantic evaluator that grades alignment with the original problem (`A` to `D`; default evaluator: `openai/gpt-5.2` at `xhigh`). If evaluator output is malformed, Autolean records the parse reason and automatically re-prompts evaluation. Autolean enforces a minimum grade threshold by default (`B`); lower grades are rejected and retried.
- Writes logs for every iteration.
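The anti-trivialization guard can be approximated as below. This is a deliberately simplified sketch: the regex ignores binders and hypotheses before the goal colon, and the real check in Autolean may be stricter.

```python
import re

# Simplified guard: flag a main theorem whose stated goal is literally True or False.
# (Assumption: theorem head of the form `theorem <name> : <goal> :=`; binders are not handled.)
TRIVIAL_GOAL_RE = re.compile(r"theorem\s+\w+\s*:\s*(True|False)\s*:=")

def is_trivialized(lean_source: str) -> bool:
    """True when the main theorem statement is a trivial `: True` / `: False` goal."""
    return bool(TRIVIAL_GOAL_RE.search(lean_source))

print(is_trivialized("theorem problem_x : True := by trivial"))  # True
```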
For multipart chains:
- By default, part `k+1` does not start until part `k` is accepted.
- By default, each part in a multipart chain must reach grade `A` before the next part starts (`--multipart-min-eval-grade A`).
- You can disable chain blocking and continue later parts even if an earlier part fails with `--no-multipart-block-on-failure`.
- You can bypass eval-grade gating during resume/skip checks with `--skip-compiled-ignore-eval-grade`, so any sub-question whose output already compiles is skipped.
- You can enable fast resume with `--autopass-eval-a` to skip compile-check for files whose latest eval grade is already `A`.
- You can also enable fast resume with `--autopass-has-eval` to skip compile-check for files that already have any eval artifact, regardless of grade.
- Later parts receive earlier sub-question JSON and earlier Lean formalizations as prerequisite context.
- Deterministic file naming from input filenames (with sanitization and CJK→pinyin transliteration).
- Strict model output contract: a JSON object with a `lean` string.
- One-time planning layer (5.2 thinking at iteration 1), then coding-only repairs (5.3).
- Multipart chain mode (`*_1`, `*_2`, ...): sequential dependency, prior-part context injection, and prerequisite theorem reuse.
- Statement-only formalization by default (`--formalization-only`), with optional full-proof mode (`--no-formalization-only`).
- Hard anti-trivialization guard for the main theorem statement (`: True` / `: False` is rejected).
- Compile-check-repair loop with configurable iterations.
- Post-compile semantic distance grading (`A`–`D`) against the original JSON problem.
- Optional policy to reject `sorry`.
- Resume/skipping when outputs already compile (override with `--force`).
- Optional parallel workers.
- Detailed per-iteration logs and metadata.
- Python 3.10+.
- OpenRouter API key in variable `AUTOLEAN_API` (default). The runner first checks the environment, then falls back to `AUTOLEAN_API=...` in `~/.zshrc`.
- Lean 4 + Lake available on PATH in the environment used for compilation.
- Mathlib installed in the Lean project used for compilation (the generated Lean always uses `import Mathlib`).
```shell
python -m venv .venv
source .venv/bin/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
pip install -e ".[dev]"
```

```shell
export AUTOLEAN_API="your_api_key"
autolean run --input problems --output Formalizations
```

If a JSON file contains multiple sub-problems in `problem`, you can split it into individual files:
```shell
python scripts/split_subproblems.py --input-dir problems
```

By default, after a successful split the original multi-part source file is deleted.
Use `--no-delete-original` if you want to keep the source file.
Useful options:

```shell
# Preview without writing files
python scripts/split_subproblems.py --input-dir problems --dry-run

# Overwrite existing split files
python scripts/split_subproblems.py --input-dir problems --force

# Write split files into a separate directory
python scripts/split_subproblems.py --input-dir problems --output-dir problems_split

# Keep original multi-part files
python scripts/split_subproblems.py --input-dir problems --no-delete-original
```

Use this to compile-check all formalizations for a folder and print only failed files.
```shell
python scripts/check_lean_formalizations.py \
  --input-dir problems \
  --formalizations-dir Formalizations \
  --cwd /path/to/lean/project \
  --compile-cmd "lake env lean {file}" \
  --workers 4
```

Notes:

- If `--input-dir` contains `*.json`, expected formalizations are `problem_<json_stem>.lean`.
- If `--formalizations-dir` is omitted, the checker auto-uses `<input-dir>_Fin` when present.
- Exit code is `1` when any compilation fails (or when expected files are missing, unless `--no-strict-missing` is set).
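The JSON-to-formalization mapping from the first note can be sketched as follows (the helper name is hypothetical):

```python
from pathlib import Path

def expected_lean_path(json_path: str, formalizations_dir: str) -> Path:
    """Map an input problem JSON to its expected formalization: problem_<json_stem>.lean."""
    return Path(formalizations_dir) / f"problem_{Path(json_path).stem}.lean"

print(expected_lean_path("problems/14_1.json", "Formalizations").as_posix())
# Formalizations/problem_14_1.lean
```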
Use this to evaluate existing formalizations with the exact same semantic-eval prompt and parse/retry logic used by `autolean run`.
Codex Exec (default provider, gpt-5.2 + xhigh):

```shell
python scripts/evaluate_lean_formalizations.py \
  --input-dir problems \
  --formalizations-dir Formalizations \
  --out-dir Formalizations/eval_out \
  --provider codex-exec \
  --eval-model openai/gpt-5.2 \
  --reasoning-effort xhigh \
  --workers 10
```

OpenRouter API:

```shell
python scripts/evaluate_lean_formalizations.py \
  --input-dir problems \
  --formalizations-dir Formalizations \
  --out-dir Formalizations/eval_out \
  --provider openrouter \
  --eval-model openai/gpt-5.2 \
  --reasoning-effort xhigh \
  --workers 10
```

Key outputs in `--out-dir`:
- `<theorem>.eval.json` — final normalized evaluation payload.
- `<theorem>.eval_attemptN_stdout.log` / `<theorem>.eval_attemptN_stderr.log`.
- `<theorem>.eval_prompt.txt` (exact prompt used).
- `evaluation_summary.json` and `evaluation_report.txt`.
Use this to run a stricter second-pass exactness audit on formalizations that already received a prior `A`
or a double-checked A/A result. The script uses OpenRouter by default, with `openai/gpt-5.4`,
reasoning effort `xhigh`, and the API key env var `AUTOLEAN_API`.
```shell
python scripts/evaluate_strict_aplus.py \
  --input-dir problems \
  --formalizations-dir Formalizations \
  --workers 10
```

Behavior:
- Searches the original problem JSON under `--input-dir`.
- Matches each JSON file to `problem_<json_stem>.lean` under `--formalizations-dir`, preserving relative subdirectories.
- Only evaluates targets whose latest prior eval is eligible:
  - top-level `status=ok` and `grade=A`, or
  - a double-check payload with `both_a_pass=true` and both graders returning `A`.
- Uses a stricter `A+` vs `A` rubric: `A+` means an exact match with nothing to change; `A` means still broadly faithful, but with at least one tiny difference that should be edited to become exact.
- Writes outputs under `--formalizations-dir/new_eval` by default.
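The eligibility rule can be sketched as below. The field names `status`, `grade`, and `both_a_pass` are taken from the eval payload description above; the function name is made up, and the real script may inspect the payload more carefully.

```python
def eligible_for_strict_audit(payload: dict) -> bool:
    """True when the latest eval is a plain grade-A pass or a double-check with both_a_pass."""
    if payload.get("status") == "ok" and payload.get("grade") == "A":
        return True
    return payload.get("both_a_pass") is True

print(eligible_for_strict_audit({"status": "ok", "grade": "A"}))  # True
```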
Key outputs in `new_eval`:

- `<relative path>/<theorem>.eval.json` — strict A+/A payload.
- `<relative path>/<theorem>.eval_attemptN_stdout.log` / `<relative path>/<theorem>.eval_attemptN_stderr.log`.
- `<relative path>/<theorem>.eval_prompt.txt`.
- `evaluation_summary.json` and `evaluation_report.txt`.
Use this to take an existing corpus of Lean files with placeholder proofs, launch several independent LLM proof attempts per theorem, compile-check every attempt, and count how many attempts pass.
OpenRouter API:

```shell
python scripts/prove_lean_formalizations.py \
  --input-dir Formalizations \
  --out-dir proof_runs \
  --provider openrouter \
  --model openai/gpt-5.2-codex \
  --reasoning-effort xhigh \
  --attempts 4 \
  --max-iters 3 \
  --cwd /path/to/lean/project \
  --compile-cmd "lake env lean {file}"
```

Codex Exec:

```shell
python scripts/prove_lean_formalizations.py \
  --input-dir Formalizations \
  --out-dir proof_runs \
  --provider codex-exec \
  --model openai/gpt-5.2-codex \
  --reasoning-effort xhigh \
  --attempts 4 \
  --max-iters 3 \
  --cwd /path/to/lean/project \
  --compile-cmd "lake env lean {file}" \
  --codex-exec-sandbox read-only
```

Behavior:
- Searches `--input-dir` recursively for `*.lean`.
- By default only processes files that still contain `sorry` or `admit`.
- Runs `--attempts` independent workers per theorem in parallel; workers do not share compiler feedback with each other.
- Each worker gets its own compile-repair loop up to `--max-iters`.
- Preserves the file outside the main theorem proof body; only the proof body is regenerated, with optional extra imports if needed.
- Rejects outputs that still contain `sorry`/`admit`, that change frozen file content outside the proof body, or that introduce top-level declarations.
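The first rejection rule can be sketched as below. This is a naive approximation (it does not exclude comments or string literals, and the real checker additionally enforces the frozen-content and top-level-declaration rules):

```python
import re

# Word-boundary match so identifiers like `sorryAx_free` are not flagged.
PLACEHOLDER_RE = re.compile(r"\b(sorry|admit)\b")

def has_placeholder(lean_source: str) -> bool:
    """True if the proof candidate still contains a sorry/admit placeholder."""
    return bool(PLACEHOLDER_RE.search(lean_source))

print(has_placeholder("theorem t : a = a := by sorry"))  # True
```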
Key outputs in `--out-dir`:

- `<relative theorem path without .lean>/attemptK/iterN.candidate.lean` for every generated proof candidate.
- `<relative theorem path without .lean>/attemptK/outcome.json` for each worker.
- `<relative theorem path without .lean>/summary.json` with pass count and one-shot pass count for that theorem.
- `proof_summary.json` and `proof_report.txt` with corpus-level pass and one-shot pass results.
Use this when you want exactly one theorem-local proving sequence per problem, but still want parallelism across different problems.
Pipeline per theorem:

- Ask a planner model for a proof plan.
- Ask a proving model to generate/repair the proof, feeding compiler output back after each failed compile.
- After `--attempts-before-replan` failed proof attempts, ask the planner for a new plan using the latest failure report.
- Run another proving block with the new plan.
- If all plan rounds fail, move on to the next theorem.
Resume behavior:

- Resume is enabled by default.
- If a theorem folder in `--out-dir` already has `summary.json`, that theorem is treated as finished and skipped.
- If a theorem folder has `progress.json`, the script resumes from the next unfinished attempt.
- If a theorem folder has partial planner/prover artifacts but no `progress.json`, the script reconstructs progress from the existing logs/candidates when possible and continues from there.
- Use `--no-resume` to force a fresh rerun from attempt 1 for every theorem.
Default models:

- planner: `openai/gpt-5.4` with `xhigh`
- prover: `google/gemini-3-flash-preview` with `xhigh`
Example:
```shell
python scripts/prove_lean_formalizations_replan.py \
  --input-dir Formalizations \
  --out-dir proof_runs_replan \
  --planner-model openai/gpt-5.4 \
  --planner-reasoning-effort xhigh \
  --prover-model google/gemini-3-flash-preview \
  --prover-reasoning-effort xhigh \
  --attempts-before-replan 5 \
  --max-plan-rounds 2 \
  --workers 8 \
  --cwd /path/to/lean/project \
  --compile-cmd "lake env lean {file}"
```

Fresh rerun instead of resuming:
```shell
python scripts/prove_lean_formalizations_replan.py \
  --input-dir Formalizations \
  --out-dir proof_runs_replan \
  --no-resume \
  --cwd /path/to/lean/project \
  --compile-cmd "lake env lean {file}"
```

Key outputs in `--out-dir`:
- `<relative theorem path without .lean>/plan_roundK.prompt.txt` and corresponding planner stdout/stderr logs.
- `<relative theorem path without .lean>/proof_attemptN.candidate.lean` for every generated proof candidate.
- `<relative theorem path without .lean>/proof_attemptN.compile_stdout.log` / `.compile_stderr.log`.
- `<relative theorem path without .lean>/progress.json` with theorem-level checkpoint state for resume.
- `<relative theorem path without .lean>/summary.json` with theorem-level pass/fail status and attempt history.
- `proof_summary.json` and `proof_report.txt` with corpus-level results.
Use this to generate two histograms from an existing proof-run directory:
- histogram of total passing attempts per worked-out problem
- histogram of one-shot passing attempts per worked-out problem
Example:
```shell
python scripts/plot_proof_histograms.py \
  --input-dir proof_runs \
  --out-dir proof_runs/histograms
```

Behavior:
- Searches `--input-dir` recursively for per-problem `summary.json`.
- Uses only worked-out problems, meaning `pass_count > 0`.
- Writes `worked_out_attempts_histogram.svg` and `worked_out_attempts_histogram.png`.
- Writes `one_shot_worked_out_attempts_histogram.svg` and `one_shot_worked_out_attempts_histogram.png`.
- Writes `histogram_data.json` with the exact bucket counts used for both plots.
```shell
autolean run \
  --input problems \
  --output Formalizations \
  --cwd /path/to/lean/project \
  --openrouter-thinking-model openai/gpt-5.2 \
  --openrouter-model openai/gpt-5.2-codex \
  --openrouter-thinking-reasoning-effort xhigh \
  --openrouter-coding-reasoning-effort high \
  --max-iters 6 \
  --force \
  --openrouter-web-search \
  --openrouter-web-search-max-results 5 \
  --workers 4
```

Post-compile A-D grading still runs automatically in this command using defaults:

- `--openrouter-eval-model openai/gpt-5.2`
- `--openrouter-eval-reasoning-effort xhigh`
- `--min-eval-grade B`
Common variants:

```shell
# Use 4 workers
autolean run --input problems --output Formalizations --workers 4

# Stream compiler output live into log files
autolean run --input problems --output Formalizations --live-logs

# Disable progress bar
autolean run --input problems --output Formalizations --no-progress

# Provide a custom compile command
autolean run --input problems --output Formalizations --compile-cmd "lake env lean {file}"

# Compile inside an existing Lean + Mathlib project
autolean run --input problems --output Formalizations --cwd /path/to/lean/project

# Use a specific OpenRouter model
autolean run --input problems --output Formalizations --openrouter-model anthropic/claude-3.5-sonnet

# Use different models for thinking and coding
autolean run --input problems --output Formalizations \
  --openrouter-thinking-model openai/gpt-4o-mini \
  --openrouter-model anthropic/claude-3.5-sonnet

# Convenience switch: use Gemini Flash Preview for both 5.2 and 5.3 on OpenRouter
autolean run --input problems --output Formalizations \
  --openrouter-gemini-flash-preview
# In this mode, evaluation is double-checked by both Gemini Flash and GPT-5.2 (xhigh),
# and the iteration passes only when both graders return A.

# Enable OpenRouter web search plugin
autolean run --input problems --output Formalizations \
  --openrouter-web-search \
  --openrouter-web-search-max-results 5

# Use local codex exec instead of OpenRouter API for 5.2/5.3/eval
autolean run --input problems --output Formalizations \
  --use-codex-exec \
  --codex-exec-sandbox read-only

# Override API key variable name
autolean run --input problems --output Formalizations \
  --openrouter-api-key-env MY_CUSTOM_KEY

# Use a specific evaluator model/effort for post-compile A-D grading
autolean run --input problems --output Formalizations \
  --openrouter-eval-model openai/gpt-5.2 \
  --openrouter-eval-reasoning-effort xhigh

# Allow full-proof generation instead of statement-only mode
autolean run --input problems --output Formalizations --no-formalization-only
```

Run `autolean --help` or `autolean run --help` to see all options. Key flags:
- `--input`: directory with `*.json` files (default: `problems`).
- `--output`: output directory for `*.lean` files (default: `Formalizations`).
- `--logs`: log directory (default: `logs`).
- `--max-iters`: max repair iterations per file (default: 6).
- `--formalization-only` / `--no-formalization-only`: statement-only formalization mode on/off (default: on).
- `--openrouter-model`: OpenRouter model identifier for phase 5.3 coding (default: `openai/gpt-5.2-codex`).
- `--openrouter-thinking-model`: optional model identifier for phase 5.2 thinking (iteration 1 only; default: same as `--openrouter-model`).
- `--openrouter-gemini-flash-preview` / `--no-openrouter-gemini-flash-preview`: convenience switch that sets both phase 5.2 and phase 5.3 OpenRouter models to `google/gemini-3-flash-preview` (default: off). When enabled on the OpenRouter path, evaluation is double-checked by Gemini Flash and `openai/gpt-5.2` at `xhigh`, and a pass requires grade `A` from both.
- `--openrouter-thinking-reasoning-effort`: reasoning effort for phase 5.2 (default: `xhigh`).
- `--openrouter-coding-reasoning-effort`: reasoning effort for phase 5.3 (default: `xhigh`).
- `--openrouter-eval-model`: OpenRouter model identifier for post-compile semantic evaluation (default: `openai/gpt-5.2`).
- `--openrouter-eval-reasoning-effort`: reasoning effort for post-compile semantic evaluation (default: `xhigh`).
- `--openrouter-eval-repair-retries`: retries when evaluator output is malformed/unparseable (default: `2`).
- `--min-eval-grade`: minimum acceptable evaluation grade (default: `B`; use `none` to disable).
- `--multipart-min-eval-grade`: minimum required grade per part in multipart chains before continuing to the next part (default: `A`; use `none` to disable).
- `--multipart-block-on-failure` / `--no-multipart-block-on-failure`: block/continue later sub-questions in a multipart chain after an earlier part fails (default: block).
- `--eval-grade-from-output-dir` / `--no-eval-grade-from-output-dir`: during resume/skip checks, read latest evaluation grades from `--output` instead of `--logs` (default: `--logs`).
- `--skip-compiled-ignore-eval-grade` / `--no-skip-compiled-ignore-eval-grade`: during resume/skip checks, treat a currently compiling `.lean` as skippable even when multipart eval-grade gates are configured (default: off).
- `--autopass-eval-a` / `--no-autopass-eval-a`: during resume/skip checks, if the latest eval grade is already `A`, skip without running compile-check (default: off).
- `--autopass-has-eval` / `--no-autopass-has-eval`: during resume/skip checks, if any eval artifact already exists, skip without running compile-check, regardless of grade (default: off).
- `--openrouter-base-url`: OpenRouter API base URL (default: `https://openrouter.ai/api/v1`).
- `--openrouter-api-key-env`: variable name for API key lookup (default: `AUTOLEAN_API`; also checks `~/.zshrc`).
- `--openrouter-timeout-s`: OpenRouter request timeout in seconds (default: 180).
- `--openrouter-max-retries`: retries for transient OpenRouter transport/HTTP failures (default: 2).
- `--openrouter-web-search` / `--no-openrouter-web-search`: enable/disable the OpenRouter web search plugin for model calls (default: disabled).
- `--openrouter-web-search-engine`: optional search engine name passed to the OpenRouter web plugin.
- `--openrouter-web-search-max-results`: optional max result count passed to the OpenRouter web plugin.
- `--use-codex-exec` / `--no-use-codex-exec`: use local `codex exec` for model calls (5.2/5.3/eval) instead of the OpenRouter API (default: OpenRouter API).
- `--codex-exec-sandbox`: sandbox passed to `codex exec` when enabled (`read-only`, `workspace-write`, `danger-full-access`; default: `read-only`). When `--use-codex-exec` is enabled, phase 5.3 coding is forced to `gpt-5.3-codex-spark` with `xhigh` reasoning effort; if that model is unavailable, Autolean retries with `gpt-5.3-codex`. OpenRouter-style `openai/...` model IDs are normalized before `codex exec` calls (for example `openai/gpt-5.2-codex` -> `gpt-5.2-codex`).
- `--compile-cmd`: compile command template; must include `{file}`.
- `--cwd`: working directory where compilation runs (your Lean project root with Mathlib configured).
- `--require-no-sorry`: reject outputs that contain `sorry` and retry. Incompatible with the default `--formalization-only`; use `--no-formalization-only` if you require no `sorry`.
- `--workers`: parallel workers (1 disables parallelism).
- `--live-logs`: stream compiler subprocess output to log files while running.
- `--force`: re-run even if a prior output already compiles.
Exit codes:

- `0` if all problems succeeded.
- `1` if any problem failed.
- Output theorem name: `problem_<sanitized filename>`.
- Output file path: `<output_dir>/problem_<sanitized filename>.lean` (uses the JSON filename stem).
- Sanitization replaces spaces, slashes, and punctuation with underscores, collapses repeats, and transliterates Chinese characters to pinyin (via `pypinyin`).
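The ASCII part of the sanitization rule can be approximated as below. This is only a sketch: the pinyin transliteration step via `pypinyin` is omitted (it needs the third-party package), and the real rule may differ in edge cases.

```python
import re

def sanitize_stem(stem: str) -> str:
    """Approximate the naming rule: spaces/slashes/punctuation become underscores,
    repeats collapse into one, and leading/trailing underscores are trimmed."""
    return re.sub(r"[^0-9A-Za-z]+", "_", stem).strip("_")

print("problem_" + sanitize_stem("set union/comm!"))  # problem_set_union_comm
```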
The generated Lean files always start with:

```lean
import Mathlib
namespace Formalizations
```

So the compile environment must have Mathlib configured. Use:

- `--cwd` to point at a Lean project that includes Mathlib, or
- `--compile-cmd` to target your own compile wrapper.
The compile command is split via `shlex` and run without a shell to avoid injection risks.
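A minimal sketch of this scheme, assuming a literal `{file}` substitution into the tokenized template (the helper name is hypothetical):

```python
import shlex

def build_compile_argv(template: str, file_path: str) -> list[str]:
    """Tokenize the template with shlex, then substitute {file}.
    The resulting argv is run via subprocess without shell=True,
    so shell metacharacters in filenames are never interpreted."""
    return [part.replace("{file}", file_path) for part in shlex.split(template)]

print(build_compile_argv("lake env lean {file}", "Formalizations/problem_x.lean"))
# ['lake', 'env', 'lean', 'Formalizations/problem_x.lean']
```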
Per problem and iteration, Autolean writes:

- Phase 5.2 thinking stdout/stderr logs (full model output on iteration 1; skipped marker on later iterations).
- Phase 5.3 coding stdout/stderr logs.
- Compiler stdout/stderr logs.
- Evaluation stdout/stderr logs and the parsed `A`-`D` payload JSON (on compile-success iterations).
- A metadata JSON file with prompt hash and return codes.
Logs live under the directory passed to `--logs` (default: `logs/`).

If an output `.lean` already exists and compiles cleanly (and passes the `--require-no-sorry` policy, if enabled), Autolean skips that problem. Use `--force` to re-run.
You can also skip without compile-check by enabling `--autopass-eval-a` when the prior eval grade is already `A`, or by enabling `--autopass-has-eval` when any prior eval artifact exists.
- Missing API key: set `AUTOLEAN_API` (or your chosen `--openrouter-api-key-env`) in the environment or `~/.zshrc`.
- `codex: command not found` when using `--use-codex-exec`: install the Codex CLI and ensure `codex` is on PATH.
- OpenRouter HTTP errors in phase 5.2: inspect `logs/*.thinking_stderr.log` and `logs/*.thinking_stdout.log`.
- OpenRouter HTTP errors in phase 5.3: inspect `logs/*.coding_stderr.log` and `logs/*.coding_stdout.log`.
- OpenRouter HTTP/parse errors in evaluation: inspect `logs/*.eval_stderr.log`, `logs/*.eval_stdout.log`, and `logs/*.eval.json`.
- Compiler fails with missing Mathlib: point `--cwd` to a project with Mathlib configured.
- Non-compiling outputs: inspect the corresponding `logs/*.compile_stderr.log` for details.
- Unicode path issues: use ASCII-friendly filenames or sanitize filenames upstream.
Run tests:

```shell
pytest -q
```

Lint/format:

```shell
ruff check .
ruff format .
```

- `src/autolean/`: core library + CLI.
- `scripts/`: batch utilities (compile check, evaluation, proof completion, splitting, histograms).
- `specs/`: requirements and engineering standards.
- This project does not ship a Lean environment; you must provide one.
- Generated outputs and logs are typically not committed.