A Lean 4 tactic that looks for counterexamples to your goal before you commit to proving it.
Design sketch. Nothing is implemented yet.
Pause at the current goal, generate concrete instances of its quantified
variables, and try to refute it. The tactic never closes a goal. It only
reports ok up to <bound>, counterexample: <term>, or
inconclusive: <reason>.
example (G : SimpleGraph (Fin n)) (h : G.IsConnected) : G.chromaticNumber ≤ 4 := by
sanity -- searches small n, returns "no counterexample up to n = 7"
sorryexample (n : ℕ) (h : n > 2) : ∀ a b c : ℕ, a^n + b^n ≠ c^n := by
sanity (bound := 50) -- fuzzes; finds nothing; that is information, not a proof
sorryexample (p q : Prop) : (p → q) → (¬q → ¬p) := by
sanity -- dispatches to SMT, reports valid
tauto-
Lean tactics. A free pre-tier of in-process decision procedures:
decideandnative_decidefor anything reducible to aDecidableinstance,norm_numfor ground numerical equalities and inequalities,linarithandpolyrithfor linear and polynomial arithmetic over ordered fields,positivityfor sign goals. No external round-trip, no parsing, no IPC. Always try these before paying any cross-process cost. -
Enumeration. For finite or finitely-bounded types, walk inhabitants up to a size bound and evaluate the conclusion using Lean's own reduction (
Lean.Meta.reduce,Lean.Meta.whnf) or the compiled code viaLean.Elab.Term.evalTerm. Cheapest possible refutation strategy: no translation, no foreign solver, justdecide-style brute force. Catches the embarrassing cases — off-by-one indices, swapped argument orders, identities that are false atn = 0— in milliseconds. -
Property-based testing. For non-finite or large types where exhaustive enumeration runs out fast, switch to randomised generators with shrinking: QuickCheck in the Haskell tradition, Hypothesis in Python. Strictly stronger than the bounded enumerator: the random sampler covers a much wider region of the input space than any size-
kenumeration, and the shrinker minimises whatever counterexample it finds, so the user sees[1, 0]rather than a 200-element list. -
SMT. For goals in a decidable theory — linear arithmetic over integers and rationals, bitvectors, uninterpreted functions, arrays, propositional logic — translate to SMT-LIB and shell out to
z3orcvc5. Parse the returned model back into Lean syntax: the user should seen = 3, xs = [0, 0, 1], not(define-fun n () Int 3) (define-fun xs () (Array Int Int) ...). The translator is the hard part; once it is honest, the solver does the work. -
Nonlinear arithmetic. Polynomial inequalities and equalities over the reals fall outside the linear-arithmetic fragment but are still decidable via real-closed-field decision procedures. Z3 NLSAT is the most accessible option; MathSAT is a strong alternative; QEPCAD implements cylindrical algebraic decomposition for the harder cases; RedLog handles quantifier elimination over real-closed and discretely-valued fields.
-
Interval numerics. Analytic inequalities (claims about
sin,exp,log, definite integrals) are out of reach of every symbolic procedure but trivially refutable by evaluating at sampled points using rigorous interval arithmetic. Arb and FLINT provide ball arithmetic with certified error bounds; mpmath is the Python equivalent. When the interval evaluation excludes the claimed bound, you have a sound refutation, not a heuristic one. -
First-order ATPs. When the goal is logical but does not fit any decidable fragment — typed first-order logic with quantifier alternation, equality, axiom schemata — hand it to a saturation-based prover: Vampire, E, or iProver. They speak a common input format (TPTP), so one translator targets all three. Useful as a "Sledgehammer-lite" oracle for goals SMT cannot encode.
-
Computer algebra. Closed-form claims about sums, integrals, identities, special functions, and ODE solutions are the bread and butter of a CAS: SageMath, Mathematica, or Maple. SageMath is the natural unified front-end — under one Python interface it bundles GAP (finite groups, characters, permutation actions), Singular (commutative algebra, Gröbner bases), Macaulay2 (algebraic geometry), and PARI/GP (number theory, elliptic curves, modular forms). One adapter, most of the algebra-system territory.
-
Graphs. Brute enumeration of
SimpleGraph (Fin n)is grossly wasteful — the number of graphs grows like2^(n choose 2), but the number of isomorphism classes grows much more slowly. Nauty'sgengproduces one representative per isomorphism class up ton ≤ 10exhaustively;genrangsamples uniformly at largern. Same machinery coversSimpleGraph V,Multigraph, andDigraphwith appropriate flags. -
Constraint solvers. For combinatorial existence statements that are not naturally graph-shaped — Latin squares, combinatorial designs, schedulings, set systems — encode the problem in a high-level constraint language and let a solver enumerate or refute models. MiniZinc is the standard front-end; Google's OR-Tools ships CP-SAT and is extremely fast on industrial problem sizes.
-
Database lookup. Some counterexamples already have names, and the cheapest possible "search" is a query against a curated list. π-Base catalogues topological spaces and their properties; House of Graphs catalogues interesting graphs; GAP's SmallGroups library enumerates every finite group up to order 2000; OEIS catalogues integer sequences. When this tier hits, it returns a named counterexample (the Sorgenfrey line, the Petersen graph, the Klein four-group) — vastly more useful to a human reader than a raw model.
Cost profiles differ wildly; don't treat backends as interchangeable. Fuzzing is shallow and free, SMT is a heavy hammer that succeeds or fails quickly, Nauty is the right answer for one shape of question and useless for everything else.
structure SanityProblem where
binders : Array (Name × Expr) -- ∀-bound variables and their types
hypotheses : Array Expr -- premises (also evaluated)
goal : Expr -- conclusion to refute
bounds : Bounds -- per-type size limits
budget : Budget -- time / memory capsBounds is where user-facing knobs live: bound := 50, graphSize := 8,
smtTimeout := 2s. Defaults should be aggressive enough that by sanity
returns in under a second on a typical goal.
Bool,Fin n,Nat(bounded). Smoke test.Prop/ decidable propositions. ReuseDecidable.Int,Rat, linear arithmetic. First SMT.List α,Array α,Vector α n.SimpleGraph V. First nauty integration.BitVec n. SMT bitvector logic.- User-defined inductives. Derive
SmallEnum αfrom constructors. - Functions
α → βwith finiteα. Tabulate.
Skip for v1: real numbers, quotient types, anything classical.
class SanityBackend (β : Type) where
name : String
applies : SanityProblem → MetaM Bool
run : SanityProblem → MetaM SanityResultBackends register through an attribute-driven table; see
Lean.Meta.registerSimp* for the pattern. The frontend asks each registered
backend applies?, runs the cheap ones first, and stops on the first
definitive answer. This is the seam that lets a user drop in a backend for
their own domain without forking the tactic.
| [deMoura&Ullrich21] | L. de Moura and S. Ullrich. The Lean 4 Theorem Prover and Programming Language. In: A. Platzer, G. Sutcliffe (eds.), Automated Deduction – CADE 28, Lecture Notes in Computer Science, vol. 12699, pp. 625–635. Springer, Cham, 2021. |
| [Ayers+22] | E. Ayers, W. Nawrocki, J. Limperg, H. Böving, et al. (eds.). Metaprogramming in Lean 4. Online community textbook, Lean Prover Community, 2022–present. https://leanprover-community.github.io/lean4-metaprogramming-book. Accessed April 2026. |
| [Ullrich23] | S. Ullrich. An Extensible Theorem Proving Frontend: Designing the Lean Elaborator. Doctoral dissertation, Karlsruher Institut für Technologie (KIT), Department of Informatics, Karlsruhe, Germany, 2023. |
| [mathlib20] | The mathlib Community. The Lean Mathematical Library. In: J. Blanchette, C. Hriţcu (eds.), Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP 2020), New Orleans, LA, USA, January 20–21, 2020, pp. 367–381. Association for Computing Machinery, New York, NY, USA, 2020. ISBN 978-1-4503-7097-4. |
| [lean-smt] | UFMG SMITE Lab and contributors. lean-smt: SMT-LIB Integration for Lean 4. Open-source software repository, Universidade Federal de Minas Gerais, 2022–present. https://github.com/ufmg-smite/lean-smt. Accessed April 2026. |
| [deMoura&Bjørner08] | L. de Moura and N. Bjørner. Z3: An Efficient SMT Solver. In: C.R. Ramakrishnan, J. Rehof (eds.), Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2008), Lecture Notes in Computer Science, vol. 4963, pp. 337–340. Springer, Berlin, Heidelberg, 2008. |
| [Barbosa+22] | H. Barbosa, C. Barrett, M. Brain, G. Kremer, H. Lachnitt, M. Mann, A. Mohamed, M. Mohamed, A. Niemetz, A. Nötzli, A. Ozdemir, M. Preiner, A. Reynolds, Y. Sheng, C. Tinelli, Y. Zohar. cvc5: A Versatile and Industrial-Strength SMT Solver. In: D. Fisman, G. Rosu (eds.), Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2022), Lecture Notes in Computer Science, vol. 13243, pp. 415–442. Springer, Cham, 2022. |
| [Jovanović&deMoura12] | D. Jovanović and L. de Moura. Solving Non-linear Arithmetic. In: B. Gramlich, D. Miller, U. Sattler (eds.), Automated Reasoning (IJCAR 2012), Lecture Notes in Computer Science, vol. 7364, pp. 339–354. Springer, Berlin, Heidelberg, 2012. |
| [Cimatti+13] | A. Cimatti, A. Griggio, B.J. Schaafsma, R. Sebastiani. The MathSAT5 SMT Solver. In: N. Piterman, S.A. Smolka (eds.), Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2013), Lecture Notes in Computer Science, vol. 7795, pp. 93–107. Springer, Berlin, Heidelberg, 2013. |
| [Kovács&Voronkov13] | L. Kovács and A. Voronkov. First-Order Theorem Proving and Vampire. In: N. Sharygina, H. Veith (eds.), Computer Aided Verification (CAV 2013), Lecture Notes in Computer Science, vol. 8044, pp. 1–35. Springer, Berlin, Heidelberg, 2013. |
| [Schulz13] | S. Schulz. System Description: E 1.8. In: K. McMillan, A. Middeldorp, A. Voronkov (eds.), Logic for Programming, Artificial Intelligence, and Reasoning (LPAR-19), Lecture Notes in Computer Science, vol. 8312, pp. 735–743. Springer, Berlin, Heidelberg, 2013. |
| [Sutcliffe17] | G. Sutcliffe. The TPTP Problem Library and Associated Infrastructure: From CNF to TH0, TPTP v6.4.0. Journal of Automated Reasoning, 59(4):483–502, 2017. Springer Netherlands, Dordrecht. ISSN 0168-7433. |
| [Claessen&Hughes00] | K. Claessen and J. Hughes. QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs. In: Proceedings of the 5th ACM SIGPLAN International Conference on Functional Programming (ICFP '00), Montreal, Canada, September 18–20, 2000, ACM SIGPLAN Notices, 35(9):268–279. Association for Computing Machinery, New York, NY, USA, 2000. ISBN 1-58113-202-6. |
| [MacIver+19] | D.R. MacIver, Z. Hatfield-Dodds, and many contributors. Hypothesis: A New Approach to Property-Based Testing. Journal of Open Source Software, 4(43):1891, 2019. The Open Journal. ISSN 2475-9066. |
| [McKay&Piperno14] | B.D. McKay and A. Piperno. Practical Graph Isomorphism, II. Journal of Symbolic Computation, 60:94–112, 2014. Elsevier (Academic Press), London. ISSN 0747-7171. |
| [Brinkmann+13] | G. Brinkmann, K. Coolsaet, J. Goedgebeur, H. Mélot. House of Graphs: A Database of Interesting Graphs. Discrete Applied Mathematics, 161(1–2):311–314, 2013. Elsevier (North-Holland), Amsterdam. ISSN 0166-218X. |
| [piBase] | J. Dabbs and contributors. π-Base: A Community Database of Topological Spaces. Online database, 2014–present. https://topology.pi-base.org. Accessed April 2026. |
| [OEIS] | N.J.A. Sloane (founder and editor-in-chief). The On-Line Encyclopedia of Integer Sequences® (OEIS®). Published electronically by the OEIS Foundation Inc., Highland Park, NJ, USA, 1964–present. https://oeis.org. Accessed April 2026. |
| [Sage] | The Sage Developers. SageMath, the Sage Mathematics Software System (Version 10.4). Manual and software distribution, SageMath project, 2024. https://www.sagemath.org. Distributed under the GNU General Public License v3. |
| [GAP] | The GAP Group. GAP — Groups, Algorithms, and Programming, Version 4.13.0. Manual and software distribution, 2024. https://www.gap-system.org. Distributed under the GNU General Public License v2. |
| [Singular] | W. Decker, G.-M. Greuel, G. Pfister, H. Schönemann. Singular 4-3-2 — A Computer Algebra System for Polynomial Computations. Manual and software distribution, University of Kaiserslautern (Rheinland-Pfälzische Technische Universität Kaiserslautern–Landau), Department of Mathematics, Kaiserslautern, Germany, 2023. https://www.singular.uni-kl.de. |
| [PARI/GP] | The PARI Group. PARI/GP, Version 2.15.5. Manual and software distribution, Université de Bordeaux, Institut de Mathématiques de Bordeaux, Bordeaux, France, 2024. https://pari.math.u-bordeaux.fr. |