[codex] Add quotient search formalization, windowed probes, and paper draft#11
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces two new Lean modules, FiniteQuotientSearch.lean and QuotientConstraints.lean, which formalize version-space reduction under a quotient contract and the structural consequences of quotient/factorization contracts, respectively. The root module, verification files, and documentation are updated to integrate these new layers. The review feedback focuses on simplifying the formalization, with actionable suggestions to remove unnecessary typeclass constraints, eliminate redundant classical tactics, and streamline proofs using term-style definitions or definitional equalities.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Review Summary by QodoAdd quotient-search formalization, windowed probes, and comprehensive paper
WalkthroughsDescription• Add comprehensive quotient-search formalization layer with 11 new Lean modules covering fiber invariance, version-space reduction, observation windows, and profile stability • Introduce QuotientConstraints.lean for generic fiber-invariance and non-identifiability consequences of quotient contracts • Introduce FiniteQuotientSearch.lean for version-space reduction under quotient contracts: sample consistency, collision falsification, bucket agreement, and refinement • Introduce FiniteObservationWindow.lean for generic lossy-probe families and finite prefix windows with target-invariance and separating-window existence theorem • Introduce ProfileStability.lean to package finite score certificates for encoder distortion and quotient geometry, composing them into margin-based admission and kernel-containment theorems • Add supporting modules: FiniteQuotientImage.lean, QuotientKernel.lean, FiniteFiberTopology.lean, FinitePairQuotient.lean, QuotientRefinementKernel.lean, ScoreMarginQuotient.lean • Add comprehensive formalization paper under docs/paper/ with TeX source, rendered PDF, plain-text extraction, and README documenting finite quotient sufficiency, search constraints, and profile stability • Wire new modules into root imports, verification checks, theorem map, proof spine, reviewer brief, RAG guide, and README • Preserve finite, conditional boundaries: no claims about manifest schemas, real-encoder theorems, or global probe-family sufficiency without finite evidence Diagramflowchart LR
QC["QuotientConstraints<br/>fiber-invariance"]
FQI["FiniteQuotientImage<br/>reachable image"]
QK["QuotientKernel<br/>kernel containment"]
FT["FiniteFiberTopology<br/>fiber structure"]
FQS["FiniteQuotientSearch<br/>version-space reduction"]
FPQ["FinitePairQuotient<br/>product quotients"]
QRK["QuotientRefinementKernel<br/>refinement via kernels"]
FOW["FiniteObservationWindow<br/>target-separating windows"]
SMQ["ScoreMarginQuotient<br/>threshold margins"]
PS["ProfileStability<br/>composed certificates"]
QC --> FQI
QC --> QK
QC --> FT
FQI --> FQS
QK --> FQS
FT --> FQS
FQS --> FPQ
QK --> QRK
FOW --> SMQ
SMQ --> PS
FQS --> PS
QRK --> PS
File Changes1. OrdvecFormalization/FiniteQuotientSearch.lean
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fdc50b1373
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "Codex (@codex) review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".
There was a problem hiding this comment.
Pull request overview
This PR extends the OrdvecFormalization codebase with a new “quotient search / reachable image / kernel / windowed probes / profile stability” layer, and updates the public verification surface and documentation (including a LaTeX paper draft) to reflect the expanded theorem surface.
Changes:
- Add new Lean modules for quotient search constraints, reachable-image/kernel reasoning, finite fiber topology, pair quotients, refinement-by-kernel, observation windows, and score-margin/profile-stability theorems.
- Wire the new modules into the root import and
Verify.leanpublic API/axiom auditing dashboard. - Update repo docs (README, theorem map, proof spine, reviewer brief, RAG guide) and add a
docs/paper/directory with the paper source + extracted text.
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Updates repository claims/boundaries to include quotient-search and windowed-probe layers. |
| OrdvecFormalization.lean | Imports the newly added quotient/window/profile modules into the root module. |
| OrdvecFormalization/Verify.lean | Extends the public API verification dashboard with #check/#print axioms for the new layers. |
| OrdvecFormalization/FiniteQuotientSearch.lean | Adds finite sample consistency, falsifiers, bucket-level rule extension, counting, and refinement/product lemmas. |
| OrdvecFormalization/QuotientConstraints.lean | Adds generic fiber-invariance and non-identifiability consequences of quotient factorization. |
| OrdvecFormalization/FiniteQuotientImage.lean | Introduces reachable-image quotients and basic fiber/bucket vocabulary. |
| OrdvecFormalization/QuotientKernel.lean | Defines kernels/target-kernels and proves kernel containment ↔ image-quotient factorization. |
| OrdvecFormalization/FiniteFiberTopology.lean | Adds sample fiber “clump” theorems, collision/injectivity characterizations, and counting identities. |
| OrdvecFormalization/FinitePairQuotient.lean | Adds product quotients for query/document style pair reasoning and sample falsifier. |
| OrdvecFormalization/QuotientRefinementKernel.lean | Connects quotient refinement to kernel inclusion (with surjectivity/image-quotient specializations). |
| OrdvecFormalization/FiniteObservationWindow.lean | Adds observation-family/window joint-code API, target-invariance characterizations, refinement monotonicity, and finite separating-window existence results. |
| OrdvecFormalization/ScoreMarginQuotient.lean | Adds score-approximation + margin-stability theorems for threshold admission and image-quotient factorization. |
| OrdvecFormalization/ProfileStability.lean | Packages distortion/geometry “certificates” and composes them into threshold equality/factorization/kernel containment. |
| docs/theorem-map.md | Expands theorem map to include quotient-search/window/profile layers and updates public-name list. |
| docs/proof-spine.md | Updates proof narrative/dependency spine to incorporate the new layers. |
| docs/reviewer-brief.md | Updates reviewer guidance to mention the new modules and empirical falsifier checks. |
| docs/rag-pipeline-guide.md | Adds joint-code (“multi-probe”) guidance and falsifier framing for practical auditing. |
| docs/paper/README.md | Adds build instructions and artifact list for the formalization paper directory. |
| docs/paper/ordvec_formalization_paper.tex | Adds the LaTeX paper draft capturing the checked theorem surface and boundaries. |
| docs/paper/ordvec_formalization_paper.txt | Adds extracted text for review/search of the paper content. |
| docs/paper/.gitignore | Ignores LaTeX intermediate build artifacts in docs/paper/. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Review-response update pushed in What changed:
Validation run locally:
|
Summary
QuotientConstraints.leanfor generic fiber-invariance and non-identifiability consequences of quotient/factorization contracts.FiniteQuotientSearch.leanfor version-space reduction under quotient contracts: sample consistency, collision falsification, observed-bucket agreement, unobserved-bucket extension, counting facts, refinement, and product targets.FiniteObservationWindow.leanfor generic lossy-probe families and finite prefix windows: joint-code kernels, target invariance, image-quotient factorization, longer-window refinement, finite same-window-code falsifiers, coincidence length, and the finite target-aware separating-window existence theorem.ProfileStability.leanto package supplied finite score certificates for encoder distortion and quotient geometry, then compose them into margin-based admission equality, image-factorization, and kernel-containment theorems.docs/paper/, including the TeX source, rendered PDF, extracted text, README, and local ignore rules for LaTeX intermediates.Closes #13.
Boundary
The profile layer is certificate-based: it assumes finite score-within and quotient-factorization witnesses and proves what follows from those witnesses. It does not claim a manifest schema, empirical estimation procedure, global real-encoder theorem, or surjectivity onto the full theoretical quotient codomain.
The window layer is generic kernel calculus for finite families of observations. It proves that same-joint-code target disagreement falsifies a window-factorized target, and that a finite target-sufficient window exists exactly when every target-disagreeing pair is eventually separated by some coordinate. It does not claim any concrete production probe family is globally sufficient without the finite target-invariance evidence.
The paper is a narrative extraction of the checked theorem surface and its empirical-search consequences. It preserves the finite, conditional boundary: it does not claim that real encoders satisfy the quotient contract, estimate distortion, or prove contrastive training implies quotient invariance.
Citation Notes
The paper cites verified metadata for Fisher (1922), Halmos--Savage (1949), Wald (1950), Blackwell (1953), Karlin--Rubin (1956), Devroye--Györfi--Lugosi (1996), de Moura--Ullrich (2021), and the mathlib Community (2020). Wald's monograph is cited without a DOI rather than reusing a review DOI.
Validation
Latest validation after the windowed-observation layer:
make buildmake verifymake check-doc-namesmake auditmake lintlatexmk -pdf -interaction=nonstopmode -halt-on-error ordvec_formalization_paper.texpdftotext ordvec_formalization_paper.pdf ordvec_formalization_paper.txtgit diff --check