Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
117 changes: 64 additions & 53 deletions .cursor/rules/api-key-controls.mdc

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions .cursor/rules/developer-mode-gates.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
---
description: Developer mode gating for hidden frontend controls
alwaysApply: false
---

# Developer Mode Gates

Developer mode is a hidden frontend switch toggled by pressing `Shift + Z + X`. The state is stored in localStorage as `developerModeSettingsEnabled` and should stay centralized through `developerModeEnabled` props or the same storage key.

Known gated surfaces:
- `LeanOJ Proof Solver` appears in the mode selector only while developer mode is enabled. If developer mode is turned off while `appMode === 'leanoj'`, the app should return to autonomous mode.
- Supercharge checkboxes in Aggregator, Compiler, Autonomous, LeanOJ, and related role settings are developer-only. Start/request payloads must force `supercharge_enabled` false unless developer mode is currently enabled.
- Creativity Emphasis Boost checkboxes beside Aggregator, Autonomous, and LeanOJ start controls are developer-only for now. Start/request payloads must force `creativity_emphasis_boost_enabled=false` unless developer mode is currently enabled; keep this gate isolated so the feature can later move to standard mode cleanly.
- Raw JSON settings editors and developer-only model/settings toggles are developer-only and should close or hide when developer mode is disabled.
- Boost copy may mention Supercharge only when developer mode is enabled.

When adding a new hidden or experimental frontend control, gate both the visible UI and any request/runtime payload field behind developer mode. Do not rely on hiding the checkbox alone.
69 changes: 37 additions & 32 deletions .cursor/rules/hosted-web-contract.mdc

Large diffs are not rendered by default.

165 changes: 84 additions & 81 deletions .cursor/rules/json-prompt-design.mdc

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions .cursor/rules/latex-renderer.mdc
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
---
description: LaTeX renderer security, KaTeX parsing, and large-document rendering requirements
alwaysApply: false
---

Expand Down Expand Up @@ -76,16 +77,15 @@ Dual rendering: **Rendered LaTeX View** (KaTeX math, dark theme on screen, white

## Rendering Pipeline (CRITICAL ORDER)

Must execute in this exact order in `renderLatexToHtml()`:
The raw conversion inside `renderLatexToHtml()` may include targeted preprocessing repairs between these anchors. Preserve these order constraints rather than treating every intermediate fix as a rule-level invariant:

1. **`decodeHtmlEntities()`** — FIRST
2. **`autoWrapMath()`** — Auto-wrap unwrapped math
3. **`processTheoremEnvironments()`** — TikZ handling happens HERE (all three patterns: `\[...\]`, `$$...$$`, standalone)
4. **`replaceSectionCommand()`** — Section headers
5. **Text formatting, citations, footnotes, lists, tables, QED symbols**
6. **KaTeX rendering** via `renderKatexSafely()` — `maxExpand: 5000`, skips HTML placeholder content
7. **Line breaks/horizontal rules** (`\\` → `<br/>`, `\hrule` → `<hr/>`) — AFTER KaTeX
8. **DOMPurify sanitization** — LAST
2. Apply renderer-specific text/LaTeX repairs and auto-wrap unwrapped math before theorem/section/text formatting
3. **`processTheoremEnvironments()`** — TikZ handling happens before KaTeX (all three patterns: `\[...\]`, `$$...$$`, standalone)
4. **KaTeX rendering** via `renderKatexSafely()` — `maxExpand: 5000`, skips HTML placeholder content
5. **Line breaks/horizontal rules** (`\\` → `<br/>`, `\hrule` → `<hr/>`) — AFTER KaTeX

DOMPurify sanitization is intentionally outside `renderLatexToHtml()` and must run immediately after it at every HTML sink (`renderedHtmlSmall`, `RenderedChunk.renderedHtml`, and PDF helper rendering).

**Critical:** `\\` line break conversion MUST be after KaTeX (valid syntax in `aligned`, `matrix`, etc.)

Expand All @@ -99,7 +99,7 @@ Must execute in this exact order in `renderLatexToHtml()`:

**Auto-threshold**: Documents >50K chars (`LARGE_DOC_THRESHOLD`) auto-default to raw mode with a banner offering to switch to rendered view.

**Invariant**: Each chunk independently runs the full `renderLatexToHtml()` `DOMPurify.sanitize()` pipeline. The pipeline order within each chunk is identical to the single-document path.
**Invariant**: Each chunk independently runs `renderLatexToHtml()` and then `DOMPurify.sanitize()`. Chunked and single-document paths must share the same conversion helpers and ordering anchors.

---

Expand All @@ -111,7 +111,7 @@ Must execute in this exact order in `renderLatexToHtml()`:

**`sanitizeFilename()`**: Remove special chars, underscores for whitespace, truncate to 100 chars.

**Backend PDF route**: `POST /api/download/pdf` — accepts `{html_body, title, word_count, date, models, outline, filename}`. Builds a standalone HTML document (KaTeX + LatexRenderer CSS both inlined from local filesystem + PDF print overrides that convert dark theme to light). `wait_until="load"` (no external requests). Runs `sync_playwright()` in `asyncio.get_running_loop().run_in_executor`. Returns `Response(content=pdf_bytes, media_type="application/pdf")`.
**Backend PDF route**: `POST /api/download/pdf` — default/desktop mode accepts `{html_body, title, word_count, date, models, outline, filename}`. Builds a standalone HTML document (KaTeX + LatexRenderer CSS both inlined from local filesystem + PDF print overrides that convert dark theme to light). `wait_until="load"` (no external requests). Runs `sync_playwright()` in `asyncio.get_running_loop().run_in_executor`. Returns `Response(content=pdf_bytes, media_type="application/pdf")`. Generic mode returns `501`; current clients surface that backend unavailability and raw export remains available.

**Playwright install**: `python -m playwright install chromium` — runs automatically in both launcher scripts after `pip install -r requirements.txt`. One-time ~150MB download of bundled Chromium (no system Chrome/Chromium required). Failure is non-fatal (warning shown, startup continues).

Expand Down Expand Up @@ -179,7 +179,7 @@ When a frontend storage prefix is active for multi-instance shared-origin use, t
5. TikZ environments MUST be pre-processed — KaTeX cannot render them; remove surrounding math delimiters (all three patterns)
6. KaTeX maxExpand MUST be 5000
7. Line break conversion MUST happen AFTER KaTeX — prevents SVG corruption
8. Rendering pipeline order MUST be preserved (same order in each chunk)
8. Rendering pipeline ordering anchors MUST be preserved (same helpers/order in each chunk)
9. Raw text view never processes HTML — plain text only
10. PDF generation captures sanitized content
11. Chunks MUST split at safe boundaries (section headers, double-newlines) — never mid-math-environment
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
---
description: Core code-change, rule-update, workflow-mode, and proof-runtime invariants
alwaysApply: true
---

Expand All @@ -16,21 +17,24 @@ alwaysApply: true

6.) For config/preset files with repeated literal values, never patch by replacing a shared literal alone. Anchor edits to the exact object/block being changed and verify the diff only touches the intended target.

7.) Any REST shape, auth contract, WebSocket event, or `/api/features` capability change that affects the web wrapper must update **code, the relevant rule(s), and `api_contract_version` in `/api/features`** in the same approved merge. The live backend's `GET /openapi.json` is the machine-readable REST schema contract.
7.) Any REST shape, auth contract, `/api/features` capability, or web-consumed WebSocket contract change must update **code, the relevant rule(s), and `api_contract_version` in `/api/features`** in the same approved merge. The live backend's `GET /openapi.json` is the machine-readable REST schema contract.

8.) Only ONE workflow mode may be active at a time (Aggregator, Compiler, Autonomous Research, or LeanOJ Proof Solver). This constraint applies identically in both default mode and generic mode. Start conflict checks must be serialized and include pending/background-task activity flags such as `autonomous_coordinator.is_active`, not only persisted `state.is_running` fields.

9.) Lean 4 and SMT features are always gated on `lean4_enabled`, `lean4_lsp_enabled`, and `smt_enabled` runtime flags. All three default false, must stay silent and side-effect-free when disabled, and must never ship Lean or Z3 toolchains or Python wheels into `requirements-generic.txt`, `Dockerfile`, or `docker/entrypoint.sh` (hosted image stays Lean-free and Z3-free). Lean 4 is authoritative formal checking for every stored proof and is necessary for LeanOJ final solutions; SMT contributes hints only. Z3 executable paths are trusted startup/operator configuration only, must be rejected as runtime API input, and must resolve to a `z3`/`z3.exe` executable. Automated proof candidates must directly serve the user prompt, not merely be non-trivial or novel.
8b.) Autonomous Research and Single Paper Writer expose run-level Allowed Outputs (`allow_mathematical_proofs`, `allow_research_papers`); at least one must be true. Both true preserves existing workflow behavior. The Mathematical Proofs checkbox is the user-facing Lean proof-output enable path and must either sync/enable the runtime proof setting or the backend must reject proof-only/proof-requested starts when Lean is unavailable. Disabling papers must not disable brainstorming itself; proof-only autonomous runs must not silently become brainstorm-only loops and must reset durable workflow state to the next topic/exploration boundary after proof work instead of leaving `pre_paper_compilation`. Disabling proofs must skip proof-output work without affecting developer-only creativity boost behavior.

9.) Lean 4 and SMT features are gated by runtime flags: `lean4_enabled` gates Lean proof execution/model proof work, `lean4_lsp_enabled` only gates the optional persistent LSP optimization (subprocess Lean must still work when it is false), and `smt_enabled` gates Z3/SMT hint generation. All three default false; when disabled they must not invoke their corresponding toolchains, spend proof-model calls, or block workflows, and must never ship Lean or Z3 toolchains or Python wheels into `requirements-generic.txt`, `Dockerfile`, or `docker/entrypoint.sh` (hosted image stays Lean-free and Z3-free). Lean 4 is authoritative formal checking for every stored proof and is necessary for LeanOJ final solutions; SMT contributes hints only, and only valid `unsat` SMT checks become suggested Lean tactics. Z3 executable paths are trusted startup/operator configuration only, must be rejected as runtime API input, and must resolve to a `z3`/`z3.exe` executable. Automated proof candidates should directly serve the user prompt and be novelty-first before Lean cost, but once Lean accepts real proof code it must be preserved and novelty-ranked under the actual proved statement; statement mismatch downshifts storage instead of discarding. Do not spend proof attempts building a general known-knowledge base of routine helpers, standard Mathlib/textbook facts, or merely non-trivial background lemmas. LeanOJ final master-proof edits may use standard facts inline to solve the template, but must not accumulate a separate known-knowledge library.

10.) LeanOJ initial topic generation and brainstorm submitters always run in parallel and feed one validator that batch-validates up to 3 topics/submissions. Initial topic candidates/selection must be broad locked foundation questions covering the whole LeanOJ solution route, not narrow sublemma/tactic/local-repair topics. Recursive brainstorming has no separate recursive-topic prepass and must not re-inject the initial selected topic as active steering context; it uses the shared accepted proof-memory database plus the current proof/failure context. Accepted brainstorm memory must preserve occurrence-specific chronological metadata even for duplicate idea text. Never implement active LeanOJ topic or brainstorm phases as round-robin/serial submitter calls; one hung submitter must not halt the phase.
10b.) Developer-enabled LeanOJ Creativity Emphasis Boost applies to every fifth valid queued initial-topic and brainstorm submission per submitter. It only adds optional near-solution/adjacent-solution creativity pressure when apparent and potentially very helpful; validation remains unchanged, accepted/rejected WebSocket payloads mark `creativity_emphasized`, and the block is skipped for that slot if it would overflow the configured prompt budget.

11.) LeanOJ stop/crash/restart is resumable by default. `Clear Progress` / `/api/leanoj/clear?confirm=true` is the only intentional reset path. Start/restart should choose the best matching/resumable persisted session by proof context, not blindly create a new session or pick the latest file.
11.) LeanOJ stop/crash/restart preserves resumable state by default. `Clear Progress` / `/api/leanoj/clear?confirm=true` is the only intentional reset path. Start/restart should choose the best matching/resumable persisted session by proof context, not blindly create a new session or pick the latest file; automatic model-work restart on backend boot is opt-in via runtime config.

12.) LeanOJ OpenRouter credit exhaustion or no-fallback provider configuration errors are non-retryable pauses, not proof-attempt failures. Do not let API credit/config failures inflate final proof attempt loops.
12.) LeanOJ and autonomous proof-check recoverable provider-credit exhaustion should preserve workflow checkpoints and pause rather than become proof-attempt failures. Hard configuration/privacy/missing-key errors should fail visibly with a user-repair path instead of inflating proof attempt loops. `Retry OpenRouter` / `/api/openrouter/reset-exhaustion` wakes currently waiting in-process credit pauses after credits are restored; stopped/restarted runs resume through their persisted LeanOJ/proof checkpoint state.

13.) LeanOJ/RALPH final-proof loop checkpoints may only be user-configurable feedback checkpoints, not hidden loop shutdowns. The durable `master_proof.lean` is the authoritative working draft, and every accepted master-proof edit must pass an in-memory Lean gate before persistence: `needs_more_time=true` runs Lean with `sorry`/`admit` placeholders allowed but still requires parse/typecheck, template preservation, and no fake proof devices; `needs_more_time=false` runs Lean placeholder-free and then final semantic review against the user prompt/template before the run stops as verified. Final-proof mode is edit-only: it must not be offered, shown, or taught `stuck_needs_brainstorm`, raw `need_more_brainstorming`, failed-attempt counts, or any path transition. It may see the most recent 5 final attempts as compact execution feedback (Lean errors, stale edit rejections, JSON truncation, watchdog/no-progress notices) so it can avoid repeating failed edits. Lean/template rejection, semantic-review rejection, conservative no-progress/stale-edit watchdog feedback, and validator rejection of non-progressive shortening edits must preserve the master proof and persist structured continuation feedback; non-user-forced no-progress handoffs should gather recursive brainstorm context before re-entering final mode.
13.) LeanOJ/RALPH final-proof loop checkpoints may be user-configurable feedback checkpoints or conservative no-progress/stale-edit watchdog handoffs; they must not mark success or discard the durable draft. LeanOJ start requests expose configurable phase caps (`max_initial_brainstorm_accepts`, `max_recursive_brainstorm_accepts`, `final_attempts_per_cycle`); `final_attempts_per_cycle` bounds failed final verification/edit attempts before the next path decision/handoff, so accepted `needs_more_time=true` edits can extend a cycle while they keep passing the intermediate Lean gate. The durable `master_proof.lean` is the authoritative working draft, and every accepted master-proof edit must pass an in-memory Lean gate before persistence: `needs_more_time=true` runs Lean with `sorry`/`admit` placeholders allowed but still requires parse/typecheck, template preservation, and no fake proof devices; `needs_more_time=false` runs Lean placeholder-free and then final semantic review against the user prompt/template before the run stops as verified. Final-proof mode is edit-only: it must not be offered, shown, or taught `stuck_needs_brainstorm`, raw `need_more_brainstorming`, failed-attempt counts, or any path transition. It may see the most recent 5 final attempts as compact execution feedback (Lean errors, stale edit rejections, JSON truncation, watchdog/no-progress notices) so it can avoid repeating failed edits. Lean/template rejection, semantic-review rejection, conservative no-progress/stale-edit watchdog feedback, and validator rejection of non-progressive shortening edits must preserve the master proof and persist structured continuation feedback; non-user-forced no-progress handoffs should gather recursive brainstorm context before re-entering final mode.

14.) LeanOJ/RALPH final verification must remain placeholder-free, but Lean-accepted scaffolds containing `sorry`/`admit` should be saved as partial proofs for future context. Partial proofs are citeable incomplete references only; never count them as verified solutions and never accept fake `axiom`/`constant`/`opaque` proof devices.
14.) LeanOJ/RALPH final verification must remain placeholder-free, but Lean-accepted scaffolds containing `sorry`/`admit` and Lean-accepted non-final-ready code should be saved as partial/supporting proofs for future context. Partial/scaffold checks may use subprocess fallback even when LSP mode is enabled. Partial proofs are citeable incomplete references only; never count them as final verified solutions and never accept fake `axiom`/`constant`/`opaque` proof devices. LeanOJ proof-gated brainstorm `lean_proof` submissions are preserved once Lean and integrity checks pass; brainstorm validation can classify usefulness/context role but should not veto the verified artifact.

15.) Parent/user-selected phases have hierarchy precedence over child branches. When a parent phase starts (LeanOJ forced final loop, autonomous paper writing, Tier 3 final answer/final selection), lower-tier brainstorm/topic/path child tasks must stop or be ignored. LeanOJ `Skip Brainstorm` locks the run into the final loop until the configured final-attempt cycle is exhausted; model/path requests for more brainstorming cannot override that user action early. `Force Brainstorm` is a separate explicit user override that returns to recursive brainstorming while preserving proof progress.

Expand Down
Loading
Loading