[v2] Add configurable compaction retention policy #2407
Adds KeepPolicy bitmask (KeepErrors, KeepUserMarked) to control which messages survive compaction. Configurable via keep, compact_ratio, and recent_keep in reasonix.toml.
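A minimal sketch of how such a bitmask might look, for reviewers who want something concrete to compare against. Only the names KeepPolicy, KeepErrors, and KeepUserMarked come from this PR; the underlying integer type, the Has and ParseKeepPolicy helpers, and the flag strings "errors" and "user_marked" are assumptions made for illustration and may not match the actual code or the values accepted by keep in reasonix.toml.

```go
package main

import (
	"fmt"
	"strings"
)

// KeepPolicy is a bitmask of message classes that survive compaction.
// The uint8 width is an assumption; the PR only says it is a bitmask.
type KeepPolicy uint8

const (
	KeepErrors     KeepPolicy = 1 << iota // keep messages that record errors
	KeepUserMarked                        // keep messages the user marked
)

// Has reports whether every flag in f is set in p (hypothetical helper).
func (p KeepPolicy) Has(f KeepPolicy) bool { return p&f == f }

// ParseKeepPolicy builds a mask from flag names, for example the entries of
// a keep list in a config file. The accepted strings are hypothetical.
func ParseKeepPolicy(names []string) (KeepPolicy, error) {
	var p KeepPolicy
	for _, n := range names {
		switch strings.ToLower(strings.TrimSpace(n)) {
		case "errors":
			p |= KeepErrors
		case "user_marked":
			p |= KeepUserMarked
		default:
			return 0, fmt.Errorf("unknown keep flag %q", n)
		}
	}
	return p, nil
}
```

Rejecting unknown names instead of ignoring them is a design choice in this sketch: a typo in the config then surfaces at load time rather than silently dropping a retention rule.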
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7a2cc76228
Resolved the merge conflict with main-v2 in |
* ci(e2e-bot): drive the PR head, not main-v2

  The bot built reasonix from main-v2, so /e2e on a PR measured main-v2's agent, not the PR's code: worthless for pre-merge validation. Build the agent from the PR head (falling back to main-v2 only when the head predates run --metrics), shrink the e2e context_window to 20000 so the tiny suite actually crosses the compaction trigger, and add a compaction task whose six prose chapters force multi-file reads past the threshold while hiding the graded facts in the first and last chapter. Harness and suite still come from main-v2 so a PR can't weaken its own grader or tests.

* feat(run): count compactions in run --metrics and the e2e report

  The bot needs to show whether a task actually triggered compaction; that's the signal the cache/compaction PRs (#2405-#2407) are measured by. metricsSink counts CompactionStarted; e2ebench adds a Compactions total to the summary and a per-task Compact column.

* test(e2e): make the compaction task a sequential clue-chain

  A single 'read all six files' prompt let the agent batch the reads into one turn, so the whole corpus landed in the kept tail with no foldable middle and compaction never fired. Chaining each chapter to the next forces one read per turn; history accumulates and folds, and a real run now triggers 3 auto-compactions. The final chapter restates the full deliverable so the task stays solvable across a degraded summary.

---------

Co-authored-by: reasonix <reasonix@deepseek.com>
Running the suite to see how the configurable compaction retention policy affects the /e2e |
🤖 Reasonix e2e benchmark
Accuracy: 4/4 (100%) · Cache hit: 76% · Tokens: 167,099 (prompt 164,874 / completion 2,225) · Compactions: 3 · Cost: ¥ 0.0463
Real provider run. Cache-hit % is cached prompt tokens / total prompt tokens.
Summary
Introduce configuration fields for compaction retention behavior, including keep policy, compact ratio, and recent message count.
Root Cause
Long-running sessions need more control over what survives compaction and when compaction triggers.
Technical Approach
Add KeepPolicy flags, TOML configuration fields, and rendered config hints for keep policy, compaction ratio, and recent keep count.
Focused Optimization Points
reasonix.toml.
Verification
Not run during this publishing pass. Draft note: reviewers should confirm the keep policy is connected to the actual compaction selection logic before marking ready.
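As a starting point for that check, here is a hedged sketch of what "connected to the selection logic" could mean in practice. It assumes, without confirmation from this PR, that the last recent_keep messages are always retained and that older messages survive only when a matching KeepPolicy flag is set. The Message type, its fields, and the Survivors function are hypothetical, and compact_ratio is deliberately left out because its semantics are not described here.

```go
package main

// KeepPolicy mirrors the bitmask named in this PR; the width is assumed.
type KeepPolicy uint8

const (
	KeepErrors     KeepPolicy = 1 << iota // keep messages that record errors
	KeepUserMarked                        // keep messages the user marked
)

// Message is a hypothetical stand-in for a conversation entry.
type Message struct {
	IsError    bool
	UserMarked bool
}

// Survivors returns the indices of messages that must not be folded into a
// summary: the most recent recentKeep messages, plus any older message whose
// class is enabled in policy.
func Survivors(msgs []Message, policy KeepPolicy, recentKeep int) []int {
	tailStart := len(msgs) - recentKeep
	if tailStart < 0 {
		tailStart = 0 // recentKeep covers the whole history
	}
	var kept []int
	for i, m := range msgs {
		switch {
		case i >= tailStart:
			kept = append(kept, i) // inside the protected recent tail
		case m.IsError && policy&KeepErrors != 0:
			kept = append(kept, i)
		case m.UserMarked && policy&KeepUserMarked != 0:
			kept = append(kept, i)
		}
	}
	return kept
}
```

A test along these lines would fail if the configured policy were parsed but never consulted during compaction, which is the gap the draft note asks reviewers to rule out.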