fix(agent-core): cap compaction output tokens when maxOutputSize is undefined #841
Open
li-xiu-qi wants to merge 1 commit into
🦋 Changeset detected

Latest commit: 6402aed

The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
06b785e to 388f9b2
…ndefined

The compaction worker in full.ts was not passing maxOutputSize to resolveCompletionBudget, causing computeCompletionBudgetCap to fall back to the full context window size as max_completion_tokens. When maxOutputSize is also undefined (the default for most models), this results in max_tokens equal to max_context_tokens, which causes APIContextOverflowError on providers that do not auto-clamp max_tokens server-side.

This change:
- Passes maxOutputSize to resolveCompletionBudget (aligning with the main loop in index.ts, same as PR MoonshotAI#482)
- Adds a conservative fallback cap of min(maxCtx/4, 8192) when maxOutputSize is undefined, ensuring compaction never requests the full context window as output tokens
- Adds tests covering the compaction budget resolution scenarios

Resolve MoonshotAI#834
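The fallback cap described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in full.ts: the function name `compactionOutputCap` and its signature are hypothetical, and the behavior when `maxOutputSize` is configured (using it directly as the cap) is an assumption made for the example.

```typescript
// Hypothetical helper: name and signature are illustrative only,
// not the actual agent-core API.
function compactionOutputCap(maxCtx: number, maxOutputSize?: number): number {
  if (maxOutputSize !== undefined) {
    // Assumption: a configured output limit is used as the cap as-is.
    return maxOutputSize;
  }
  // Fallback from the commit message: a quarter of the context window,
  // but never more than 8192 tokens.
  return Math.min(Math.floor(maxCtx / 4), 8192);
}

// A 256,000-token window with no configured limit is capped at 8192.
const largeWindowCap = compactionOutputCap(256_000);
// A small 16,000-token window is capped at a quarter of its size (4000).
const smallWindowCap = compactionOutputCap(16_000);
// A configured limit takes precedence over the fallback.
const configuredCap = compactionOutputCap(256_000, 4_096);
```

The quarter-of-context term only matters for small windows; for any window of 32,768 tokens or more, the 8192 ceiling is the binding limit.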
388f9b2 to 6402aed
Related Issue
Resolve #834
Problem

The compaction worker in `full.ts` does not pass `maxOutputSize` to `resolveCompletionBudget()`. As a result, `computeCompletionBudgetCap()` always falls back to using the full `max_context_tokens` as `max_completion_tokens`, regardless of whether `maxOutputSize` is configured.

Since the compaction prompt itself is large (it contains the entire history being compacted), `input_tokens + max_tokens` exceeds the context window, and providers that do not auto-clamp `max_tokens` server-side reject the request with `400 APIContextOverflowError`.

Reported in #476, #794, and #834. PR #482 partially addresses this by passing `maxOutputSize`, but does not cover the case where `maxOutputSize` is `undefined` (the default for most models).

What changed

- Pass `maxOutputSize` to `resolveCompletionBudget` in `full.ts` `compactionRound()`. This aligns with the main loop in `index.ts` (same as PR #482).
- Add a conservative fallback cap when `maxOutputSize` is `undefined`: `Math.min(Math.floor(maxCtx / 4), 8192)`. Compaction is a summarization operation; 8192 tokens is generous for a high-quality summary while preventing overflow on any provider. This covers the gap left by PR #482.
- Add tests in `completion-budget.test.ts` covering the compaction budget resolution scenarios.
- Add a verification test (`compaction-overflow-verification.test.ts`) that demonstrates the overflow before and after the fix using three real model configurations.

Checklist

- `gen-changesets` skill.
- `gen-docs` skill, or this PR needs no doc update.

Verification

Unit tests

- `npx vitest run packages/agent-core/test/utils/completion-budget.test.ts`: 27/27 passed (21 existing + 6 new)
- `npx vitest run packages/agent-core/test/utils/compaction-overflow-verification.test.ts`: 3/3 passed
- `npx oxlint packages/agent-core/src/agent/compaction/full.ts packages/agent-core/test/utils/completion-budget.test.ts packages/agent-core/test/utils/compaction-overflow-verification.test.ts --type-aware`: 0 warnings, 0 errors

Overflow verification: before vs after fix

The verification test simulates the budget resolution logic from the original and patched `compactionRound()` using three real model configurations. With a typical compaction input of ~80K tokens:

Key finding: the original code overflows for ALL three models, even when `maxOutputSize` is explicitly configured (glm-5.2), because the compaction path never passes it to `resolveCompletionBudget`.

Reproducible error example

Using `stepfun/step-3.7-flash` (max_context=256K, no maxOutputSize configured) with a conversation large enough to trigger compaction:

With this fix, `max_completion_tokens` is capped at 8,192 instead of 256,000, so `80,000 + 8,192 = 88,192` stays well within the 256K window.

Relationship to PR #482

This PR is complementary to #482:

- #482: passes `maxOutputSize` to `resolveCompletionBudget` (necessary)
- This PR: additionally applies a fallback cap when `maxOutputSize` is `undefined` (covers the remaining gap)

Both PRs can coexist. If #482 merges first, the fallback cap in this PR is still independently valuable. If this PR merges first, #482 becomes a no-op.
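
The arithmetic from the reproducible error example can be checked with a short sketch. The helper `fitsContextWindow` is hypothetical and exists only for illustration. The figures (80,000 input tokens, a 256,000-token window, an 8,192-token cap) are the ones quoted in this description. The sketch assumes a provider rejects a request whenever input plus requested output exceeds the window.

```typescript
// Hypothetical check: a request fits when input plus requested output
// tokens stay within the context window. Not part of agent-core.
function fitsContextWindow(
  inputTokens: number,
  maxCompletionTokens: number,
  maxContextTokens: number,
): boolean {
  return inputTokens + maxCompletionTokens <= maxContextTokens;
}

const maxContextTokens = 256_000; // context window quoted in this PR
const inputTokens = 80_000; // typical compaction input quoted in this PR

// Before the fix: output budget equals the whole window,
// so 80,000 + 256,000 = 336,000 exceeds 256,000.
const fitsBefore = fitsContextWindow(inputTokens, maxContextTokens, maxContextTokens);

// After the fix: output budget is capped at 8,192,
// so 80,000 + 8,192 = 88,192 stays within 256,000.
const fitsAfter = fitsContextWindow(inputTokens, 8_192, maxContextTokens);

console.log({ fitsBefore, fitsAfter });
```

Under these assumptions the uncapped request cannot fit for any non-empty input, since the output budget alone already consumes the entire window.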