fix(agent-core): cap compaction output tokens when maxOutputSize is undefined by li-xiu-qi · Pull Request #841 · MoonshotAI/kimi-code

li-xiu-qi · 2026-06-17T09:27:24Z

Related Issue

Resolve #834

Problem

The compaction worker in full.ts does not pass maxOutputSize to resolveCompletionBudget(). As a result, computeCompletionBudgetCap() always falls back to using the full max_context_tokens as max_completion_tokens, regardless of whether maxOutputSize is configured.

Since the compaction prompt itself is large (it contains the entire history being compacted), input_tokens + max_tokens exceeds the context window. Providers that do not auto-clamp max_tokens server-side reject the request with a 400, which surfaces as APIContextOverflowError.
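The overflow condition described above can be sketched as a single comparison. This is a minimal illustration, not the project's code: the RequestBudget type and the overflowsContext helper are hypothetical names introduced here, and the 80,000-token input is the example figure used later in this description.

```typescript
// Hypothetical helper for illustration only; not part of agent-core.
interface RequestBudget {
  inputTokens: number;      // tokens in the compaction prompt (the history being compacted)
  maxOutputTokens: number;  // value sent as max_tokens / max_completion_tokens
  maxContextTokens: number; // the model's context window (max_context_tokens)
}

// A provider that does not clamp server-side rejects when input + output exceeds the window.
function overflowsContext(b: RequestBudget): boolean {
  return b.inputTokens + b.maxOutputTokens > b.maxContextTokens;
}

// Unpatched path: the output budget equals the full window, so any non-empty prompt overflows.
const unpatchedOverflows = overflowsContext({
  inputTokens: 80_000,
  maxOutputTokens: 256_000,
  maxContextTokens: 256_000,
});

console.log(unpatchedOverflows); // true
```

Because the unpatched output budget already equals the window, the check fails for any prompt with at least one token, which is why the failure does not depend on how large the history is.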

Reported in #476, #794, and #834. PR #482 partially addresses this by passing maxOutputSize, but does not cover the case where maxOutputSize is undefined (the default for most models).

What changed

Pass maxOutputSize to resolveCompletionBudget in compactionRound() in full.ts, which aligns the compaction path with the main loop in index.ts (the same change as PR #482).
Add a conservative fallback cap when maxOutputSize is undefined: Math.min(Math.floor(maxCtx / 4), 8192). Compaction is a summarization operation; 8192 tokens is generous for a high-quality summary while preventing overflow on any provider. This covers the gap left by PR #482.
Add 6 unit tests in completion-budget.test.ts covering the compaction budget resolution scenarios.
Add a verification test (compaction-overflow-verification.test.ts) that demonstrates the overflow before and after the fix using three real model configurations.
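The fallback rule in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: fallbackCompactionCap is a hypothetical name, not a function in full.ts, and it assumes a configured maxOutputSize is used as given (the real resolveCompletionBudget may apply further clamping).

```typescript
// Hypothetical sketch of the cap described above; not the actual code in full.ts.
const FALLBACK_CEILING = 8192; // ceiling from the formula in this PR description

function fallbackCompactionCap(maxCtx: number, maxOutputSize?: number): number {
  // Assumption: when maxOutputSize is configured, it is used as the cap unchanged.
  if (maxOutputSize !== undefined) return maxOutputSize;
  // Otherwise: a quarter of the context window, but never more than 8192 tokens.
  return Math.min(Math.floor(maxCtx / 4), FALLBACK_CEILING);
}

console.log(fallbackCompactionCap(256_000));            // 8192
console.log(fallbackCompactionCap(16_000));             // 4000
console.log(fallbackCompactionCap(1_000_000, 131_072)); // 131072
```

The maxCtx / 4 term only matters for windows smaller than 32,768 tokens; above that, the 8192 ceiling is the binding limit.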

Checklist

I have read the CONTRIBUTING document.
I have linked a related issue (#834: Compaction fails with 400: maxOutputSize not passed to resolveCompletionBudget in compaction path).
I have added tests that prove my feature works.
Ran gen-changesets skill.
Ran gen-docs skill, or this PR needs no doc update.

Verification

Unit tests

npx vitest run packages/agent-core/test/utils/completion-budget.test.ts — 27/27 passed (21 existing + 6 new)
npx vitest run packages/agent-core/test/utils/compaction-overflow-verification.test.ts — 3/3 passed
npx oxlint packages/agent-core/src/agent/compaction/full.ts packages/agent-core/test/utils/completion-budget.test.ts packages/agent-core/test/utils/compaction-overflow-verification.test.ts --type-aware — 0 warnings, 0 errors

Overflow verification: before vs after fix

The verification test simulates the budget resolution logic from the original and patched compactionRound() using three real model configurations. With a typical compaction input of ~80K tokens:

Model	max_context	maxOutputSize	Original max_tokens	Original overflow?	Patched max_tokens	Patched overflow?
step-3.7-flash	256,000	undefined	256,000	YES ❌ (336K > 256K)	8,192	NO ✅ (88K)
kimi-for-coding	262,144	undefined	262,144	YES ❌ (342K > 262K)	8,192	NO ✅ (88K)
glm-5.2	1,000,000	131,072	1,000,000	YES ❌ (1.08M > 1M)	131,072	NO ✅ (211K)

Key finding: the original code overflows for ALL three models — even when maxOutputSize is explicitly configured (glm-5.2), because the compaction path never passes it to resolveCompletionBudget.
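The table's arithmetic can be re-derived with a short script. This is a sketch only: the type and function names are hypothetical and do not mirror compaction-overflow-verification.test.ts. It assumes the original path requests the full context window and the patched path uses maxOutputSize when set, otherwise min(floor(max_context / 4), 8192).

```typescript
// Illustrative only: names are hypothetical and do not mirror the repository's code.
interface ModelConfig {
  name: string;
  maxContext: number;
  maxOutputSize?: number;
}

const INPUT_TOKENS = 80_000; // the "typical compaction input" assumed in the table

// Original behaviour as described: the output budget is the full context window.
const originalMaxTokens = (m: ModelConfig): number => m.maxContext;

// Patched behaviour as described: configured maxOutputSize, else min(floor(ctx / 4), 8192).
const patchedMaxTokens = (m: ModelConfig): number =>
  m.maxOutputSize ?? Math.min(Math.floor(m.maxContext / 4), 8192);

const models: ModelConfig[] = [
  { name: "step-3.7-flash", maxContext: 256_000 },
  { name: "kimi-for-coding", maxContext: 262_144 },
  { name: "glm-5.2", maxContext: 1_000_000, maxOutputSize: 131_072 },
];

// One row per model: total requested tokens and whether that total exceeds the window.
const rows = models.map((m) => {
  const originalTotal = INPUT_TOKENS + originalMaxTokens(m);
  const patchedTotal = INPUT_TOKENS + patchedMaxTokens(m);
  return {
    model: m.name,
    originalTotal,
    originalOverflow: originalTotal > m.maxContext,
    patchedTotal,
    patchedOverflow: patchedTotal > m.maxContext,
  };
});

console.table(rows);
```

The totals come out as 336,000 / 342,144 / 1,080,000 for the original path and 88,192 / 88,192 / 211,072 for the patched path, matching the rounded figures in the table.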

Reproducible error example

Using stepfun/step-3.7-flash (max_context=256K, no maxOutputSize configured) with a conversation large enough to trigger compaction:

Error: [compaction.failed] APIContextOverflowError: 400
{"detail":"{\"error\":{\"message\":\"This model's maximum context length is 256000 tokens. However, you requested 176000 output tokens and your prompt contains at least 80000 input tokens, for a total of at least 256000 tokens.\",\"type\":\"BadRequestError\",\"param\":\"input_tokens\",\"code\":400}}"}

With this fix, max_completion_tokens is capped at 8,192 instead of 256,000, so 80,000 + 8,192 = 88,192 stays well within the 256K window.

Relationship to PR #482

This PR is complementary to #482:

PR #482 (fix(compaction): pass maxOutputSize to resolveCompletionBudget) passes maxOutputSize to resolveCompletionBudget (necessary).
This PR adds the same fix plus a conservative fallback when maxOutputSize is undefined (covers the remaining gap).

Both PRs can coexist. If #482 merges first, the fallback cap in this PR is still independently valuable. If this PR merges first, #482 becomes a no-op.

changeset-bot · 2026-06-17T09:27:31Z

🦋 Changeset detected

Latest commit: 6402aed

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
@moonshot-ai/agent-core	Patch


…ndefined

The compaction worker in full.ts was not passing maxOutputSize to resolveCompletionBudget, causing computeCompletionBudgetCap to fall back to the full context window size as max_completion_tokens. When maxOutputSize is also undefined (the default for most models), this results in max_tokens equal to max_context_tokens, which causes APIContextOverflowError on providers that do not auto-clamp max_tokens server-side.

This change:
- Passes maxOutputSize to resolveCompletionBudget (aligning with the main loop in index.ts, same as PR MoonshotAI#482)
- Adds a conservative fallback cap of min(maxCtx/4, 8192) when maxOutputSize is undefined, ensuring compaction never requests the full context window as output tokens
- Adds tests covering the compaction budget resolution scenarios

Resolve MoonshotAI#834

li-xiu-qi force-pushed the fix/compaction-output-token-cap branch from 06b785e to 388f9b2 · June 17, 2026 10:29

li-xiu-qi force-pushed the fix/compaction-output-token-cap branch from 388f9b2 to 6402aed · June 17, 2026 11:31

li-xiu-qi wants to merge 1 commit into MoonshotAI:main from li-xiu-qi:fix/compaction-output-token-cap
