Increase default max-tokens from 1000 to 8192 by alexkroman · Pull Request #204 · AssemblyAI/cli

alexkroman · 2026-06-16T23:48:47Z

Raises the default maximum tokens per LLM reply from 1000 to 8192 across all commands that support the --max-tokens option.

Summary

This change increases the default token ceiling to prevent long reduces/summaries from being clipped mid-sentence. Since the LLM Gateway only bills for tokens actually generated, a higher cap has no cost impact on short replies while providing more headroom for longer outputs. Users can still override this per-call with the --max-tokens flag.
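The reasoning above can be checked with a little arithmetic. The following is a minimal sketch, assuming a reply simply stops once it reaches the cap and that billing follows the tokens actually generated, as the description states. The helper names `generated_tokens` and `is_clipped` are hypothetical and do not come from the repository.

```python
OLD_CAP = 1000  # previous default, per this PR
NEW_CAP = 8192  # new default, per this PR


def generated_tokens(needed: int, max_tokens: int) -> int:
    """Tokens actually produced (and billed): the reply stops at the cap."""
    return min(needed, max_tokens)


def is_clipped(needed: int, max_tokens: int) -> bool:
    """True when the reply wanted more tokens than the cap allows."""
    return needed > max_tokens


# A short reply costs the same under either cap.
short_reply = 200
same_cost = generated_tokens(short_reply, OLD_CAP) == generated_tokens(short_reply, NEW_CAP)

# A long summary is cut off under the old cap but completes under the new one.
long_reply = 3000
clipped_before = is_clipped(long_reply, OLD_CAP)
clipped_after = is_clipped(long_reply, NEW_CAP)
```

Under these assumptions the higher cap changes the outcome only for replies that would otherwise have been truncated.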

Changes

Updated DEFAULT_MAX_TOKENS in aai_cli/core/llm.py from 1000 to 8192
Added explanatory comment clarifying the rationale: generous ceiling prevents clipping, gateway bills only actual tokens, and per-call override is available
Updated all help text snapshots across affected commands (run, history, and others) to reflect the new default value

Implementation Details

The change is centralized in a single constant definition. Every command's help text takes its default from that constant, so the only other files touched are the help-text snapshots, which were updated to show the new value. No functional logic changed; this is purely a configuration adjustment.
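As an illustration of the pattern described here, the sketch below shows one shared constant feeding both the option default and its help text. It uses the standard-library `argparse` purely for demonstration; the CLI framework the project actually uses is not stated in this PR, and `build_parser` and `resolve_max_tokens` are hypothetical names.

```python
import argparse

# Assumed shape of the constant in aai_cli/core/llm.py (value taken from this PR).
# Generous ceiling so long reduces/summaries are not clipped; the gateway bills
# only tokens actually generated; --max-tokens overrides it per call.
DEFAULT_MAX_TOKENS = 8192


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: the default and the help text share one constant."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--max-tokens",
        type=int,
        default=DEFAULT_MAX_TOKENS,
        help="Maximum tokens per LLM reply (default: %(default)s).",
    )
    return parser


def resolve_max_tokens(argv: list[str]) -> int:
    """Return the effective cap for a given argument list."""
    return build_parser().parse_args(argv).max_tokens
```

Because the help string interpolates the default, changing the constant changes the rendered help, which is why snapshot files for help output would need updating in a setup like this.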

https://claude.ai/code/session_01Y8Qzjnepp1yyViyopgeVYq

Long --llm-reduce summaries and other LLM-Gateway replies were being clipped at ~1000 tokens. Bump the shared DEFAULT_MAX_TOKENS ceiling so multi-source reduces and summaries finish instead of cutting off mid-sentence. The gateway only bills tokens actually generated, so a higher cap is free on short replies, and --max-tokens still overrides per call.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Y8Qzjnepp1yyViyopgeVYq

alexkroman enabled auto-merge June 16, 2026 23:48

alexkroman added this pull request to the merge queue Jun 16, 2026

Merged via the queue into main with commit ff4ea78 Jun 16, 2026
19 checks passed

alexkroman deleted the claude/peaceful-feynman-wo29g2 branch June 16, 2026 23:57

alexkroman merged 1 commit into main from claude/peaceful-feynman-wo29g2
