Fix CLI rendering corruption and split CLI/frontend model defaults #121
The interactive CLI was interleaving live sub-agent redraws with streamed markdown output, which corrupted ANSI rendering and leaked raw control sequences into the terminal. The CLI and web app also shared one default model config even though they need different Anthropic routing defaults.

Constraint: CLI default must use direct Anthropic credentials while web sessions must default to Bedrock Anthropic
Constraint: Interactive terminal output must remain readable while sub-agent progress is live
Rejected: Single shared config file with runtime overrides | keeps ownership of defaults implicit across surfaces
Rejected: Keep background redraw ticker | concurrent terminal writers still corrupt streamed output
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep CLI and frontend default models in separate config files unless both surfaces intentionally converge again
Tested: python -m compileall agent backend
Tested: ./frontend/node_modules/.bin/tsc -p frontend/tsconfig.json --noEmit
Tested: UV_CACHE_DIR=/tmp/uv-cache uv run --with pytest python -m pytest -q tests/unit/test_cli_rendering.py
Not-tested: Full pytest suite (blocked by pre-existing tests/unit/test_llm_error_classification.py import error during collection)
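
The "concurrent terminal writers" problem named in the rejected alternative can be illustrated with a minimal sketch. This is not the PR's actual code: `TerminalWriter`, `set_status`, `write_chunk`, and `flush_status` are hypothetical names. The idea shown is that sub-agents only record their status, and the streaming loop alone decides when that status is drawn, so no background ticker writes into the middle of streamed output.

```python
import io
import threading


class TerminalWriter:
    """Single owner of the output stream (hypothetical helper, not the PR's class)."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()
        self._status = ""  # latest sub-agent status, drawn only on request

    def set_status(self, text):
        # Sub-agents record their status here; they never touch the stream.
        with self._lock:
            self._status = text

    def write_chunk(self, chunk):
        # Streamed markdown is written under the lock, so nothing interleaves with it.
        with self._lock:
            self._stream.write(chunk)
            self._stream.flush()

    def flush_status(self):
        # Called by the streaming loop between chunks, at a line boundary.
        with self._lock:
            if self._status:
                self._stream.write(f"[{self._status}]\n")
                self._stream.flush()
                self._status = ""


# Usage with an in-memory stream standing in for the terminal.
out = io.StringIO()
writer = TerminalWriter(out)
writer.set_status("sub-agent 1: running")
writer.write_chunk("# Heading\n")
writer.flush_status()
writer.write_chunk("body text\n")
```

In this sketch the status line can only appear between complete chunks, which is the property the second constraint asks for. A real implementation would also need to handle ANSI cursor movement, which is omitted here.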
Claude finished @akseljoonas's task in 2m 48s —— View job PR Review
The earlier PR fixed the CLI rendering and model-default split, but the full local suite exposed additional regressions in tool-result patching, doom-loop polling detection, sandbox reuse messaging, and async test support. This follow-up commit restores the missing helpers and updates those production paths so the new regression tests pass for real.

Constraint: Provider message histories must keep tool_use/tool_result pairing valid across interrupted turns
Constraint: Legitimate polling with changing results must not trip doom-loop recovery
Rejected: Only fix the original collection blocker | leaves the full suite red and the PR note stale
Rejected: Silence the failing tests without restoring runtime helpers | would hide real production regressions
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Keep the local regression tests in sync with the production recovery paths they exercise
Tested: python -m compileall agent/context_manager/manager.py agent/core/agent_loop.py agent/core/doom_loop.py agent/tools/sandbox_tool.py backend/user_quotas.py
Tested: UV_CACHE_DIR=/tmp/uv-cache uv run --with pytest --with pytest-asyncio python -m pytest -q tests/unit/test_dangling_tool_calls.py tests/unit/test_doom_loop_polling.py tests/unit/test_sandbox_already_active_message.py tests/unit/test_user_quotas.py
Tested: UV_CACHE_DIR=/tmp/uv-cache uv run --with pytest --with pytest-asyncio python -m pytest -q
Not-tested: Remote CI environment parity
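
The polling constraint above can be sketched as follows. This is an illustration under assumptions, not the contents of `agent/core/doom_loop.py`: `is_doom_loop`, the `(tool_name, args, result)` history shape, and the threshold of 3 are all hypothetical. The point is that the detector fingerprints the result together with the call, so repeating the same call only counts as a loop when the result stops changing.

```python
import hashlib
import json


def _signature(tool_name, args, result):
    # Fingerprint the call *and* its result; sort_keys makes dict ordering irrelevant.
    payload = json.dumps([tool_name, args, result], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_doom_loop(history, threshold=3):
    """Return True when the last `threshold` entries are identical calls with identical results.

    `history` is a list of (tool_name, args, result) tuples, oldest first
    (an assumed shape for this sketch).
    """
    if len(history) < threshold:
        return False
    recent = history[-threshold:]
    return len({_signature(*entry) for entry in recent}) == 1


# Same call, changing results: legitimate polling.
polling = [
    ("job_status", {"id": 7}, "queued"),
    ("job_status", {"id": 7}, "running"),
    ("job_status", {"id": 7}, "running 40%"),
]

# Same call, same result every time: the agent is stuck.
stuck = [("read_file", {"path": "a.txt"}, "not found")] * 3
```

With these inputs, `is_doom_loop(polling)` is false and `is_doom_loop(stuck)` is true.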
Claude finished @akseljoonas's task in 3m 46s —— View job Re-review (new commit:
The review was right about two follow-up issues: the old shared config file was still present after the CLI/frontend split, and the Bedrock rate-limit retry schedule still had a dead third entry because the loop only ever consumed two retry delays. This commit removes the orphaned config and makes the rate-limit schedule line up with the actual retry budget.

Constraint: Retry budget for Bedrock token throttling must exceed the provider's ~60s bucket recovery window in the retries that actually run
Rejected: Keep a third delay entry in the schedule | the current retry loop never reaches it
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep retry schedules aligned with the retry loop's real number of sleeps, not the raw retry constant count
Tested: python -m compileall agent/core/agent_loop.py tests/unit/test_llm_error_classification.py
Tested: UV_CACHE_DIR=/tmp/uv-cache uv run --with pytest --with pytest-asyncio python -m pytest -q tests/unit/test_llm_error_classification.py
Tested: UV_CACHE_DIR=/tmp/uv-cache uv run --with pytest --with pytest-asyncio python -m pytest -q
Not-tested: Remote CI environment parity
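
The directive about aligning the schedule with the loop's real number of sleeps can be sketched like this. The names (`RateLimitError`, `RATE_LIMIT_DELAYS`, `call_with_retries`) and the delay values are assumptions for illustration, not the code in `agent/core/agent_loop.py`. The attempt count is derived from the schedule, so every delay entry is reachable and none can be dead.

```python
import time


class RateLimitError(Exception):
    """Stand-in for a provider throttling error (hypothetical)."""


# Assumed values: two sleeps whose total (90s) exceeds a ~60s bucket recovery window.
RATE_LIMIT_DELAYS = (30.0, 60.0)


def call_with_retries(fn, delays=RATE_LIMIT_DELAYS, sleep=time.sleep):
    # One initial attempt plus one retry per delay entry.
    attempts = len(delays) + 1
    for attempt in range(attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == len(delays):
                raise  # schedule exhausted, surface the error
            sleep(delays[attempt])


# Usage: a fake call that is throttled twice, then succeeds.
calls = {"count": 0}
slept = []


def flaky():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimitError("throttled")
    return "ok"


result = call_with_retries(flaky, sleep=slept.append)
```

Injecting `sleep` keeps the sketch testable without waiting. In this run, `slept` records both delays from the schedule and `result` is `"ok"`.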
Validated the remaining review points before changing code:
Local verification after the update:
Claude finished @akseljoonas's task in 1m 3s —— View job Re-review (new commit:

Summary
Testing
python -m compileall agent backend
./frontend/node_modules/.bin/tsc -p frontend/tsconfig.json --noEmit
UV_CACHE_DIR=/tmp/uv-cache uv run --with pytest --with pytest-asyncio python -m pytest -q