CLAUDE_AUTOCOMPACT_PCT_OVERRIDE leaks into Claude subprocess when autocompact_pct=0 #394

@dcellison

Problem

ClaudeCodeBackend._ensure_started builds the subprocess environment as env = os.environ.copy(), then adds CLAUDE_AUTOCOMPACT_PCT_OVERRIDE only when self.autocompact_pct > 0. There is no symmetric else: env.pop(...) to guarantee the var is absent when autocompact_pct == 0. If the parent process environment already has CLAUDE_AUTOCOMPACT_PCT_OVERRIDE set for any reason (a profile dotfile, an env file the daemon loaded via load_dotenv, a launchd plist's EnvironmentVariables, or a manual export in the shell that started the daemon), the inherited value passes through to the subprocess unchanged.
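A minimal sketch of the asymmetry described above. The function name build_env_asymmetric and the explicit parent_env argument are hypothetical stand-ins for the env-building step in _ensure_started, not the actual code in src/kai/claude.py:

```python
ENV_VAR = "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"


def build_env_asymmetric(autocompact_pct: int, parent_env: dict[str, str]) -> dict[str, str]:
    """Hypothetical stand-in: set the var when pct > 0, otherwise leave the copy untouched."""
    env = dict(parent_env)  # plays the role of os.environ.copy()
    if autocompact_pct > 0:
        env[ENV_VAR] = str(autocompact_pct)
    # No else branch: an inherited value survives when autocompact_pct == 0.
    return env


# A parent environment that already carries the var, e.g. from a dotfile.
parent = {"PATH": "/usr/bin", ENV_VAR: "70"}
leaked = build_env_asymmetric(0, parent)
print(ENV_VAR in leaked, leaked.get(ENV_VAR))  # True 70
```

With a clean parent env the same call returns a dict without the var, which is why the problem stays hidden in CI.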

The contract documented at the call site reads "set autocompact threshold so Claude compacts earlier... if self.autocompact_pct > 0", but the implementation enforces only the enable side of that contract, not the disable side.

Production impact

An operator who sets claude_autocompact_pct=0 (the default) intending to disable autocompact, but who has CLAUDE_AUTOCOMPACT_PCT_OVERRIDE set somewhere reachable from the daemon's os.environ, will see autocompact fire at the leaked value. From the operator's point of view, the Config field appears to be ignored. No log entry exposes the divergence; the operator has to inspect the spawned subprocess's env to discover it.
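One way an operator could inspect a running subprocess's environment, sketched under the assumption of a Linux host where /proc/<pid>/environ is readable by the caller (it is not available on macOS, so this does not cover the launchd case). The helper names are hypothetical, and the file reflects the environment the process was started with:

```python
from __future__ import annotations

from pathlib import Path

ENV_VAR = "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"


def parse_environ_blob(blob: bytes) -> dict[str, str]:
    """Parse the NUL-separated KEY=VALUE entries found in /proc/<pid>/environ."""
    env: dict[str, str] = {}
    for entry in blob.split(b"\0"):
        if not entry:
            continue  # skip the empty piece after the trailing NUL
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def autocompact_override_of(pid: int) -> str | None:
    """Return the override seen by process `pid`, or None if the var is absent (Linux only)."""
    blob = Path(f"/proc/{pid}/environ").read_bytes()
    return parse_environ_blob(blob).get(ENV_VAR)
```

Calling autocompact_override_of with the pid of the spawned Claude subprocess would return the leaked value as a string, or None when the var is absent.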

Code site

The site is in src/kai/claude.py, around the env = os.environ.copy() line in _ensure_started. The relevant block is the if self.autocompact_pct > 0: branch, which sets env["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = str(self.autocompact_pct) and has no symmetric else.

Test failure as canary

tests/test_claude.py::test_no_autocompact_env_when_zero constructs a ClaudeCodeBackend(autocompact_pct=0), drives _ensure_started, and asserts "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE" not in env. The assertion holds in CI (clean container env) and in interactive shells where nothing exports the var. It fails in any process tree where the parent has the var set.

This surfaced when one of the project's reviewers ran make test from a context where the var was inherited from the parent environment. The same situation is reachable in production every time the daemon spawns a Claude subprocess: if /etc/kai/env (or any other source the daemon's load_dotenv reads) sets CLAUDE_AUTOCOMPACT_PCT_OVERRIDE, every subprocess inherits it, and autocompact_pct=0 is silently ignored.

Fix

Symmetric set-or-pop in claude.py:

if self.autocompact_pct > 0:
    env["CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"] = str(self.autocompact_pct)
else:
    env.pop("CLAUDE_AUTOCOMPACT_PCT_OVERRIDE", None)

This makes the contract honest: autocompact_pct=N for N>0 sets the var to N; autocompact_pct=0 ensures absence regardless of what the parent env had.
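The contract can be checked across all four combinations of parent env and config value. This is a sketch only: apply_autocompact_override is a hypothetical helper that isolates the set-or-pop logic, not a function that exists in claude.py.

```python
ENV_VAR = "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"


def apply_autocompact_override(env: dict[str, str], autocompact_pct: int) -> dict[str, str]:
    """Hypothetical helper: set the var for pct > 0, guarantee absence otherwise."""
    if autocompact_pct > 0:
        env[ENV_VAR] = str(autocompact_pct)
    else:
        env.pop(ENV_VAR, None)  # default of None avoids KeyError when the var is not set
    return env


for parent in ({}, {ENV_VAR: "70"}):
    for pct in (0, 40):
        child = apply_autocompact_override(dict(parent), pct)
        print(f"parent={parent.get(ENV_VAR)!r} pct={pct} -> child={child.get(ENV_VAR)!r}")
# parent=None pct=0 -> child=None
# parent=None pct=40 -> child='40'
# parent='70' pct=0 -> child=None
# parent='70' pct=40 -> child='40'
```

In every row the child value depends only on pct, never on what the parent had.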

The same audit shape applies to any other Config-to-env-var mapping in claude.py. Grepping for if self.<some_field> followed by env[...] = ... and walking every match should confirm that none of them has the same asymmetry.
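The audit could be roughed out with a small script rather than a manual grep. This is a heuristic sketch, not a parser: the function name find_one_sided_env_sets is hypothetical, it assumes consistent indentation, and it only recognises single-line if self.<field> headers and upper-case env var names. Matches still need a human look.

```python
import re

# `if self.<field> ...:` on a single line, capturing the leading indentation.
IF_SELF = re.compile(r"^(\s*)if\s+self\.\w+.*:\s*$")
# env["SOME_VAR"] = ... (assignment, not comparison).
ENV_SET = re.compile(r"""env\[\s*["']([A-Z0-9_]+)["']\s*\]\s*=(?!=)""")


def find_one_sided_env_sets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, var_name) for `if self.<field>:` blocks that set
    env[...] and are not followed by an else/elif at the same indentation."""
    lines = source.splitlines()
    hits: list[tuple[int, str]] = []
    for i, line in enumerate(lines):
        match = IF_SELF.match(line)
        if not match:
            continue
        indent = len(match.group(1))
        names: list[str] = []
        j = i + 1
        # Walk the body: every line indented deeper than the `if` (blank lines included).
        while j < len(lines):
            body = lines[j]
            if body.strip() and len(body) - len(body.lstrip()) <= indent:
                break
            names += ENV_SET.findall(body)
            j += 1
        has_else = False
        if j < len(lines):
            nxt = lines[j]
            same_level = len(nxt) - len(nxt.lstrip()) == indent
            has_else = same_level and nxt.lstrip().startswith(("else:", "elif "))
        if names and not has_else:
            hits.extend((i + 1, name) for name in names)
    return hits
```

Running it over the text of claude.py would list each candidate site by line number and var name. A site with an else branch that does not actually pop the var would be missed, so the result is a starting point for review rather than proof of absence.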

Test-side defense (secondary)

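Adding monkeypatch.delenv("CLAUDE_AUTOCOMPACT_PCT_OVERRIDE", raising=False) in the relevant test setup would make the test independent of the developer's ambient environment, regardless of the production fix. It is worth doing as belt-and-suspenders so the test behaves the same under any developer's local env as under CI's pristine env. Note that delenv alone lets the test pass whether or not the production fix is present; catching a future regression of the pop branch needs the var to be set (for example via monkeypatch.setenv) before _ensure_started runs. The same monkeypatch sweep should cover sibling tests in TestCommandConstruction that build env dicts from os.environ.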
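A sketch of what the test side could look like with pytest's monkeypatch fixture. build_subprocess_env is a hypothetical stand-in for the env-building part of _ensure_started (the real test drives ClaudeCodeBackend instead), and the second test is an illustrative addition, not something the issue asks for:

```python
import os

ENV_VAR = "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"


def build_subprocess_env(autocompact_pct: int) -> dict[str, str]:
    """Hypothetical stand-in for the env-building step, with the symmetric fix applied."""
    env = os.environ.copy()
    if autocompact_pct > 0:
        env[ENV_VAR] = str(autocompact_pct)
    else:
        env.pop(ENV_VAR, None)
    return env


def test_no_autocompact_env_when_zero(monkeypatch):
    # Defensive cleanup: start from a known-clean parent env on any machine.
    monkeypatch.delenv(ENV_VAR, raising=False)
    assert ENV_VAR not in build_subprocess_env(0)


def test_parent_override_is_dropped_when_zero(monkeypatch):
    # Illustrative companion: simulate the leaky parent env explicitly,
    # so the pop branch is exercised even in a pristine CI container.
    monkeypatch.setenv(ENV_VAR, "70")
    assert ENV_VAR not in build_subprocess_env(0)
```

The delenv test is deterministic across environments, while the setenv test is the one that would fail if the else branch were ever removed.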

Acceptance

  1. claude.py has the symmetric set-or-pop pattern for CLAUDE_AUTOCOMPACT_PCT_OVERRIDE.
  2. test_no_autocompact_env_when_zero passes regardless of the parent env (test grows a defensive monkeypatch.delenv).
  3. A grep audit of other Config-to-env-var mappings in claude.py confirms no other site has the same asymmetric leak.

Labels: bug, good first issue