
fix(idle-detector): 500ms grace timer after readyPattern, immune to TUI redraws #224

Open
flipped0119 wants to merge 1 commit into deepcoldy:master from flipped0119:fix/idle-ready-grace-timer


@flipped0119

Contributor

Problem

When a CLI's readyPattern fires (input prompt visible), IdleDetector previously fell back to the full 2s QUIESCENCE_MS wait. Hermes and Codex TUIs redraw their status bars every ~1ms — clearing the quiescence timer on every feed() — so idle was never declared until the redraws stopped, adding ~13s to the detection latency on long sessions.

Fix

Add a READY_GRACE_MS = 500 grace window that:

  • Arms once when readyPattern first matches
  • Is NOT reset by subsequent feed() calls (TUI status-bar churn)
  • Triggers markIdle() after 500ms — matches completionPattern's 500ms

Spurious spinners in the post-ready output are already blocked by the existing !readySeen guard at the top of feed(); we don't re-check them here.
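To make the arm-once behaviour concrete, here is a minimal sketch of the two timers side by side. It is not the actual IdleDetector code: the class name, the injected `Scheduler`, the `FakeClock`, and the `useGrace` switch are all hypothetical, and spinner tracking is left out. Only `QUIESCENCE_MS`, `READY_GRACE_MS`, `readyPattern`, `readySeen`, `feed()` and `markIdle()` come from the description above.

```typescript
// Illustrative sketch only: every name not listed in the lead-in is hypothetical.
const QUIESCENCE_MS = 2000;
const READY_GRACE_MS = 500;

interface Scheduler {
  setTimeout(fn: () => void, ms: number): number;
  clearTimeout(id: number): void;
}

// Deterministic fake clock so the example runs without real waiting.
class FakeClock implements Scheduler {
  now = 0;
  private nextId = 1;
  private timers: { id: number; at: number; fn: () => void }[] = [];

  setTimeout(fn: () => void, ms: number): number {
    const id = this.nextId++;
    this.timers.push({ id, at: this.now + ms, fn });
    return id;
  }

  clearTimeout(id: number): void {
    this.timers = this.timers.filter((t) => t.id !== id);
  }

  // Move time forward, firing due timers in chronological order.
  advance(ms: number): void {
    const target = this.now + ms;
    for (;;) {
      let idx = -1;
      for (let i = 0; i < this.timers.length; i++) {
        const due = this.timers[i].at <= target;
        if (due && (idx === -1 || this.timers[i].at < this.timers[idx].at)) idx = i;
      }
      if (idx === -1) break;
      const t = this.timers.splice(idx, 1)[0];
      this.now = t.at;
      t.fn();
    }
    this.now = target;
  }
}

class IdleDetectorSketch {
  private readySeen = false;
  private idle = false;
  private quiescenceTimer: number | undefined;
  private graceTimer: number | undefined;

  constructor(
    private readonly clock: Scheduler,
    private readonly readyPattern: RegExp,
    private readonly onIdle: () => void,
    private readonly useGrace: boolean,
  ) {}

  feed(chunk: string): void {
    if (this.idle) return;
    if (!this.readySeen && this.readyPattern.test(chunk)) {
      this.readySeen = true;
      if (this.useGrace) {
        // Armed exactly once; later feed() calls never touch this timer.
        this.graceTimer = this.clock.setTimeout(() => this.markIdle(), READY_GRACE_MS);
      }
    }
    // Resetting quiescence window: every chunk pushes the deadline back.
    if (this.quiescenceTimer !== undefined) this.clock.clearTimeout(this.quiescenceTimer);
    this.quiescenceTimer = this.clock.setTimeout(() => this.markIdle(), QUIESCENCE_MS);
  }

  private markIdle(): void {
    if (this.idle) return;
    this.idle = true;
    if (this.quiescenceTimer !== undefined) this.clock.clearTimeout(this.quiescenceTimer);
    if (this.graceTimer !== undefined) this.clock.clearTimeout(this.graceTimer);
    this.onIdle();
  }
}

// Prompt appears at t=0, followed by 11s of status-bar redraws at 1ms spacing.
function idleLatencyMs(useGrace: boolean): number {
  const clock = new FakeClock();
  let idleAt = -1;
  const detector = new IdleDetectorSketch(clock, /›/, () => { idleAt = clock.now; }, useGrace);
  detector.feed("› ");
  for (let i = 0; i < 11000; i++) {
    clock.advance(1);
    detector.feed("status redraw");
  }
  clock.advance(QUIESCENCE_MS);
  return idleAt;
}

const withoutGrace = idleLatencyMs(false);
const withGrace = idleLatencyMs(true);
console.log({ withoutGrace, withGrace }); // { withoutGrace: 13000, withGrace: 500 }
```

These figures come from the simulated clock, not from the spike measurements below; they only show why a resetting timer starves under 1ms redraws while an arm-once timer does not.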

Spike (vs. master)

| Scenario | Before | After |
|---|---|---|
| Full startup banner | 3001ms | 1500ms |
| Minimal | 2202ms | 700ms |
| + 6s of redraws | 8001ms | 600ms |
| + 11s of 1ms redraws | 13099ms | 600ms |

22x speedup on the worst case.
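For reference, the per-scenario ratios can be recomputed from the spike numbers. The snippet below is just that arithmetic, with the figures copied from the table; the variable names are illustrative.

```typescript
// Before/after latencies copied from the spike table (milliseconds).
const spike = [
  { scenario: "Full startup banner", beforeMs: 3001, afterMs: 1500 },
  { scenario: "Minimal", beforeMs: 2202, afterMs: 700 },
  { scenario: "+ 6s of redraws", beforeMs: 8001, afterMs: 600 },
  { scenario: "+ 11s of 1ms redraws", beforeMs: 13099, afterMs: 600 },
];

// Speedup = before / after, rounded to one decimal place.
const speedups = spike.map((row) => ({
  scenario: row.scenario,
  speedup: Math.round((row.beforeMs / row.afterMs) * 10) / 10,
}));

for (const s of speedups) {
  console.log(`${s.scenario}: ${s.speedup}x`);
}
```

The worst case works out to about 21.8x, which rounds to the 22x quoted above; the startup-only scenarios gain roughly 2x to 3x.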

Tests

Added 4 unit tests in test/idle-detector.test.ts:

  • arms grace timer once when readyPattern matches
  • subsequent feed() calls do NOT reset the timer
  • markIdle() fires after 500ms
  • no interaction with completionPattern

48/48 tests pass on test/idle-detector.test.ts; full suite remains green.

Scope

This PR is scoped to IdleDetector only. The Hermes adapter's missing readyPattern (which masks the same symptom for the Hermes CLI specifically) is fixed separately in PR #223.

@flipped0119 flipped0119 requested a review from deepcoldy as a code owner June 16, 2026 05:28
@deepcoldy

Owner

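@flipped0119 Thanks for this PR; the approach and the tests are both very clear 🙏 The speedup itself addresses a real need. While reviewing, though, I ran into a concern about a benefit/risk mismatch that I'd like to align on with you, and I'd also like to hear what you have actually observed on your side before we decide how to merge.

My concern

The new grace path is: arm a 500ms timer once when readyPattern matches, do not reset it on later feed() calls, and call markIdle() directly when it expires. It has two properties:

  1. It is never reset, so it is immune to status-bar churn (this is what you want);
  2. For the same reason, it cannot tell "the status bar redrawing itself while genuinely idle" apart from "continuous output while still working, with the prompt/status bar visible at the same time". This path also calls markIdle() directly, bypassing the SPINNER_GUARD_MS (3s) guard (and spinners are no longer tracked once readySeen is set).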

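Concretely, for the affected CLIs (those with a readyPattern and no completionPattern):

  • codex has readyPattern = /›|\d+% left/ and traex has /[›❯]|\d+% left/, where \d+% left is the context-window status bar, which also stays on screen while the agent is working. So in the turn following reset(), readySeen becomes true early in the work → once the grace timer is armed, idle is declared 500ms later regardless of any subsequent output.
  • The old logic is safe in this scenario: after readySeen it uses the 2s resetting quiescence window, which continuous output keeps pushing back, so idle is not falsely declared while work is in progress. The new logic turns that into "idle is guaranteed 500ms after readySeen".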
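A quick way to see the concern is to run the two patterns against a frame that contains only the status bar. The regexes are the ones quoted above; the screen fragments are made-up examples, not captured output.

```typescript
// Patterns as quoted in this review; the frames below are hypothetical.
const codexReady = /›|\d+% left/;
const traexReady = /[›❯]|\d+% left/;

const frames = {
  idlePrompt: "› ",
  workingWithStatusBar: "Applying patch to src/app.ts   87% left",
  workingNoStatusBar: "Applying patch to src/app.ts",
};

const codexMatches = {
  idlePrompt: codexReady.test(frames.idlePrompt),
  workingWithStatusBar: codexReady.test(frames.workingWithStatusBar),
  workingNoStatusBar: codexReady.test(frames.workingNoStatusBar),
};

const traexMatches = {
  idlePrompt: traexReady.test(frames.idlePrompt),
  workingWithStatusBar: traexReady.test(frames.workingWithStatusBar),
  workingNoStatusBar: traexReady.test(frames.workingNoStatusBar),
};

console.log({ codexMatches, traexMatches });
```

Both patterns match the mid-work frame purely because of the `\d+% left` alternative, so a match alone does not prove that the prompt is waiting for input.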

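Why I care about codex-app in particular

onIdle → markPromptReady() does two things: (1) it triggers the bridge drain that emits the final output, and (2) flushPending() types the next queued message into the CLI.

  • codex/traex/coco/hermes/mtr have a transcript-log backstop (assistant_final → fireIdle is the authoritative idle signal), so even if the screen path calls markIdle early, output emission is held back by the log backstop and no half-finished output is sent. That part is covered.
  • codex-app is not in the bridge list in worker.ts and relies purely on screen watching. As soon as it declares idle early, it sends an unfinished screen snapshot as the final result. In other words, the CLI that benefits most concretely from this speedup is codex-app (no log backstop, screen watching is its main mechanism), but the risk also lands squarely on it (no safety net).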

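A few questions for you

  1. For the case where you observed 13s, which CLI and which scenario was it? Was it at startup / the first message, or a later turn in an ongoing conversation? (My understanding is that the pain is mainly at startup, where nothing is being worked on and declaring idle is safe; the danger is in mid-conversation turns.)
  2. For codex-app (no log backstop), have you measured whether its prompt is also displayed while the agent is working? If it is, this version would send out half-finished output about 500ms into the work.
  3. With the always-visible \d+% left status bar in codex/traex, have you ever seen a turn declared idle right after it started, with the next message typed in early?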

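A possibly safer direction (for discussion)

If the pain really is mainly at startup, could the grace path take the fast route only when the recent screen has actually gone quiet (for example, by keeping a lightweight check on the most recent real output/spinner), and defer as usual when there is continuous output mid-work? That keeps the startup speedup and avoids mid-turn false idles. Or, more fundamentally: hook codex-app up to codex's rollout-log backstop as well, so that the speed of screen watching no longer matters.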
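One possible shape for the first option, as a rough sketch rather than a proposal for the actual patch: track the last "real" output separately from status-bar churn, and let the 500ms fast path fire only when real output has been quiet for the whole grace window. Everything here is an assumption, including the class, the timestamp-driven `isIdle()` polling in place of timers, and especially the `isChurn` classifier, which is the hard part and is stubbed with a naive pattern.

```typescript
// Hypothetical sketch of a guarded grace path; not the existing IdleDetector API.
const READY_GRACE_MS = 500;
const QUIESCENCE_MS = 2000;

class GuardedGraceSketch {
  private readyAt: number | null = null;
  private lastRealOutputAt: number | null = null;
  private lastOutputAt: number | null = null;

  constructor(
    private readonly readyPattern: RegExp,
    // Assumed classifier: true when a chunk is only a status-bar redraw.
    private readonly isChurn: (chunk: string) => boolean,
  ) {}

  feed(chunk: string, now: number): void {
    this.lastOutputAt = now;
    if (!this.isChurn(chunk)) this.lastRealOutputAt = now;
    if (this.readyAt === null && this.readyPattern.test(chunk)) this.readyAt = now;
  }

  isIdle(now: number): boolean {
    if (this.readyAt === null || this.lastOutputAt === null) return false;
    // Fast path: grace elapsed AND real output has been quiet for the grace window.
    const graceElapsed = now - this.readyAt >= READY_GRACE_MS;
    const realQuiet =
      this.lastRealOutputAt !== null && now - this.lastRealOutputAt >= READY_GRACE_MS;
    if (graceElapsed && realQuiet) return true;
    // Slow path: the original resetting quiescence over all output.
    return now - this.lastOutputAt >= QUIESCENCE_MS;
  }
}

// Naive stand-in classifier: a chunk that is nothing but the status bar.
const statusBarOnly = (chunk: string): boolean => /^\s*\d+% left\s*$/.test(chunk);

// Startup: banner, prompt at t=100, then status-bar churn every 1ms.
const startup = new GuardedGraceSketch(/›|\d+% left/, statusBarOnly);
startup.feed("Welcome banner", 0);
startup.feed("› ", 100);
for (let t = 101; t <= 600; t++) startup.feed("87% left", t);
const startupIdleAt599 = startup.isIdle(599);
const startupIdleAt600 = startup.isIdle(600);

// Mid-turn: status bar visible from t=0, real output keeps arriving.
const midTurn = new GuardedGraceSketch(/›|\d+% left/, statusBarOnly);
midTurn.feed("87% left", 0);
for (const t of [300, 600, 900]) midTurn.feed("editing src/app.ts", t);
const midTurnIdleAt900 = midTurn.isIdle(900);
const midTurnIdleAt1400 = midTurn.isIdle(1400);

console.log({ startupIdleAt599, startupIdleAt600, midTurnIdleAt900, midTurnIdleAt1400 });
```

In this model the startup case still resolves 500ms after the prompt, while the mid-turn case stays busy as long as real output keeps arriving. It would still misfire if the agent pauses real output for more than 500ms mid-turn, which is one reason the log-backstop option may be the more robust one.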

I'd like to hear about your actual results and your thinking first, and then decide together how to merge 🙏
