Description
PR #3860 added timeout arms to PlanVerifier::verify, verify_plan, and replan/replan_from_plan. The timeout arms correctly increment self.consecutive_failures (fail-open policy), but they do not include the >= 3 consecutive failures escalation check that the Ok(Err(e)) arms have.
This means a misconfigured or consistently overloaded verify_provider that always times out will silently fail open forever: an operator will see repeated warn! entries but no error! escalation. The consecutive-failure counter exists precisely to surface this condition.
Reproduction Steps
- Configure verify_provider to a slow provider where every call exceeds verifier_timeout_secs.
- Execute a plan with 3+ tasks.
- Observe logs: only warn! for each task, never error! despite 3+ consecutive failures.
Expected Behavior
After 3+ consecutive timeout-or-error failures the verifier should emit an error! log (same as the existing Ok(Err(e)) path) advising the operator to check verify_provider configuration.
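One way to get this behavior is to route both failure arms through a single helper so the threshold check cannot be skipped by one of them. The following is a minimal, std-only sketch of that idea, not the actual verifier.rs code: the types Level, CallOutcome, and Verifier, the helper record_failure, and the constant ESCALATION_THRESHOLD are hypothetical stand-ins, the returned Level stands in for the warn!/error! call the real arm would make, and resetting the counter on success is an assumption.

```rust
/// Hypothetical stand-in for the log level the real arm would emit
/// (warn! vs error!).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Warn,
    Error,
}

/// Hypothetical summary of one verifier call:
/// Ok(Ok(_)), Ok(Err(e)), or a timeout.
enum CallOutcome {
    Ok,
    LlmError,
    Timeout,
}

/// Threshold taken from the issue text (>= 3 consecutive failures).
const ESCALATION_THRESHOLD: u32 = 3;

/// Hypothetical, reduced model of PlanVerifier: only the counter is kept.
struct Verifier {
    consecutive_failures: u32,
}

impl Verifier {
    /// Shared by the error arm and the timeout arm, so both escalate.
    fn record_failure(&mut self) -> Level {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= ESCALATION_THRESHOLD {
            // Real code would log at error! and point at verify_provider.
            Level::Error
        } else {
            Level::Warn
        }
    }

    /// Returns the level that would be logged, or None on success.
    fn handle(&mut self, outcome: CallOutcome) -> Option<Level> {
        match outcome {
            CallOutcome::Ok => {
                // Assumption: a successful call resets the streak.
                self.consecutive_failures = 0;
                None
            }
            // Timeouts and LLM errors are treated identically.
            CallOutcome::LlmError | CallOutcome::Timeout => Some(self.record_failure()),
        }
    }
}
```

With this shape, three timeouts in a row yield Warn, Warn, Error, which matches what the Ok(Err(e)) path is described as doing today. The fail-open result itself is unchanged; only the log severity differs.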
Actual Behavior
Timeout arms increment consecutive_failures but never branch on the >= 3 threshold; only LLM errors trigger the escalation.
Affected Code
crates/zeph-orchestration/src/verifier.rs lines ~181–188 (verify) and ~319–326 (verify_plan).
Environment