feat(engine): add type: wait step for in-process pauses#224

Open
jrob5756 wants to merge 3 commits into main from feat/218-wait-step-type

@jrob5756
Collaborator

Closes #218.

Summary

Adds a new type: wait step that pauses workflow execution for a parsed duration via asyncio.sleep. Pure in-process, cross-platform, no shell sleep dependency. Composes with routing loop-backs for polling, rate-limit cooldowns, demos, and external-system catch-up.

agents:
  - name: cooldown
    type: wait
    duration: 60s              # int/float seconds, "Ns/Nm/Nh/Nms", or Jinja2
    reason: "Cooling down"     # optional, shown in dashboard
    routes:
      - to: next_step

Behavior

  • Duration: accepts plain int/float seconds, suffixed strings (ms/s/m/h), and Jinja2 templates. workflow.input.* available without explicit input: declaration (wait joins script/workflow in _LOCAL_RENDER_AGENT_TYPES).
  • Bounds: 0 < duration ≤ 24h. Templated values defer literal validation to runtime.
  • Booleans rejected pre-coercion via a mode="before" field validator — Pydantic v2 would otherwise silently accept True as int 1.
  • Esc/Ctrl+G cancellation: sleep races against the engine's interrupt_event. When the interrupt wins, the wait returns early with interrupted=True without clearing the event, so the engine's between-step _check_interrupt triggers the normal interrupt menu.
  • Workflow timeout: limits.timeout_seconds cancels in-flight waits via LimitEnforcer.wait_for_with_timeout (same wrapping used for scripts).
  • Iteration counting: wait steps count toward limits.max_iterations (step counter). max_agent_iterations (per-LLM-agent tool counter) is not applicable.
  • Public output contract (per the issue): strictly {"waited_seconds": float} in workflow context. Extra metadata (requested_seconds, reason, interrupted) lives only in event payloads.
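
The duration rules above (plain numbers, ms/s/m/h suffixes, boolean rejection, and the 0 < duration ≤ 24h bound) can be sketched roughly as follows. This is an illustration only, not the code in this PR: `parse_duration`, `MAX_SECONDS`, and the regex are hypothetical names, and accepting a bare numeric string such as "60" is an assumption.

```python
import re

MAX_SECONDS = 24 * 60 * 60  # 24h upper bound stated in the PR description
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)?\s*$")


def parse_duration(value):
    """Return a duration in seconds as a float, or raise ValueError.

    Hypothetical helper for illustration; not the PR's parser.
    """
    # bool is a subclass of int, so it must be rejected before numeric handling.
    if isinstance(value, bool):
        raise ValueError("duration must not be a boolean")
    if isinstance(value, (int, float)):
        seconds = float(value)
    elif isinstance(value, str):
        match = _PATTERN.match(value)
        if match is None:
            raise ValueError(f"unrecognized duration: {value!r}")
        number, unit = match.groups()
        # Assumption: a string with no suffix is treated as seconds.
        seconds = float(number) * _UNIT_SECONDS[unit or "s"]
    else:
        raise ValueError(f"unsupported duration type: {type(value).__name__}")
    # Chained comparison also rejects NaN and infinity.
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"duration must be in (0, 24h], got {seconds}s")
    return seconds
```

Under these assumptions, `parse_duration("2m")` returns `120.0`, while `parse_duration(True)` and `parse_duration("25h")` both raise `ValueError`.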

Events

  • Existing generic agent_started fires before type dispatch with agent_type: "wait", so dashboards keyed on agent lifecycle work unchanged.
  • Type-specific wait_started/wait_completed/wait_failed carry the additional fields (mirrors the existing script convention exactly).
  • Resume replay extends _synth_agent_or_script with a wait branch so checkpointed wait steps round-trip cleanly.
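
One way to picture the split between the strict public output and the richer event payload is a small result object with two projections. The sketch below is hypothetical: `WaitResult`, `to_output`, and `to_event_payload` are names invented here, and the exact payload layout is an assumption based on the field names listed in this description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WaitResult:
    """Hypothetical container for everything a wait step knows when it ends."""

    requested_seconds: float
    waited_seconds: float
    interrupted: bool = False
    reason: Optional[str] = None

    def to_output(self) -> dict:
        # Public contract: only waited_seconds reaches the workflow context.
        return {"waited_seconds": self.waited_seconds}

    def to_event_payload(self) -> dict:
        # Assumed shape: extra metadata travels only with type-specific events.
        return {
            "waited_seconds": self.waited_seconds,
            "requested_seconds": self.requested_seconds,
            "reason": self.reason,
            "interrupted": self.interrupted,
        }
```

Keeping the two projections separate means routes and downstream steps cannot start depending on fields such as `interrupted` that the issue deliberately left out of the output.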

Validation

  • Rejects wait inside parallel groups and as for_each inline agents.
  • Forbids 22 incompatible fields on wait; forbids duration/reason on all non-wait types.

Dashboard

  • New 'wait' NodeType with WaitNode (Clock icon) + WaitDetail component.
  • graph-layout.ts maps wait → waitNode; node is registered in WorkflowGraph nodeTypes.
  • Store carries duration_seconds, waited_seconds, requested_seconds, reason, interrupted on NodeData.
  • Activity log renders Waiting Xs — reason and Wait completed (Xs) — interrupted entries.

Example

examples/wait-step.yaml — polling pattern with templated poll interval and route loop-back.

Tests

  • tests/test_engine/test_duration.py — pure parser (26 cases).
  • tests/test_config/test_wait_schema.py — schema validation (41 cases: valid forms, required duration, boolean rejection, bounds, every forbidden field, parallel/for_each rejection, wait-only fields on other types).
  • tests/test_executor/test_wait.py — sleep accuracy, interrupt cancellation (verifies the event stays set for engine consumption), templated durations, runtime validation (11 cases).
  • tests/test_engine/test_wait_workflow.py — end-to-end (6 cases): linear wait, strict output contract, workflow-timeout cancellation, event-emission shape, templated duration from workflow input, interrupt-event cancels in-flight sleep.

Validation results

  • make lint — clean.
  • make typecheck — clean (modulo a pre-existing dialog_evaluator.py warning on origin/main).
  • uv run pytest tests/test_config tests/test_engine tests/test_executor tests/test_cli tests/test_web tests/test_integration tests/test_gates: 2124 passed, 16 skipped (84 of those are new wait tests).
  • make validate-examples — all examples validate including the new wait-step.yaml.
  • Frontend npm run build — clean (TS compile + Vite build).
  • Manual smoke: ran the example with duration: 250ms end-to-end; verbose output shows clean Wait: 0.25s — smoke test then Wait done: pause after 0.25s with {"slept": 0.251…}.

Acceptance criteria — all met

  • type: wait accepted with templated duration and optional reason
  • Duration parser accepts ms/s/m/h suffixes and plain numbers
  • Schema rejects durations > 24h and ≤ 0 (and boolean)
  • Pauses via asyncio.sleep
  • Esc/Ctrl+G cancels in-progress wait immediately
  • Workflow timeout cancels in-progress wait
  • Output is {"waited_seconds": float} available to routes / downstream steps
  • Example under examples/ (polling with route loop-back)
  • Tests for fixed sleep, templated sleep, interrupt cancellation, timeout cancellation, schema rejection of invalid durations
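
The timeout criterion relies on the wait being an ordinary awaitable that a wrapper can cancel. A minimal sketch of that idea using `asyncio.wait_for` is shown below; `run_with_timeout`, `WorkflowTimeout`, and `wait_step` are hypothetical stand-ins, and the real `LimitEnforcer.wait_for_with_timeout` may be implemented differently.

```python
import asyncio


class WorkflowTimeout(Exception):
    """Hypothetical error raised when the workflow-level timeout expires."""


async def run_with_timeout(coro, timeout_seconds):
    # asyncio.wait_for cancels the wrapped awaitable when the timeout elapses.
    try:
        return await asyncio.wait_for(coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        raise WorkflowTimeout(f"workflow exceeded {timeout_seconds}s") from None


async def wait_step(seconds, state):
    # Stand-in for a wait step; records whether it finished or was cancelled.
    try:
        await asyncio.sleep(seconds)
        state["finished"] = True
    except asyncio.CancelledError:
        state["cancelled"] = True
        raise
    return {"waited_seconds": float(seconds)}
```

Running `run_with_timeout(wait_step(5, state), 0.05)` raises `WorkflowTimeout` after roughly 50 ms, and `state` records the cancellation rather than a completed sleep.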

jrob5756 and others added 3 commits May 21, 2026 15:02
A new step type that pauses workflow execution for a parsed duration
via asyncio.sleep. Cross-platform, no shell dependency. Composes with
routing loop-backs for polling, rate-limit cooldowns, and demos.

YAML:

  agents:
    - name: cooldown
      type: wait
      duration: 60s             # int/float seconds, "Ns/Nm/Nh/Nms", or Jinja2
      reason: "Cooling down"    # optional, shown in dashboard
      routes:
        - to: next_step

Behavior:
- Duration accepts plain numbers, suffixed strings (ms/s/m/h), and
  Jinja2 templates. workflow.input.* is available without an explicit
  input: declaration (wait joins script/workflow in
  _LOCAL_RENDER_AGENT_TYPES).
- Schema enforces 0 < duration <= 24h and rejects boolean durations
  pre-coercion via a mode="before" field validator (Pydantic v2 would
  otherwise accept True as int 1).
- Sleep races against the engine's interrupt_event so Esc/Ctrl+G
  cancels in-flight waits immediately. The event is NOT cleared by
  the executor — the engine's between-step _check_interrupt consumes
  it so the user still gets the normal interrupt menu.
- Workflow-level limits.timeout_seconds cancels long waits via
  LimitEnforcer.wait_for_with_timeout (same wrapping used by scripts).
- Wait counts toward limits.max_iterations (step counter) but is not
  subject to max_agent_iterations (per-LLM-agent tool counter — N/A).
- Public output contract is strictly {"waited_seconds": float} per
  the issue spec. Extra metadata (requested_seconds, reason,
  interrupted) lives in event payloads only.
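
The interrupt behavior described above can be sketched as a race between two tasks, assuming the interrupt signal is an `asyncio.Event`. This is an illustration under that assumption, not the executor's code; `sleep_or_interrupt` is a hypothetical name.

```python
import asyncio
import contextlib
import time


async def sleep_or_interrupt(seconds, interrupt_event):
    """Sleep up to `seconds`; return early if interrupt_event is set.

    Returns (waited_seconds, interrupted). The event is deliberately left
    set so a later check by the caller can still observe it.
    """
    started = time.monotonic()
    sleep_task = asyncio.create_task(asyncio.sleep(seconds))
    interrupt_task = asyncio.create_task(interrupt_event.wait())
    try:
        done, _pending = await asyncio.wait(
            {sleep_task, interrupt_task},
            return_when=asyncio.FIRST_COMPLETED,
        )
    finally:
        # Cancel whichever task lost the race and drain it.
        for task in (sleep_task, interrupt_task):
            if not task.done():
                task.cancel()
                with contextlib.suppress(asyncio.CancelledError):
                    await task
    interrupted = sleep_task not in done
    return time.monotonic() - started, interrupted
```

Because the function never calls `interrupt_event.clear()`, a check that runs after the step returns still sees the event set, which is the property the interrupt menu depends on.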

Events:
- The existing generic agent_started fires before type dispatch with
  agent_type: "wait", so dashboards keyed on agent lifecycle pick it
  up. Type-specific wait_started / wait_completed / wait_failed
  carry the additional fields (mirrors the script convention).
- Resume replay extends _synth_agent_or_script with a wait branch
  so checkpointed wait steps round-trip cleanly.

Validation:
- Rejects wait inside parallel groups and as for_each inline agents
  (consistent with script).
- Forbids 22 incompatible fields on wait (prompt, model, command,
  args, env, working_dir, tools, options, workflow, retry, dialog,
  reasoning, timeout, timeout_seconds, max_session_seconds,
  max_agent_iterations, max_depth, input_mapping, system_prompt,
  provider, output).
- Forbids duration/reason on all non-wait types.

Dashboard:
- New 'wait' NodeType with WaitNode (Clock icon) and WaitDetail
  component. Graph layout maps wait -> waitNode. Workflow store
  carries duration_seconds, waited_seconds, requested_seconds,
  reason, and interrupted on NodeData. Activity log renders
  "Waiting Xs — reason" and "Wait completed (Xs) — interrupted".

Documentation:
- New examples/wait-step.yaml demonstrates a polling pattern with
  templated poll interval and route loop-back.

Tests:
- tests/test_engine/test_duration.py: pure parser (26 cases).
- tests/test_config/test_wait_schema.py: schema validation (41 cases:
  valid forms, required duration, bool rejection, bounds, forbidden
  fields, parallel/for_each rejection, wait-only fields on other
  types).
- tests/test_executor/test_wait.py: sleep accuracy, interrupt
  cancellation (verifies event stays set for engine consumption),
  templated durations, runtime validation (11 cases).
- tests/test_engine/test_wait_workflow.py: end-to-end (6 cases) —
  linear wait, strict output contract, workflow timeout cancellation,
  event emission shape, templated duration from workflow input,
  interrupt-event cancels in-flight sleep.

Closes #218

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follow-up to feat(engine): add type: wait step (#218) addressing
findings from the pr-review-toolkit pass.

Code fixes:
- schema.py: drop the duplicate isinstance(value, bool) guard inside
  _validate_wait_duration. The reject_bool_duration field validator
  (mode="before") already catches booleans pre-coercion in normal
  construction; the duplicate was unreachable. Replace with a docstring
  note pointing at the field validator. (Flagged by dead-code-finder,
  type-design-analyzer, and comment-analyzer.)
- engine/workflow.py wait dispatch preview block:
  * Replace the redundant `except (ValueError, Exception)` tuple with a
    bare `except Exception` (ValueError is a subclass).
  * Add a `logger.debug` line on preview render failure so the path is
    no longer silent.
  * For preview_reason, fall back to None instead of the raw template
    string — the dashboard previously displayed literal Jinja markup
    like `{{ x }}` until wait_failed fired; None is the correct "absent"
    signal. (Flagged by silent-failure-hunter and code-reviewer.)
- executor/wait.py _sleep_with_interrupt cleanup: narrow
  `contextlib.suppress(CancelledError, Exception)` to `suppress(CancelledError)`
  to match the success-path cleanup. Genuine errors from future, non-
  trivial awaitables should not be silently swallowed during cleanup.
  (Flagged by silent-failure-hunter.)
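
The effect of narrowing the suppress call can be seen in a small standalone experiment. The names below (`failing_cleanup`, `cancel_and_drain`, `demo`) are hypothetical and only illustrate the general asyncio behavior, not the code in executor/wait.py.

```python
import asyncio
import contextlib


async def failing_cleanup():
    # Stand-in for a non-trivial awaitable whose cancellation handling fails.
    try:
        await asyncio.sleep(60)
    except asyncio.CancelledError:
        raise RuntimeError("cleanup failed")


async def cancel_and_drain(task, *suppressed):
    # Cancel the task, then await it while suppressing the given exceptions.
    task.cancel()
    with contextlib.suppress(*suppressed):
        await task


async def demo():
    outcomes = {}
    for label, suppressed in (
        ("broad", (asyncio.CancelledError, Exception)),
        ("narrow", (asyncio.CancelledError,)),
    ):
        task = asyncio.create_task(failing_cleanup())
        await asyncio.sleep(0)  # let the task reach its sleep
        try:
            await cancel_and_drain(task, *suppressed)
            outcomes[label] = "swallowed"
        except RuntimeError:
            outcomes[label] = "surfaced"
    return outcomes
```

With the broad form the `RuntimeError` disappears silently; with the narrow form it propagates to the caller, so a real failure during cleanup is no longer hidden.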

Test additions / improvements:
- test_web/test_server.py: new test_emits_wait_events_for_wait_type
  pins the _synth_agent_or_script wait branch contract (event names,
  synthetic flag, duration_seconds/reason/interrupted fields). This
  was the largest coverage gap — resume replay had no test.
- test_engine/test_wait_workflow.py: new
  test_emits_wait_failed_on_runtime_validation verifies a wait_failed
  event is emitted with error_type="ValidationError" and the expected
  message before the exception unwinds. Closes the gap where a
  refactor dropping the try/except in the dispatch branch would leave
  the dashboard with a hanging "started but never completed" node.
- test_executor/test_wait.py + test_engine/test_wait_workflow.py:
  loosen wall-clock lower bounds (>= 0.09 / >= 0.04 / >= 0.01 / etc.)
  that risked CI flakes on loaded runners. The executor's contract is
  "sleep at least roughly this long unless interrupted", and the
  interrupted=False check is the real invariant — the precise
  elapsed value is not what these tests are pinning.

All 1463 tests in tests/test_config + tests/test_engine + tests/test_executor
+ tests/test_web pass. `make lint` is clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ences

Follow-up to PR #224 — propagate the new `type: wait` step into all
user-facing documentation and the bundled conductor skill so authors
and AI assistants discover it.

User docs:
- docs/workflow-syntax.md: add full Wait Steps section after Script
  Steps (duration formats, output contract, polling loop-back pattern,
  cancellation semantics, restrictions, example link). Update the
  type literal comment to include `wait`. Note that wait steps count
  toward `max_iterations`. Update the dialog-mode forbidden-type list.
- docs/configuration.md: add `wait` to the list of agent types where
  `reasoning.effort` is rejected (these types don't call a model).
- README.md: add wait-step.yaml to the examples table.
- AGENTS.md: document `executor/wait.py` and the new top-level
  `conductor/duration.py` helper module. Update the workflow execution
  flow steps to include wait.
- examples/README.md: add a "Step Types" section documenting both
  script-step.yaml (previously undocumented) and the new
  wait-step.yaml.

Skill references (plugins/conductor/skills/conductor/):
- SKILL.md: add wait entry to the Key Concepts table.
- references/yaml-schema.md: extend the type-literal enumeration; add
  a full Wait Agent Schema section (duration format, output, full
  forbidden-field list, cancellation); update parallel-group and
  for-each validation rules to mention wait; extend the
  reasoning/retry/timeout_seconds forbidden-type comments.
- references/authoring.md: extend the type-literal comment; add a
  full Wait Steps (`type: wait`) section (duration format, output,
  polling loop-back pattern, cancellation, restrictions); extend the
  reasoning/retry/timeout_seconds/dialog forbidden-type lists.

CHANGELOG.md:
- Add an Unreleased entry summarizing the wait step feature, with
  PR/issue references.

Verification:
- `make validate-examples` passes (including wait-step.yaml).
- All cross-references to step-type restrictions are now consistent
  across user docs and skill references.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Add a type: wait step that pauses workflow execution for a duration