feat(engine): add type: wait step for in-process pauses #224
Open
jrob5756 wants to merge 3 commits into
A new step type that pauses workflow execution for a parsed duration
via asyncio.sleep. Cross-platform, no shell dependency. Composes with
routing loop-backs for polling, rate-limit cooldowns, and demos.
YAML:
  agents:
    - name: cooldown
      type: wait
      duration: 60s            # int/float seconds, "Ns/Nm/Nh/Nms", or Jinja2
      reason: "Cooling down"   # optional, shown in dashboard
      routes:
        - to: next_step
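The literal duration forms named in the YAML comment can be illustrated with a small parser. This is a minimal sketch, not the PR's actual parser: `parse_duration`, `MAX_SECONDS`, and the regex are hypothetical names, treating a bare numeric string as seconds is an assumption, and Jinja2 templates are assumed to be rendered before this function is called.

```python
import re

MAX_SECONDS = 24 * 60 * 60  # 24h upper bound, as stated in the schema rules

# Multipliers that convert each supported suffix into seconds.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_DURATION_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)?\s*$")


def parse_duration(value: object) -> float:
    """Hypothetical helper: convert a literal duration into seconds."""
    # bool is a subclass of int, so reject it before the numeric branch.
    if isinstance(value, bool):
        raise ValueError("duration must not be a boolean")
    if isinstance(value, (int, float)):
        seconds = float(value)
    elif isinstance(value, str):
        match = _DURATION_RE.match(value)
        if match is None:
            raise ValueError(f"unrecognized duration: {value!r}")
        number, unit = match.groups()
        # Assumption: a string without a suffix is read as seconds.
        seconds = float(number) * _UNIT_SECONDS[unit or "s"]
    else:
        raise ValueError(f"unsupported duration type: {type(value).__name__}")
    # Enforce 0 < duration <= 24h; NaN and infinity also fail this check.
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError("duration must satisfy 0 < duration <= 24h")
    return seconds
```

Checking for `bool` first matters because `isinstance(True, int)` is true in Python, so without that guard `True` would be read as one second.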
Behavior:
- Duration accepts plain numbers, suffixed strings (ms/s/m/h), and
Jinja2 templates. workflow.input.* is available without an explicit
input: declaration (wait joins script/workflow in
_LOCAL_RENDER_AGENT_TYPES).
- Schema enforces 0 < duration <= 24h and rejects boolean durations
pre-coercion via a mode="before" field validator (Pydantic v2 would
otherwise accept True as int 1).
- Sleep races against the engine's interrupt_event so Esc/Ctrl+G
cancels in-flight waits immediately. The event is NOT cleared by
the executor — the engine's between-step _check_interrupt consumes
it so the user still gets the normal interrupt menu.
- Workflow-level limits.timeout_seconds cancels long waits via
LimitEnforcer.wait_for_with_timeout (same wrapping used by scripts).
- Wait counts toward limits.max_iterations (step counter) but is not
subject to max_agent_iterations (per-LLM-agent tool counter — N/A).
- Public output contract is strictly {"waited_seconds": float} per
the issue spec. Extra metadata (requested_seconds, reason,
interrupted) lives in event payloads only.
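The interrupt race and the output/event split described above can be sketched as follows. This is an illustration under assumptions, not the code in executor/wait.py: `sleep_with_interrupt` and `demo` are hypothetical names, and the workflow-level timeout wrapper is left out.

```python
import asyncio
import contextlib
import time


async def sleep_with_interrupt(seconds: float, interrupt_event: asyncio.Event):
    """Sleep for `seconds`, returning early if `interrupt_event` is set.

    Returns (public_output, event_payload). The event is never cleared here,
    so a later between-step check can still observe it.
    """
    start = time.monotonic()
    sleeper = asyncio.create_task(asyncio.sleep(seconds))
    watcher = asyncio.create_task(interrupt_event.wait())
    try:
        done, _pending = await asyncio.wait(
            {sleeper, watcher}, return_when=asyncio.FIRST_COMPLETED
        )
    finally:
        # Cancel whichever task lost the race; suppress only CancelledError
        # so genuine errors are not swallowed during cleanup.
        for task in (sleeper, watcher):
            if not task.done():
                task.cancel()
                with contextlib.suppress(asyncio.CancelledError):
                    await task
    waited = time.monotonic() - start
    interrupted = sleeper not in done
    output = {"waited_seconds": waited}  # strict public contract
    event_payload = {
        "requested_seconds": seconds,
        "waited_seconds": waited,
        "interrupted": interrupted,
    }
    return output, event_payload


async def demo():
    """Set the event after 50 ms while a 5 s wait is in flight."""
    event = asyncio.Event()
    asyncio.get_running_loop().call_later(0.05, event.set)
    output, payload = await sleep_with_interrupt(5.0, event)
    return output, payload, event.is_set()
```

Running `asyncio.run(demo())` should return well before the requested five seconds, with `interrupted` set in the payload and the event still set for the caller to consume.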
Events:
- The existing generic agent_started fires before type dispatch with
agent_type: "wait", so dashboards keyed on agent lifecycle pick it
up. Type-specific wait_started / wait_completed / wait_failed
carry the additional fields (mirrors the script convention).
- Resume replay extends _synth_agent_or_script with a wait branch
so checkpointed wait steps round-trip cleanly.
Validation:
- Rejects wait inside parallel groups and as for_each inline agents
(consistent with script).
- Forbids 22 incompatible fields on wait (prompt, model, command,
args, env, working_dir, tools, options, workflow, retry, dialog,
reasoning, timeout, timeout_seconds, max_session_seconds,
max_agent_iterations, max_depth, input_mapping, system_prompt,
provider, output).
- Forbids duration/reason on all non-wait types.
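The field restrictions above can be pictured as a plain set check. This is a simplified sketch over raw dictionaries, whereas the PR does this in the Pydantic schema: `validate_agent_fields` and both constants are hypothetical names, the field names are copied from the list above, and the default type of "agent" is an assumption.

```python
# Field names copied from the validation list above.
WAIT_FORBIDDEN_FIELDS = frozenset({
    "prompt", "model", "command", "args", "env", "working_dir", "tools",
    "options", "workflow", "retry", "dialog", "reasoning", "timeout",
    "timeout_seconds", "max_session_seconds", "max_agent_iterations",
    "max_depth", "input_mapping", "system_prompt", "provider", "output",
})
WAIT_ONLY_FIELDS = frozenset({"duration", "reason"})


def validate_agent_fields(agent: dict) -> list[str]:
    """Hypothetical check: return a list of field-level validation errors."""
    errors: list[str] = []
    # Assumption: steps without an explicit type default to "agent".
    agent_type = agent.get("type", "agent")
    # Treat keys explicitly set to None as absent.
    present = {key for key, value in agent.items() if value is not None}
    if agent_type == "wait":
        if "duration" not in present:
            errors.append("wait step requires 'duration'")
        for field in sorted(present & WAIT_FORBIDDEN_FIELDS):
            errors.append(f"'{field}' is not allowed on wait steps")
    else:
        for field in sorted(present & WAIT_ONLY_FIELDS):
            errors.append(f"'{field}' is only allowed on wait steps")
    return errors
```

For example, the `cooldown` step from the YAML above yields no errors, while adding `prompt` to it or putting `duration` on a non-wait step each produces one.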
Dashboard:
- New 'wait' NodeType with WaitNode (Clock icon) and WaitDetail
component. Graph layout maps wait -> waitNode. Workflow store
carries duration_seconds, waited_seconds, requested_seconds,
reason, and interrupted on NodeData. Activity log renders
"Waiting Xs — reason" and "Wait completed (Xs) — interrupted".
Documentation:
- New examples/wait-step.yaml demonstrates a polling pattern with
templated poll interval and route loop-back.
Tests:
- tests/test_engine/test_duration.py: pure parser (26 cases).
- tests/test_config/test_wait_schema.py: schema validation (41 cases:
valid forms, required duration, bool rejection, bounds, forbidden
fields, parallel/for_each rejection, wait-only fields on other
types).
- tests/test_executor/test_wait.py: sleep accuracy, interrupt
cancellation (verifies event stays set for engine consumption),
templated durations, runtime validation (11 cases).
- tests/test_engine/test_wait_workflow.py: end-to-end (6 cases) —
linear wait, strict output contract, workflow timeout cancellation,
event emission shape, templated duration from workflow input,
interrupt-event cancels in-flight sleep.
Closes #218
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follow-up to feat(engine): add type: wait step (#218) addressing
findings from the pr-review-toolkit pass.
Code fixes:
- schema.py: drop the duplicate isinstance(value, bool) guard inside
_validate_wait_duration. The reject_bool_duration field validator
(mode="before") already catches booleans pre-coercion in normal
construction; the duplicate was unreachable. Replace with a docstring
note pointing at the field validator. (Flagged by dead-code-finder,
type-design-analyzer, and comment-analyzer.)
- engine/workflow.py wait dispatch preview block:
  * Replace the redundant `except (ValueError, Exception)` tuple with a
    bare `except Exception` (ValueError is a subclass).
  * Add a `logger.debug` line on preview render failure so the path is
    no longer silent.
  * For preview_reason, fall back to None instead of the raw template
    string. The dashboard previously displayed literal Jinja markup like
    `{{ x }}` until wait_failed fired; None is the correct "absent"
    signal. (Flagged by silent-failure-hunter and code-reviewer.)
- executor/wait.py _sleep_with_interrupt cleanup: narrow
`contextlib.suppress(CancelledError, Exception)` to
`suppress(CancelledError)` to match the success-path cleanup. Genuine
errors from future, non-trivial awaitables should not be silently
swallowed during cleanup. (Flagged by silent-failure-hunter.)
Test additions / improvements:
- test_web/test_server.py: new test_emits_wait_events_for_wait_type
pins the _synth_agent_or_script wait branch contract (event names,
synthetic flag, duration_seconds/reason/interrupted fields). This was
the largest coverage gap: resume replay had no test.
- test_engine/test_wait_workflow.py: new
test_emits_wait_failed_on_runtime_validation verifies a wait_failed
event is emitted with error_type="ValidationError" and the expected
message before the exception unwinds. Closes the gap where a refactor
dropping the try/except in the dispatch branch would leave the
dashboard with a hanging "started but never completed" node.
- test_executor/test_wait.py + test_engine/test_wait_workflow.py:
loosen wall-clock lower bounds (>= 0.09 / >= 0.04 / >= 0.01 / etc.)
that risked CI flakes on loaded runners. The executor's contract is
"sleep at least roughly this long unless interrupted", and the
interrupted=False check is the real invariant; the precise elapsed
value is not what these tests are pinning.
All 1463 tests in tests/test_config + tests/test_engine +
tests/test_executor + tests/test_web pass. `make lint` is clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ences
Follow-up to PR #224: propagate the new `type: wait` step into all
user-facing documentation and the bundled conductor skill so authors
and AI assistants discover it.
User docs:
- docs/workflow-syntax.md: add full Wait Steps section after Script
Steps (duration formats, output contract, polling loop-back pattern,
cancellation semantics, restrictions, example link). Update the type
literal comment to include `wait`. Note that wait steps count toward
`max_iterations`. Update the dialog-mode forbidden-type list.
- docs/configuration.md: add `wait` to the list of agent types where
`reasoning.effort` is rejected (don't call a model).
- README.md: add wait-step.yaml to the examples table.
- AGENTS.md: document `executor/wait.py` and the new top-level
`conductor/duration.py` helper module. Update the workflow execution
flow steps to include wait.
- examples/README.md: add a "Step Types" section documenting both
script-step.yaml (previously undocumented) and the new wait-step.yaml.
Skill references (plugins/conductor/skills/conductor/):
- SKILL.md: add wait entry to the Key Concepts table.
- references/yaml-schema.md: extend the type-literal enumeration; add a
full Wait Agent Schema section (duration format, output, full
forbidden-field list, cancellation); update parallel-group and
for-each validation rules to mention wait; extend the
reasoning/retry/timeout_seconds forbidden-type comments.
- references/authoring.md: extend the type-literal comment; add a full
Wait Steps (`type: wait`) section (duration format, output, polling
loop-back pattern, cancellation, restrictions); extend the
reasoning/retry/timeout_seconds/dialog forbidden-type lists.
CHANGELOG.md:
- Add an Unreleased entry summarizing the wait step feature, with
PR/issue references.
Verification:
- `make validate-examples` passes (including wait-step.yaml).
- All cross-references to step-type restrictions are now consistent
across user docs and skill references.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes #218.
Summary
Adds a new `type: wait` step that pauses workflow execution for a parsed duration via `asyncio.sleep`. Pure in-process, cross-platform, no shell `sleep` dependency. Composes with routing loop-backs for polling, rate-limit cooldowns, demos, and external-system catch-up.
Behavior
- Duration accepts `int`/`float` seconds, suffixed strings (`ms`/`s`/`m`/`h`), and Jinja2 templates. `workflow.input.*` is available without an explicit `input:` declaration (wait joins `script`/`workflow` in `_LOCAL_RENDER_AGENT_TYPES`).
- Schema enforces `0 < duration ≤ 24h`. Templated values defer literal validation to runtime.
- Boolean durations are rejected pre-coercion via a `mode="before"` field validator; Pydantic v2 would otherwise silently accept `True` as `int 1`.
- Sleep races against the engine's `interrupt_event`. When the interrupt wins, the wait returns early with `interrupted=True` without clearing the event, so the engine's between-step `_check_interrupt` triggers the normal interrupt menu.
- Workflow-level `limits.timeout_seconds` cancels in-flight waits via `LimitEnforcer.wait_for_with_timeout` (same wrapping used for scripts).
- Wait counts toward `limits.max_iterations` (step counter). `max_agent_iterations` (per-LLM-agent tool counter) is not applicable.
- Public output contract is strictly `{"waited_seconds": float}` in workflow context. Extra metadata (`requested_seconds`, `reason`, `interrupted`) lives only in event payloads.
Events
- The existing generic `agent_started` fires before type dispatch with `agent_type: "wait"`, so dashboards keyed on agent lifecycle work unchanged.
- Type-specific `wait_started` / `wait_completed` / `wait_failed` carry the additional fields (mirrors the existing script convention exactly).
- Resume replay extends `_synth_agent_or_script` with a wait branch so checkpointed wait steps round-trip cleanly.
Validation
- Rejects wait inside parallel groups and as `for_each` inline agents.
- Forbids `duration`/`reason` on all non-wait types.
Dashboard
- New `'wait'` `NodeType` with `WaitNode` (Clock icon) + `WaitDetail` component.
- `graph-layout.ts` maps `wait` → `waitNode`; node is registered in `WorkflowGraph` `nodeTypes`.
- Workflow store carries `duration_seconds`, `waited_seconds`, `requested_seconds`, `reason`, `interrupted` on `NodeData`.
- Activity log renders `Waiting Xs — reason` and `Wait completed (Xs) — interrupted` entries.
Example
- `examples/wait-step.yaml`: polling pattern with templated poll interval and route loop-back.
Tests
- `tests/test_engine/test_duration.py`: pure parser (26 cases).
- `tests/test_config/test_wait_schema.py`: schema validation (41 cases: valid forms, required `duration`, boolean rejection, bounds, every forbidden field, parallel/`for_each` rejection, wait-only fields on other types).
- `tests/test_executor/test_wait.py`: sleep accuracy, interrupt cancellation (verifies the event stays set for engine consumption), templated durations, runtime validation (11 cases).
- `tests/test_engine/test_wait_workflow.py`: end-to-end (6 cases): linear wait, strict output contract, workflow-timeout cancellation, event-emission shape, templated duration from workflow input, interrupt-event cancels in-flight sleep.
Validation results
- `make lint`: clean.
- `make typecheck`: clean (modulo a pre-existing `dialog_evaluator.py` warning on `origin/main`).
- `uv run pytest tests/test_config tests/test_engine tests/test_executor tests/test_cli tests/test_web tests/test_integration tests/test_gates`: 2124 passed, 16 skipped (84 of those are new wait tests).
- `make validate-examples`: all examples validate, including the new `wait-step.yaml`.
- `npm run build`: clean (TS compile + Vite build).
- `duration: 250ms` end-to-end: verbose output shows clean `Wait: 0.25s — smoke test` then `Wait done: pause after 0.25s` with `{"slept": 0.251…}`.
Acceptance criteria (all met)
- `type: wait` accepted with templated `duration` and optional `reason`
- Duration accepts `ms`/`s`/`m`/`h` suffixes and plain numbers
- Pause runs in-process via `asyncio.sleep`
- `{"waited_seconds": float}` available to routes / downstream steps
- Example in `examples/` (polling with route loop-back)