Wolo/workflow hitl integration test #980

Draft
wolo-lab wants to merge 9 commits into v2 from wolo/workflow_hitl_integration_test

wolo-lab commented Jun 7, 2026


wolo-lab added 2 commits June 4, 2026 20:35
…tion

Workflow-engine support for human-in-the-loop, unified on a single
mechanism — history rehydration — matching adk-python (no persisted
run-state event, no PendingRequest field).

- scheduler: per-event back-pressure handshake (a non-partial
  function-response is persisted before the node's flow rebuilds the
  next model request, fixing a non-deterministic re-issue race); pause
  a node on accumulated Event.LongRunningToolIDs (RequestInput rides on
  them); stamp NodeInfo.Path = node name on static node events so
  rehydration can attribute interrupts (dynamic children fold into
  their static ancestor).
- persistence: ReconstructRunState ports adk-python's
  _reconstruct_node_states + _infer_node_state — per-node scan
  (interrupts, resolved user responses, schemas, output), status
  inference (WAITING / PENDING+ResumedInputs re-entry /
  COMPLETED+Output handoff), backward-edge predecessor input, and
  schema validation on the surviving (last-wins) response.
- resume: single path over the rehydrated state, gated on the current
  turn's responses for idempotency; already-run handoff successors are
  skipped (RunState.completed).
- state: NodeState.Interrupts + unexported interruptSchemas;
  RunState.completed; HasWaiting. No PendingRequest, no persisted
  run-state blob.
- workflowagent: detectResume uses ReconstructRunState and surfaces
  reconstruction (schema-validation) errors.

A node may raise multiple interrupts per activation. The workflow and
workflowagent suites pass with -race.
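
The status inference described in the persistence bullet can be illustrated with a minimal sketch. This is not the PR's ReconstructRunState: the `Status` type, the `nodeScan` struct, and `inferStatus` are hypothetical names, and the rule for choosing between re-entry and handoff is assumed to follow a RerunOnResume-style flag. Only the three outcomes (WAITING, PENDING with resumed inputs, COMPLETED with output) come from the commit message.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for the engine's node status.
type Status string

const (
	StatusWaiting   Status = "WAITING"
	StatusPending   Status = "PENDING"
	StatusCompleted Status = "COMPLETED"
)

// nodeScan is a hypothetical summary of one node's history scan.
// It is only meaningful for nodes that raised at least one interrupt.
type nodeScan struct {
	Interrupts    []string       // interrupt IDs raised by the node
	Responses     map[string]any // resolved user responses by interrupt ID (last wins)
	RerunOnResume bool           // assumed to mirror a re-entry configuration flag
}

// inferStatus returns the inferred status and the payload that goes with it:
// the resumed inputs for re-entry, or the output handed to successors.
func inferStatus(s nodeScan) (Status, map[string]any) {
	for _, id := range s.Interrupts {
		if _, ok := s.Responses[id]; !ok {
			// At least one interrupt is still unanswered.
			return StatusWaiting, nil
		}
	}
	if s.RerunOnResume {
		// Re-entry: the node runs again and observes the responses.
		return StatusPending, s.Responses
	}
	// Handoff: the responses become the node's output.
	return StatusCompleted, s.Responses
}

func main() {
	scan := nodeScan{Interrupts: []string{"ask-1"}, Responses: map[string]any{}}
	status, _ := inferStatus(scan)
	fmt.Println(status)

	scan.Responses["ask-1"] = "approved"
	status, payload := inferStatus(scan)
	fmt.Println(status, payload)
}
```

Because the status is derived purely from the scanned history, running the inference twice over the same events gives the same answer, which is the property the idempotent resume path depends on.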
…, Routes)

AppendEvent (in-memory) and the database storage layer dropped Event's
workflow fields when persisting: the in-memory copy omitted NodeInfo,
RequestedInput and Routes, and the database layer never serialized
NodeInfo or RequestedInput. History-based resume attributes interrupts
by NodeInfo.Path, so losing it broke HITL resume — a RequestInput
workflow (e.g. examples/workflow/hitl_simple) would re-prompt instead of
continuing after the reply.

Persist all three fields in both backends and add round-trip regression
tests for each.
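
A round-trip regression check of the kind described above might look like the following sketch. The `event` and `nodeInfo` structs and their JSON tags are simplified, hypothetical stand-ins for the real types, and `roundTrip` uses encoding/json only to imitate a storage layer; the actual backends may serialize differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// nodeInfo and event are hypothetical, trimmed-down versions of the real types.
type nodeInfo struct {
	Path string `json:"path"`
}

type event struct {
	ID             string         `json:"id"`
	NodeInfo       *nodeInfo      `json:"node_info,omitempty"`
	RequestedInput map[string]any `json:"requested_input,omitempty"`
	Routes         []string       `json:"routes,omitempty"`
}

// roundTrip serializes and deserializes an event, as a storage layer would.
func roundTrip(in event) (event, error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return event{}, err
	}
	var out event
	if err := json.Unmarshal(raw, &out); err != nil {
		return event{}, err
	}
	return out, nil
}

func main() {
	in := event{
		ID:             "e1",
		NodeInfo:       &nodeInfo{Path: "asker"},
		RequestedInput: map[string]any{"prompt": "Approve?"},
		Routes:         []string{"next"},
	}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	// All three workflow fields must survive persistence unchanged.
	fmt.Println(reflect.DeepEqual(in, out))
}
```

A test built this way fails as soon as a backend forgets one of the fields, which is the regression these commits guard against.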

wolo-lab changed the base branch from v2 to wolo/cl-workflow June 8, 2026 08:11

Two resume-correctness fixes for dynamic orchestrators and HITL.

1. Cross-resume dedup. A dynamic node body re-runs from the top on
   resume, so every RunNode before the pause point would re-execute its
   child. rehydrateCache rebuilds the sub-scheduler's resultByPath from
   session events (child terminal events carry NodeInfo.Path + Output),
   so completed children with a stable WithRunID are served from cache.
   Mirrors adk-python's _rehydrate_from_events / DynamicNodeScheduler.

2. Terminal handoff asker now resumes. Resume only bumped its scheduled
   counter per scheduled successor, so a single-asker workflow (no
   successors) wrongly returned ErrNothingToResume. A matched handoff
   asker now counts as an effective resume itself, gated on
   answeredThisTurn (from a per-interrupt resolvedCount during
   rehydration) so a duplicate resume stays an idempotent no-op.
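
The cross-resume dedup in item 1 can be sketched as a cache rebuilt from history. This is an illustration under assumptions, not the PR's rehydrateCache: the `event` struct, `rehydrate`, `runNode`, and the path string are hypothetical, and the real scheduler keys results by NodeInfo.Path together with a stable run ID.

```go
package main

import "fmt"

// event is a hypothetical, trimmed-down session event.
type event struct {
	Path     string // stands in for NodeInfo.Path
	Output   any    // output carried by a child's terminal event
	Terminal bool   // true for a child's terminal event
}

// rehydrate rebuilds a path -> output cache from session events.
func rehydrate(events []event) map[string]any {
	cache := make(map[string]any)
	for _, ev := range events {
		if ev.Terminal && ev.Path != "" {
			cache[ev.Path] = ev.Output
		}
	}
	return cache
}

// runNode executes run only when no cached result exists for path.
func runNode(cache map[string]any, path string, run func() any) any {
	if out, ok := cache[path]; ok {
		return out // completed before the pause: serve from cache
	}
	out := run()
	cache[path] = out
	return out
}

func main() {
	history := []event{{Path: "orchestrator/child-1", Output: "done", Terminal: true}}
	cache := rehydrate(history)

	calls := 0
	out := runNode(cache, "orchestrator/child-1", func() any {
		calls++
		return "recomputed"
	})
	fmt.Println(out, calls)
}
```

The dynamic node body still re-runs from the top on resume; the cache only ensures that children which already finished do not execute a second time.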

wolo-lab force-pushed the wolo/cl-workflow branch from 17aa0ce to 6861671 June 8, 2026 08:13
wolo-lab force-pushed the wolo/workflow_hitl_integration_test branch from 634b9aa to 4f7cd27 June 8, 2026 08:14
kdroste-google and others added 6 commits June 8, 2026 09:47

Replaced deprecated tool.Context with agent.ToolContext

Completes the symmetric input/output validation contract on the Node
interface, alongside the existing ValidateInput. The scheduler is
expected to invoke ValidateOutput on every yielded event whose output
is non-nil before forwarding the event to the consumer (wired up in a
follow-up).

Interface and conformance
- Add ValidateOutput(output any) (any, error) to the Node interface.
- Add explicit stubs on the two implementations that do not embed
  BaseNode: startNode and the test-only dummyNode.
- Extend the compile-time Node-conformance assertions in
  base_node_test.go to cover AgentNode, ToolNode, JoinNode,
  ParallelWorker, and WorkflowNode.

Default implementation on BaseNode
- BaseNode.ValidateOutput delegates to a shared defaultValidateOutput
  helper that validates the output against the node's outputSchema
  field (added in #911) when set, otherwise returns the output
  unchanged.
- The default deliberately performs no type coercion or Content/JSON
  fallback handling; ToolNode will override ValidateOutput to add its
  FunctionTool {"result": X} unwrap fallback in a follow-up.
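
The default behaviour described above can be sketched as follows. Only the ValidateOutput(output any) (any, error) signature and the "validate when a schema is set, otherwise pass through" rule come from the commit message; the `schema` interface, `stringSchema`, and the lowercase `baseNode` type are hypothetical, since the real outputSchema type is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// schema is a hypothetical validator interface standing in for outputSchema.
type schema interface {
	Validate(v any) error
}

// stringSchema is an illustrative schema that accepts only strings.
type stringSchema struct{}

func (stringSchema) Validate(v any) error {
	if _, ok := v.(string); !ok {
		return errors.New("output is not a string")
	}
	return nil
}

// baseNode mimics a node that delegates to the shared default helper.
type baseNode struct {
	outputSchema schema // nil means no schema is set
}

func (n baseNode) ValidateOutput(output any) (any, error) {
	return defaultValidateOutput(n.outputSchema, output)
}

// defaultValidateOutput validates against the schema when one is set and
// otherwise returns the output unchanged. It performs no type coercion.
func defaultValidateOutput(s schema, output any) (any, error) {
	if s == nil {
		return output, nil
	}
	if err := s.Validate(output); err != nil {
		return nil, fmt.Errorf("validate output: %w", err)
	}
	return output, nil
}

func main() {
	out, err := baseNode{}.ValidateOutput(42)
	fmt.Println(out, err)

	_, err = baseNode{outputSchema: stringSchema{}}.ValidateOutput(42)
	fmt.Println(err)
}
```

Keeping the default free of coercion leaves room for a node type to override ValidateOutput with its own fallback, as the commit message anticipates for ToolNode.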
Extends the console launcher's HITL prompt dispatch to handle
tool confirmation interrupts (toolconfirmation.FunctionCallName)
alongside the workflow input path added in the previous commit.

Detection path is unchanged — collectPendingInterrupts already
walks events name-agnostically via Event.LongRunningToolIDs.
This commit only adds a per-name case to the render and response
switches:

* renderToolConfirmationPrompt prints the confirmation hint after
  the standard "Agent -> " banner, or "Confirm <name>?" derived
  from the original function call as fallback.

* toolConfirmationResponseFromUserInput maps yes/y/true/confirm
  (case-insensitive) to {"confirmed": true}, everything else
  (including blank lines) to {"confirmed": false}.

Without this commit, tool confirmation hits the generic fallback,
which wraps the reply as {"result": <text>}. The transport works
(the reply routes back by FunctionCall.ID), but the envelope does not
match what ctx.ToolConfirmation() expects, so the confirmation is
effectively unparseable.
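
The reply mapping described above is small enough to sketch directly. `confirmationResponse` is a hypothetical name rather than the PR's toolConfirmationResponseFromUserInput, and trimming surrounding whitespace before matching is an assumption; the accepted words and the {"confirmed": ...} envelope are taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// confirmationResponse maps free-form console input to a confirmation envelope.
// Trimming whitespace is an assumption; matching is case-insensitive.
func confirmationResponse(input string) map[string]any {
	switch strings.ToLower(strings.TrimSpace(input)) {
	case "yes", "y", "true", "confirm":
		return map[string]any{"confirmed": true}
	default:
		// Everything else, including a blank line, is a rejection.
		return map[string]any{"confirmed": false}
	}
}

func main() {
	for _, in := range []string{"YES", "y", "no", ""} {
		fmt.Printf("%q -> %v\n", in, confirmationResponse(in)["confirmed"])
	}
}
```

Defaulting to false means an accidental Enter keypress never confirms a tool call.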

Adds three end-to-end tests that exercise the full pause/resume
round-trip through a real runner.Runner, not the lightweight mocks the
agent/workflowagent unit tests use. They verify that the contract
the engine relies on (FunctionCall.ID round-trips into a follow-up
FunctionResponse, runner.findAgentToRun routes by that ID, RunState
survives the turn boundary via session.State delta) actually holds
when the runner, session.InMemoryService, and the workflow agent
are wired together as a production user would wire them.

* TestRunner_WorkflowHITL_Roundtrip_Handoff exercises the default
  handoff resume path: turn 1 yields an event with
  LongRunningToolIDs and a synthesised adk_request_workflow_input
  FunctionCall part; turn 2 sends a matching FunctionResponse, the
  runner routes it back to the same agent, and the asker's
  successor receives the response payload as its input.

* TestRunner_WorkflowHITL_Roundtrip_ReEntry covers the re-entry
  path (NodeConfig.RerunOnResume = true) with the same runner
  setup: the asker is re-activated, observes the response via
  ctx.ResumedInput, and emits it as an output that flows to the
  successor.

* TestRunner_WorkflowHITL_FunctionResponseRoutedByID pins the
  runner-level routing contract: it asserts the interrupt event's
  Author equals the workflow agent name (used by findAgentToRun)
  and that the second turn does not produce a fresh interrupt
  (it would if findAgentToRun fell back to the root agent and
  treated the FunctionResponse as a new user message).

Tests are in runner_test (external test package) so they exercise
only the public Runner API, no internals.
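
The routing contract pinned by the third test can be modelled with a short sketch. This is a hypothetical model, not the code of runner.findAgentToRun: the `event` struct and `agentForResponse` are invented for illustration, and the only behaviour taken from the description is that a FunctionResponse routes to the author of the event that issued the matching FunctionCall.ID, with the root agent as the fallback.

```go
package main

import "fmt"

// event is a hypothetical, trimmed-down session event.
type event struct {
	Author          string   // agent that produced the event
	FunctionCallIDs []string // IDs of function calls carried by the event
}

// agentForResponse walks the history newest-first and returns the author of
// the event that issued the call being answered, or rootAgent if none matches.
func agentForResponse(history []event, responseID, rootAgent string) string {
	for i := len(history) - 1; i >= 0; i-- {
		for _, id := range history[i].FunctionCallIDs {
			if id == responseID {
				return history[i].Author
			}
		}
	}
	return rootAgent
}

func main() {
	history := []event{
		{Author: "user"},
		{Author: "workflow_agent", FunctionCallIDs: []string{"call-1"}},
	}
	fmt.Println(agentForResponse(history, "call-1", "root_agent"))
	fmt.Println(agentForResponse(history, "unknown", "root_agent"))
}
```

Under this model, an interrupt event whose author is not the workflow agent would send the reply to the wrong agent, which is why the test asserts the Author field explicitly.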

The re-entry asker emitted the resumed response via the obsolete
Event.Actions.StateDelta["output"] channel, which the v2 scheduler no
longer reads (node output now flows through Event.Output). On v2 the
handler received an empty input and the test failed. Switch to
ev.Output, matching the canonical HITL tests in workflowagent.

Adds TestRunner_WorkflowHITL_DynamicOrchestrator_DedupAndResume covering
the end-to-end acceptance scenario for b/515644762: a dynamic
orchestrator runs two children via RunNode, the second suspends on a
HITL interrupt, and on resume the first child must be served from cache
(not re-executed) while the second observes the user's response.

wolo-lab force-pushed the wolo/workflow_hitl_integration_test branch from 4f7cd27 to 681eb17 June 8, 2026 09:48
Base automatically changed from wolo/cl-workflow to v2 June 8, 2026 10:27