Wolo/workflow hitl integration test by wolo-lab · Pull Request #980 · google/adk-go

wolo-lab · 2026-06-07T21:36:39Z


…tion

Workflow-engine support for human-in-the-loop, unified on a single mechanism (history rehydration), matching adk-python (no persisted run-state event, no PendingRequest field).

- scheduler: per-event back-pressure handshake (a non-partial function-response is persisted before the node's flow rebuilds the next model request, fixing a non-deterministic re-issue race); pause a node on accumulated Event.LongRunningToolIDs (RequestInput rides on them); stamp NodeInfo.Path = node name on static node events so rehydration can attribute interrupts (dynamic children fold into their static ancestor).
- persistence: ReconstructRunState ports adk-python's _reconstruct_node_states + _infer_node_state: per-node scan (interrupts, resolved user responses, schemas, output), status inference (WAITING / PENDING+ResumedInputs re-entry / COMPLETED+Output handoff), backward-edge predecessor input, and schema validation on the surviving (last-wins) response.
- resume: single path over the rehydrated state, gated on the current turn's responses for idempotency; already-run handoff successors are skipped (RunState.completed).
- state: NodeState.Interrupts + unexported interruptSchemas; RunState.completed; HasWaiting. No PendingRequest, no persisted run-state blob.
- workflowagent: detectResume uses ReconstructRunState and surfaces reconstruction (schema-validation) errors.

A node may raise multiple interrupts per activation. The workflow and workflowagent suites pass with -race.
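The status inference named above (WAITING, PENDING with ResumedInputs for re-entry, COMPLETED with Output for handoff) can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the `nodeState` type, the `inferNodeState` signature, and the rule that the last interrupt's response becomes the handoff output are not taken from the PR, and the real ReconstructRunState also handles schemas and backward edges, which this omits.

```go
package main

import "fmt"

// status mirrors the three states named in the commit message.
type status string

const (
	waiting   status = "WAITING"
	pending   status = "PENDING"
	completed status = "COMPLETED"
)

// nodeState is a hypothetical, reduced stand-in for the engine's NodeState.
type nodeState struct {
	Status        status
	ResumedInputs map[string]any // set only on the re-entry path
	Output        any            // set only on the handoff path
}

// inferNodeState derives a node's state from its interrupt IDs and the
// responses found in history (keyed by interrupt ID).
func inferNodeState(interrupts []string, responses map[string]any, rerunOnResume bool) nodeState {
	for _, id := range interrupts {
		if _, ok := responses[id]; !ok {
			// At least one interrupt is still unanswered.
			return nodeState{Status: waiting}
		}
	}
	if rerunOnResume {
		// Re-entry: the node runs again and reads the responses itself.
		return nodeState{Status: pending, ResumedInputs: responses}
	}
	// Handoff: the response is passed on as the node's output.
	// Picking the last interrupt is a simplification of this sketch.
	var out any
	if len(interrupts) > 0 {
		out = responses[interrupts[len(interrupts)-1]]
	}
	return nodeState{Status: completed, Output: out}
}

func main() {
	interrupts := []string{"call-1"}
	answered := map[string]any{"call-1": "yes"}

	fmt.Println(inferNodeState(interrupts, nil, false).Status)
	fmt.Println(inferNodeState(interrupts, answered, true).Status)
	fmt.Println(inferNodeState(interrupts, answered, false).Output)
}
```

Because the state is recomputed from history on every turn, nothing beyond the events themselves has to be persisted, which is the point of dropping the run-state blob.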

…, Routes)

AppendEvent (in-memory) and the database storage layer dropped Event's workflow fields when persisting: the in-memory copy omitted NodeInfo, RequestedInput and Routes, and the database layer never serialized NodeInfo or RequestedInput.

History-based resume attributes interrupts by NodeInfo.Path, so losing it broke HITL resume: a RequestInput workflow (e.g. examples/workflow/hitl_simple) would re-prompt instead of continuing after the reply.

Persist all three fields in both backends and add round-trip regression tests for each.
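A minimal sketch of the kind of round-trip check this commit describes, using hypothetical types. The field types chosen for `RequestedInput` and `Routes`, the `memoryStore` type, and the `appendEvent` method are assumptions for illustration and are not the real session service API.

```go
package main

import "fmt"

// nodeInfo and event are reduced, hypothetical versions of the real types;
// only the three workflow fields named in the commit are modelled.
type nodeInfo struct{ Path string }

type event struct {
	ID             string
	NodeInfo       *nodeInfo
	RequestedInput map[string]any
	Routes         []string
}

// memoryStore stands in for an in-memory session backend.
type memoryStore struct{ events []event }

// appendEvent stores a copy of the event. Every workflow field is copied
// explicitly; leaving one out here is the class of bug the commit fixes.
func (s *memoryStore) appendEvent(e event) {
	stored := event{ID: e.ID, Routes: append([]string(nil), e.Routes...)}
	if e.NodeInfo != nil {
		ni := *e.NodeInfo
		stored.NodeInfo = &ni
	}
	if e.RequestedInput != nil {
		stored.RequestedInput = make(map[string]any, len(e.RequestedInput))
		for k, v := range e.RequestedInput {
			stored.RequestedInput[k] = v
		}
	}
	s.events = append(s.events, stored)
}

func main() {
	s := &memoryStore{}
	s.appendEvent(event{
		ID:             "e1",
		NodeInfo:       &nodeInfo{Path: "asker"},
		RequestedInput: map[string]any{"prompt": "Approve?"},
		Routes:         []string{"next"},
	})
	got := s.events[0]
	fmt.Println(got.NodeInfo.Path, got.RequestedInput["prompt"], got.Routes)
}
```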

Two resume-correctness fixes for dynamic orchestrators and HITL.

1. Cross-resume dedup. A dynamic node body re-runs from the top on resume, so every RunNode before the pause point would re-execute its child. rehydrateCache rebuilds the sub-scheduler's resultByPath from session events (child terminal events carry NodeInfo.Path + Output), so completed children with a stable WithRunID are served from cache. Mirrors adk-python's _rehydrate_from_events / DynamicNodeScheduler.

2. Terminal handoff asker now resumes. Resume only bumped its scheduled counter per scheduled successor, so a single-asker workflow (no successors) wrongly returned ErrNothingToResume. A matched handoff asker now counts as an effective resume itself, gated on answeredThisTurn (from a per-interrupt resolvedCount during rehydration) so a duplicate resume stays an idempotent no-op.
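The cross-resume dedup idea can be sketched as a path-keyed cache rebuilt from history. The names `rehydrateCache` and `resultByPath` come from the commit message, but the types, signatures, and the `runNode` helper below are hypothetical simplifications and not the engine's actual code.

```go
package main

import "fmt"

// event is a hypothetical reduction of a session event to the two fields
// the commit says child terminal events carry: NodeInfo.Path and Output.
type event struct {
	Path   string
	Output any
}

// rehydrateCache rebuilds the path -> output map from session history.
func rehydrateCache(events []event) map[string]any {
	cache := make(map[string]any)
	for _, ev := range events {
		if ev.Path != "" && ev.Output != nil {
			cache[ev.Path] = ev.Output
		}
	}
	return cache
}

// scheduler is a toy sub-scheduler that records which children really ran.
type scheduler struct {
	resultByPath map[string]any
	executed     []string
}

// runNode serves a completed child from cache and only executes on a miss.
func (s *scheduler) runNode(path string, run func() any) any {
	if out, ok := s.resultByPath[path]; ok {
		return out
	}
	s.executed = append(s.executed, path)
	out := run()
	s.resultByPath[path] = out
	return out
}

func main() {
	// History from the first turn: the first child finished before the pause.
	history := []event{{Path: "orchestrator/fetch", Output: "rows"}}
	s := &scheduler{resultByPath: rehydrateCache(history)}

	// On resume the orchestrator body re-runs from the top.
	first := s.runNode("orchestrator/fetch", func() any { return "re-fetched" })
	second := s.runNode("orchestrator/ask", func() any { return "approved" })
	fmt.Println(first, second, s.executed)
}
```

The cache only helps if a child's path is identical across turns, which is why the commit ties it to a stable WithRunID.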

Replaced deprecated tool.Context with agent.ToolContext

Completes the symmetric input/output validation contract on the Node interface, alongside the existing ValidateInput. The scheduler is expected to invoke ValidateOutput on every yielded event whose output is non-nil before forwarding the event to the consumer (wired up in a follow-up).

Interface and conformance

- Add ValidateOutput(output any) (any, error) to the Node interface.
- Add explicit stubs on the two implementations that do not embed BaseNode: startNode and the test-only dummyNode.
- Extend the compile-time Node-conformance assertions in base_node_test.go to cover AgentNode, ToolNode, JoinNode, ParallelWorker, and WorkflowNode.

Default implementation on BaseNode

- BaseNode.ValidateOutput delegates to a shared defaultValidateOutput helper that validates the output against the node's outputSchema field (added in #911) when set, otherwise returns the output unchanged.
- The default deliberately performs no type coercion or Content/JSON fallback handling; ToolNode will override ValidateOutput to add its FunctionTool {"result": X} unwrap fallback in a follow-up.
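A minimal sketch of the default behaviour described above: validate against the schema when one is set, otherwise return the output unchanged, with no coercion. The `schema` interface, the `requireString` validator, and the lower-case `baseNode` type are assumptions for illustration; the PR does not show the real outputSchema type.

```go
package main

import (
	"errors"
	"fmt"
)

// schema is a hypothetical validator interface standing in for the node's
// outputSchema field.
type schema interface {
	Validate(v any) error
}

// requireString is a toy schema that accepts only string outputs.
type requireString struct{}

func (requireString) Validate(v any) error {
	if _, ok := v.(string); !ok {
		return errors.New("output must be a string")
	}
	return nil
}

// defaultValidateOutput validates only when a schema is set and never
// coerces or unwraps the value.
func defaultValidateOutput(s schema, output any) (any, error) {
	if s == nil {
		return output, nil
	}
	if err := s.Validate(output); err != nil {
		return nil, fmt.Errorf("validate output: %w", err)
	}
	return output, nil
}

// baseNode mirrors the delegation described for BaseNode.ValidateOutput.
type baseNode struct{ outputSchema schema }

func (n *baseNode) ValidateOutput(output any) (any, error) {
	return defaultValidateOutput(n.outputSchema, output)
}

func main() {
	plain := &baseNode{}
	out, err := plain.ValidateOutput(42)
	fmt.Println(out, err)

	strict := &baseNode{outputSchema: requireString{}}
	_, err = strict.ValidateOutput(42)
	fmt.Println(err)
}
```

Keeping the helper as a free function lets a node that does not embed the base type reuse the same default.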

Extends the console launcher's HITL prompt dispatch to handle tool confirmation interrupts (toolconfirmation.FunctionCallName) alongside the workflow input path added in the previous commit.

Detection path is unchanged: collectPendingInterrupts already walks events name-agnostically via Event.LongRunningToolIDs. This commit only adds a per-name case to the render and response switches:

* renderToolConfirmationPrompt prints the confirmation hint after the standard "Agent -> " banner, or "Confirm <name>?" derived from the original function call as fallback.
* toolConfirmationResponseFromUserInput maps yes/y/true/confirm (case-insensitive) to {"confirmed": true}, everything else (including blank lines) to {"confirmed": false}.

Without this commit, tool confirmation hits the generic fallback, which wraps the reply as {"result": <text>}. The transport works (the reply routes back by FunctionCall.ID), but the envelope does not match what ctx.ToolConfirmation() expects, so the confirmation is effectively unparseable.
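The input mapping described above can be sketched as follows. The function name `confirmationResponse` and its signature are hypothetical stand-ins for toolConfirmationResponseFromUserInput, whose real signature is not shown here; trimming surrounding whitespace is also an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// confirmationResponse maps free-form console input to the confirmation
// envelope. Only yes/y/true/confirm (case-insensitive) confirm; everything
// else, including a blank line, is a refusal.
func confirmationResponse(input string) map[string]any {
	// Whitespace trimming is an assumption, not stated in the commit.
	switch strings.ToLower(strings.TrimSpace(input)) {
	case "yes", "y", "true", "confirm":
		return map[string]any{"confirmed": true}
	default:
		return map[string]any{"confirmed": false}
	}
}

func main() {
	for _, in := range []string{"Yes", "y", "CONFIRM", "", "nope"} {
		fmt.Printf("%q -> %v\n", in, confirmationResponse(in))
	}
}
```

Defaulting to a refusal means an accidental Enter keypress never approves a tool call.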

Three end-to-end tests exercising the full pause/resume round-trip through a real runner.Runner, not the lightweight mocks the agent/workflowagent unit tests use. They verify that the contract the engine relies on (FunctionCall.ID round-trips into a follow-up FunctionResponse, runner.findAgentToRun routes by that ID, RunState survives the turn boundary via session.State delta) actually holds when the runner, session.InMemoryService, and the workflow agent are wired together as a production user would wire them.

* TestRunner_WorkflowHITL_Roundtrip_Handoff exercises the default handoff resume path: turn 1 yields an event with LongRunningToolIDs and a synthesised adk_request_workflow_input FunctionCall part; turn 2 sends a matching FunctionResponse, the runner routes it back to the same agent, and the asker's successor receives the response payload as its input.
* TestRunner_WorkflowHITL_Roundtrip_ReEntry covers the re-entry path (NodeConfig.RerunOnResume = true) with the same runner setup: the asker is re-activated, observes the response via ctx.ResumedInput, and emits it as an output that flows to the successor.
* TestRunner_WorkflowHITL_FunctionResponseRoutedByID pins the runner-level routing contract: it asserts the interrupt event's Author equals the workflow agent name (used by findAgentToRun) and that the second turn does not produce a fresh interrupt (it would if findAgentToRun fell back to the root agent and treated the FunctionResponse as a new user message).

Tests are in runner_test (external test package) so they exercise only the public Runner API, no internals.
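The routing contract pinned by the third test can be illustrated with a toy lookup. This is a sketch under assumptions: `agentForResponse` and the reduced `event` type are hypothetical and do not reproduce the runner's findAgentToRun; they only show why the interrupt event's Author must match the workflow agent's name.

```go
package main

import "fmt"

// event is a hypothetical reduction of a session event to its author and
// the IDs of the function calls it contains.
type event struct {
	Author          string
	FunctionCallIDs []string
}

// agentForResponse returns the author of the most recent event that issued
// a function call with the given ID, or the root agent when none matches.
func agentForResponse(history []event, responseID, rootAgent string) string {
	for i := len(history) - 1; i >= 0; i-- {
		for _, id := range history[i].FunctionCallIDs {
			if id == responseID {
				return history[i].Author
			}
		}
	}
	// Fallback: the response would be treated like a new user message.
	return rootAgent
}

func main() {
	history := []event{
		{Author: "user"},
		{Author: "approval_workflow", FunctionCallIDs: []string{"call-42"}},
	}
	fmt.Println(agentForResponse(history, "call-42", "root"))
	fmt.Println(agentForResponse(history, "unknown", "root"))
}
```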

The re-entry asker emitted the resumed response via the obsolete Event.Actions.StateDelta["output"] channel, which the v2 scheduler no longer reads (node output now flows through Event.Output). On v2 the handler received an empty input and the test failed. Switch to ev.Output, matching the canonical HITL tests in workflowagent.

Adds TestRunner_WorkflowHITL_DynamicOrchestrator_DedupAndResume covering the end-to-end acceptance scenario for b/515644762: a dynamic orchestrator runs two children via RunNode, the second suspends on a HITL interrupt, and on resume the first child must be served from cache (not re-executed) while the second observes the user's response.

wolo-lab added 2 commits June 4, 2026 20:35

wolo-lab changed the base branch from v2 to wolo/cl-workflow June 8, 2026 08:11

wolo-lab force-pushed the wolo/cl-workflow branch from 17aa0ce to 6861671 Compare June 8, 2026 08:13

wolo-lab force-pushed the wolo/workflow_hitl_integration_test branch from 634b9aa to 4f7cd27 Compare June 8, 2026 08:14

kdroste-google and others added 6 commits June 8, 2026 09:47

Replaced deprecated tool.Context with agent.ToolContext (#951)

e6f52bc

wolo-lab force-pushed the wolo/workflow_hitl_integration_test branch from 4f7cd27 to 681eb17 Compare June 8, 2026 09:48

Base automatically changed from wolo/cl-workflow to v2 June 8, 2026 10:27

wolo-lab wants to merge 9 commits into v2 from wolo/workflow_hitl_integration_test

2 participants
