feat(runner): HITL via long-running interrupts with history rehydra…#960
Merged
418382f to
293628e
…tion

Workflow-engine support for human-in-the-loop, unified on a single mechanism (history rehydration), matching adk-python: no persisted run-state event, no PendingRequest field.

- scheduler: per-event back-pressure handshake (a non-partial function-response is persisted before the node's flow rebuilds the next model request, fixing a non-deterministic re-issue race); pause a node on accumulated Event.LongRunningToolIDs (RequestInput rides on them); stamp NodeInfo.Path = node name on static node events so rehydration can attribute interrupts (dynamic children fold into their static ancestor).
- persistence: ReconstructRunState ports adk-python's _reconstruct_node_states + _infer_node_state: per-node scan (interrupts, resolved user responses, schemas, output), status inference (WAITING / PENDING+ResumedInputs re-entry / COMPLETED+Output handoff), backward-edge predecessor input, and schema validation on the surviving (last-wins) response.
- resume: single path over the rehydrated state, gated on the current turn's responses for idempotency; already-run handoff successors are skipped (RunState.completed).
- state: NodeState.Interrupts + unexported interruptSchemas; RunState.completed; HasWaiting. No PendingRequest, no persisted run-state blob.
- workflowagent: detectResume uses ReconstructRunState and surfaces reconstruction (schema-validation) errors.

A node may raise multiple interrupts per activation. workflow and workflowagent suites pass with -race.
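The per-event back-pressure handshake can be pictured with a small, self-contained sketch. This is not the scheduler code from this PR: the envelope type, emit, and runHandshake are hypothetical names, and the only point is that the producer does not move on until the consumer has acknowledged (here: "persisted") the event.

```go
package main

import "fmt"

// envelope pairs an event ID with a channel that the consumer closes once
// the event has been persisted. Both names are hypothetical.
type envelope struct {
	id        string
	persisted chan struct{}
}

// emit hands an event to the consumer and blocks until it is acknowledged.
func emit(out chan<- envelope, id string) {
	ack := make(chan struct{})
	out <- envelope{id: id, persisted: ack}
	<-ack
}

// runHandshake emits ids from a producer goroutine while the caller
// "persists" them, and returns the interleaved order of steps.
func runHandshake(ids []string) []string {
	var steps []string
	out := make(chan envelope) // unbuffered: every send is a rendezvous
	go func() {
		defer close(out)
		for _, id := range ids {
			emit(out, id)
			// Reached only after the consumer closed the ack channel.
			steps = append(steps, "continue after "+id)
		}
	}()
	for env := range out {
		steps = append(steps, "persist "+env.id)
		close(env.persisted)
	}
	return steps
}

func main() {
	for _, s := range runHandshake([]string{"fr-1", "fr-2"}) {
		fmt.Println(s)
	}
}
```

Because each step is ordered by a channel operation, "persist fr-1" always precedes "continue after fr-1", which is the property the race fix relies on.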
293628e to
17cd387
9 tasks
…, Routes) AppendEvent (in-memory) and the database storage layer dropped Event's workflow fields when persisting: the in-memory copy omitted NodeInfo, RequestedInput and Routes, and the database layer never serialized NodeInfo or RequestedInput. History-based resume attributes interrupts by NodeInfo.Path, so losing it broke HITL resume — a RequestInput workflow (e.g. examples/workflow/hitl_simple) would re-prompt instead of continuing after the reply. Persist all three fields in both backends and add round-trip regression tests for each.
wolo-lab
added a commit
that referenced
this pull request
Jun 5, 2026
…ution Add NodeInfo.OutputFor: the node paths an event's Output counts for — the emitter plus any WithUseAsOutput delegating ancestors. A delegating child's single event is stamped OutputFor=[child, parent, ...] and flows up, and the parent no longer re-emits a duplicate terminal output event (full suppression, matching adk-python's _output_delegated + output_for). Resume attributes a descendant's output to its delegating ancestors via OutputFor. Every output event records OutputFor (own path minimum), mirroring adk-python _enrich_event. Built on the temp integration branch (#960 + #920 + #966); rebase onto v2 once those merge.
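As a rough illustration of the OutputFor stamping described above, the sketch below walks up from the emitting node and collects every ancestor that delegates its output. The node type and its fields are hypothetical; it also assumes the chain stops at the first ancestor that does not delegate, which the commit message implies but does not spell out.

```go
package main

import "fmt"

// node is a hypothetical view of a workflow node for this sketch.
type node struct {
	path   string
	parent *node
	// useAsOutput is true when this node's output also counts as its
	// parent's output (the WithUseAsOutput case).
	useAsOutput bool
}

// outputFor returns the emitter's own path followed by the paths of all
// contiguous delegating ancestors: [child, parent, ...].
func outputFor(emitter *node) []string {
	paths := []string{emitter.path}
	for n := emitter; n.useAsOutput && n.parent != nil; n = n.parent {
		paths = append(paths, n.parent.path)
	}
	return paths
}

// countsFor reports whether an event stamped with paths is an output of path.
func countsFor(paths []string, path string) bool {
	for _, p := range paths {
		if p == path {
			return true
		}
	}
	return false
}

func main() {
	root := &node{path: "wf"}
	parent := &node{path: "parent", parent: root} // does not delegate to wf
	child := &node{path: "child", parent: parent, useAsOutput: true}

	fmt.Println(outputFor(child))  // [child parent]
	fmt.Println(outputFor(parent)) // [parent]
}
```

On resume, a check like countsFor lets a descendant's single event satisfy the delegating ancestor without the ancestor emitting its own terminal event.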
Two resume-correctness fixes for dynamic orchestrators and HITL.

1. Cross-resume dedup. A dynamic node body re-runs from the top on resume, so every RunNode before the pause point would re-execute its child. rehydrateCache rebuilds the sub-scheduler's resultByPath from session events (child terminal events carry NodeInfo.Path + Output), so completed children with a stable WithRunID are served from cache. Mirrors adk-python's _rehydrate_from_events / DynamicNodeScheduler.
2. Terminal handoff asker now resumes. Resume only bumped its scheduled counter per scheduled successor, so a single-asker workflow (no successors) wrongly returned ErrNothingToResume. A matched handoff asker now counts as an effective resume itself, gated on answeredThisTurn (from a per-interrupt resolvedCount during rehydration) so a duplicate resume stays an idempotent no-op.
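The cross-resume dedup idea can be sketched as follows. Only the names rehydrateCache and resultByPath come from the commit message; the event and subScheduler types, the runNode signature, and the path format with a run ID suffix are assumptions made for the example.

```go
package main

import "fmt"

// event is a hypothetical, trimmed-down session event.
type event struct {
	path      string // node path, assumed to include a stable run ID
	output    string
	hasOutput bool
}

// rehydrateCache rebuilds a path -> output map from terminal events.
func rehydrateCache(events []event) map[string]string {
	cache := make(map[string]string)
	for _, ev := range events {
		if ev.path != "" && ev.hasOutput {
			cache[ev.path] = ev.output
		}
	}
	return cache
}

// subScheduler serves completed children from cache and runs the rest.
type subScheduler struct {
	resultByPath map[string]string
	executed     []string // paths that actually ran in this activation
}

func (s *subScheduler) runNode(path string, run func() string) string {
	if out, ok := s.resultByPath[path]; ok {
		return out // completed before the pause: do not re-execute
	}
	s.executed = append(s.executed, path)
	out := run()
	s.resultByPath[path] = out
	return out
}

func main() {
	history := []event{{path: "orchestrator/first@run-1", output: "draft", hasOutput: true}}
	s := &subScheduler{resultByPath: rehydrateCache(history)}

	// The dynamic body re-runs from the top on resume.
	a := s.runNode("orchestrator/first@run-1", func() string { return "recomputed" })
	b := s.runNode("orchestrator/second@run-1", func() string { return "approved" })
	fmt.Println(a, b, s.executed)
}
```

The cache key only works if the path is identical across turns, which is why the commit message ties dedup to a stable WithRunID.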
17aa0ce to
6861671
wolo-lab
added a commit
that referenced
this pull request
Jun 8, 2026
End-to-end HITL coverage through a real Runner, kept separate from the feature PR (#960) to keep it focused. Covers handoff round-trip, re-entry resume, FunctionResponse routing by ID, and the dynamic orchestrator dedup+HITL scenario (b/515644762): two children run sequentially via RunNode; on resume the first is served from cache and the second delivers the human response.
hanorik
approved these changes
Jun 8, 2026
Workflow-engine support for human-in-the-loop (HITL), built on a single mechanism, history rehydration, matching adk-python. Paused run state is reconstructed from session events each turn; there is no persisted run-state blob and no PendingRequest field.

What changed

- scheduler: per-event back-pressure handshake. A non-partial function-response is persisted before the node's flow rebuilds the next model request, fixing a non-deterministic race where the model re-issued the same tool call. Mirrors adk-python's enqueue_event / processed_signal.
- scheduler: a node pauses on accumulated Event.LongRunningToolIDs (a RequestInput pause rides on them); there is no synthetic pause event.
- scheduler: stamps NodeInfo.Path = node name on static node events so rehydration can attribute interrupts back to their node (dynamic children fold into their static ancestor).
- persistence: ReconstructRunState ports adk-python's _reconstruct_node_states + _infer_node_state: per-node scan (interrupts, resolved user responses, schemas, output), status inference (WAITING / PENDING+ResumedInputs re-entry / COMPLETED+Output handoff), backward-edge predecessor input, and schema validation on the surviving (last-wins) response. Removes LoadRunState / NewRunStateEvent / RunStateSessionKey.
- resume: single path over the rehydrated state, gated on the current turn's responses for idempotency; already-run handoff successors are skipped (RunState.completed). Removes the separate PendingRequest loop.
- state: NodeState.Interrupts + unexported interruptSchemas; RunState.completed; HasWaiting. No PendingRequest, no persisted blob.
- workflowagent: detectResume uses ReconstructRunState and surfaces reconstruction (schema-validation) errors instead of silently falling through to a fresh Run.

A node may now raise multiple interrupts per activation.
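The status inference described above can be approximated with the sketch below. It is not the ReconstructRunState implementation: every type and field name is hypothetical, the rerunOnResume flag is an assumed way to distinguish re-entry nodes from handoff nodes, and taking the last interrupt's response as the handoff output is likewise an assumption.

```go
package main

import "fmt"

type status string

const (
	waiting   status = "WAITING"
	pending   status = "PENDING"
	completed status = "COMPLETED"
)

// response is one user reply found in session history.
type response struct{ interruptID, value string }

// nodeScan is what a per-node pass over history is assumed to collect.
type nodeScan struct {
	interrupts    []string          // interrupt IDs in the order raised
	responses     map[string]string // surviving response per interrupt
	rerunOnResume bool              // hypothetical re-entry flag
}

type nodeState struct {
	status        status
	interrupts    []string          // still unresolved
	resumedInputs map[string]string // set for PENDING re-entry
	output        string            // set for COMPLETED handoff
}

// lastWins keeps only the latest response for each interrupt ID.
func lastWins(rs []response) map[string]string {
	out := make(map[string]string)
	for _, r := range rs {
		out[r.interruptID] = r.value
	}
	return out
}

func inferNodeState(s nodeScan) nodeState {
	var open []string
	for _, id := range s.interrupts {
		if _, ok := s.responses[id]; !ok {
			open = append(open, id)
		}
	}
	switch {
	case len(s.interrupts) == 0:
		return nodeState{status: completed} // nothing to resume in this sketch
	case len(open) > 0:
		return nodeState{status: waiting, interrupts: open}
	case s.rerunOnResume:
		return nodeState{status: pending, resumedInputs: s.responses}
	default:
		last := s.interrupts[len(s.interrupts)-1]
		return nodeState{status: completed, output: s.responses[last]}
	}
}

func main() {
	rs := lastWins([]response{{"i1", "no"}, {"i1", "yes"}})
	st := inferNodeState(nodeScan{interrupts: []string{"i1", "i2"}, responses: rs})
	fmt.Println(st.status, st.interrupts)
}
```

A node with two interrupts and one answer stays WAITING on the open one, which is how multiple interrupts per activation fit the same inference.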