Missing durable sub-orchestration primitive (df.call_child / df.await_instance) #152

Description

@pinodeca

Summary

pg_durable has no durable parent-waits-for-child primitive. There is no df.call_child, df.start_sub_workflow, df.child_workflow, or df.await_instance in the codebase (grep for start_sub_workflow|call_child|child_workflow|sub_orchestration in src/ returns zero hits as of v0.2.1).

The only documented function that mentions waiting on another instance is df.wait_for_completion, which is a synchronous polling helper that blocks the calling backend session and cannot be composed into a graph (tracked separately — companion to #149). That leaves customers with no supported way to express "this orchestration starts a child orchestration and durably resumes when it finishes."

Workarounds available today all have material drawbacks:

  • Application-side polling by calling df.wait_for_completion from a backend session — not durable; the session holds a backend for the whole wait; a disconnect abandons the wait silently.
  • Hand-rolled polling inside the parent graph using df.loop + df.sleep + a df.sql that checks df.instances.status — works, but verbose, racy (no event-driven wakeup), and forces every customer to reinvent the same fragile pattern.
  • df.signal + df.wait_for_signal, where the child explicitly signals the parent on completion. This requires cooperation from the child and is partially broken today: signals do not propagate into child sub-orchestrations, which breaks wait_for_signal inside df.race (see #150).
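
To put a rough number on the "no event-driven wakeup" cost of the hand-rolled polling workaround, here is a toy model in Python. It is not pg_durable code: `polling_wakeup` and `event_driven_wakeup` are hypothetical names, time is a plain integer number of seconds, and the loop is assumed to sleep first and then check status.

```python
import math
from typing import Tuple


def polling_wakeup(child_done_at: int, poll_interval: int) -> Tuple[int, int]:
    """Toy model of a sleep-then-check loop that starts at t=0.

    Returns (time at which the parent notices completion, status checks issued).
    """
    if poll_interval <= 0:
        raise ValueError("poll_interval must be positive")
    # The first check happens after one full sleep, so at least one check runs.
    checks = max(1, math.ceil(child_done_at / poll_interval))
    return checks * poll_interval, checks


def event_driven_wakeup(child_done_at: int) -> Tuple[int, int]:
    """Toy model of a primitive that wakes the parent when the child finishes."""
    return child_done_at, 0


if __name__ == "__main__":
    # Child finishes at t=61s; the parent polls every 30s.
    print(polling_wakeup(61, 30))   # (90, 3)
    print(event_driven_wakeup(61))  # (61, 0)
```

In this model the polling parent notices the child up to one full interval late and pays one status query per interval. Shortening the interval trades latency for more queries, which is the tuning burden every customer currently has to carry.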

Suggested design

Add an explicit child-orchestration primitive. Strawman:

-- Start a child as part of the parent graph; returns a future that resolves
-- to the child's output (or raises on Failed/Canceled).
df.call_child(
  graph    text,           -- same shape as df.start's first arg
  label    text DEFAULT NULL,
  options  jsonb DEFAULT NULL  -- e.g. {"timeout_seconds": 300, "on_failure": "raise"}
) RETURNS text  -- future envelope, e.g. {"node_type":"CALL_CHILD", ...}

and/or a lower-level wait-on-existing-instance primitive:

-- Durably wait on an already-started instance; resolves to its terminal status/output.
df.await_instance(
  instance_id text,
  timeout_seconds int DEFAULT NULL
) RETURNS text  -- future envelope

Both must return future envelopes (graph nodes serialized as text, not the child's plain result) so they compose cleanly inside df.seq / df.join / df.race. Internally they should be implemented as duroxide sub-orchestrations / external-event awaits so the parent suspends without holding a backend and resumes deterministically when the child reaches a terminal state.
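
A minimal sketch of why the envelope shape matters for composition, written as a Python model rather than real pg_durable SQL. Only `"node_type": "CALL_CHILD"` comes from the strawman above. The other field names, the `AWAIT_INSTANCE` and `SEQ` node types, and the `seq` helper (a stand-in for df.seq, whose real signature is not shown in this issue) are assumptions for illustration.

```python
import json
from typing import Optional


def call_child(graph: str, label: Optional[str] = None,
               options: Optional[dict] = None) -> str:
    """Hypothetical model of df.call_child: returns a serialized future envelope."""
    return json.dumps({
        "node_type": "CALL_CHILD",   # from the strawman comment
        "graph": graph,              # assumed field name
        "label": label,              # assumed field name
        "options": options or {},    # assumed field name
    })


def await_instance(instance_id: str,
                   timeout_seconds: Optional[int] = None) -> str:
    """Hypothetical model of df.await_instance: also returns an envelope."""
    return json.dumps({
        "node_type": "AWAIT_INSTANCE",      # assumed node type name
        "instance_id": instance_id,         # assumed field name
        "timeout_seconds": timeout_seconds, # assumed field name
    })


def seq(*nodes: str) -> str:
    """Stand-in for df.seq: nests already-serialized envelopes in order."""
    return json.dumps({
        "node_type": "SEQ",
        "children": [json.loads(n) for n in nodes],
    })


if __name__ == "__main__":
    plan = seq(
        call_child("child-graph", options={"timeout_seconds": 300}),
        await_instance("some-instance-id"),
    )
    print(json.dumps(json.loads(plan), indent=2))
```

Because each call only describes a node and runs nothing, the combinator can nest it like any other step. A helper that blocked and returned the child's result, as df.wait_for_completion does, would leave the combinator with nothing to schedule.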

Open design questions to settle in the spec:

  1. Cancellation propagation: does cancelling the parent cancel running children started via df.call_child?
  2. Failure semantics: does a Failed / Canceled child raise inside the parent (default) or surface as a typed result the parent can branch on?
  3. Identity/labelling: is the child's instance_id exposed to the parent (useful for monitoring / signals)?
  4. Variable & label inheritance: does the child inherit the parent's df vars? Probably not by default; allow opt-in via options.
  5. Resource accounting: the child should count against the rate limits of the parent's user (see #139, "Rate-limit df.start() to prevent DoS (security review D-1, D-2)"), not against the parent process.
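
To make question 2 concrete, here is a small Python model of the two failure-handling options. It is a sketch under stated assumptions, not a proposed implementation: `ChildResult`, `ChildFailed`, and `resolve_child` are hypothetical names, the `"Completed"` status string is assumed, and `"result"` is an invented counterpart to the `"on_failure": "raise"` value from the options example above.

```python
from dataclasses import dataclass
from typing import Optional

# "Failed" and "Canceled" appear in this issue; "Completed" is an assumed name.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


class ChildFailed(Exception):
    """Raised in the parent when a child ends in a non-success terminal state."""


@dataclass(frozen=True)
class ChildResult:
    status: str
    output: Optional[str] = None
    error: Optional[str] = None


def resolve_child(result: ChildResult, on_failure: str = "raise") -> ChildResult:
    """Apply the parent's failure policy to a child's terminal result."""
    if result.status not in TERMINAL_STATUSES:
        raise ValueError(f"child is not in a terminal state: {result.status}")
    if result.status == "Completed":
        return result
    if on_failure == "raise":
        # Default behaviour: the failure propagates into the parent.
        raise ChildFailed(f"child ended as {result.status}: {result.error}")
    if on_failure == "result":
        # Opt-in behaviour: hand back a typed result the parent can branch on.
        return result
    raise ValueError(f"unknown on_failure policy: {on_failure}")
```

With `on_failure="raise"`, a parent that forgets to handle failure fails loudly. With the typed-result mode, compensation logic can inspect `status` and branch, at the cost of silently ignored failures if the parent never checks it.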

These should land in a short design doc under docs/ (similar to docs/nested-graph-design.md) before implementation.

Why this matters

Sub-orchestration / fan-out-fan-in / parent-waits-for-child is one of the headline use cases for any durable functions library. Without a first-class primitive, the only honest answer to a customer asking "how do I have one orchestration kick off another and wait for it?" is "you can't, durably." That's a significant gap in the v1 story.

Labels: enhancement