df.instance_nodes leaves race-loser nodes running or pending after race completion #171

@pinodeca

Summary

After a df.race(...) workflow completes, df.instance_nodes can still show nodes from the losing branch as running or pending.

The underlying orchestration may have cancelled the losing branch correctly, but df.instance_nodes, the operator-facing node view, still reports that branch's nodes as if work were active.

Expected behavior

Nodes in the losing branch of a completed race should be marked with a terminal status such as cancelled or dropped, consistent with the parent workflow having completed.
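
To make the expected behavior concrete, here is a minimal Python sketch of the status transition I would expect once a race completes. It is only a model, not the extension's code: the `Node` shape, the `branch` field, the `settle_race` helper, and the node names are all hypothetical, and `cancelled` is just one of the terminal statuses suggested above.

```python
from dataclasses import dataclass, replace

# Non-terminal statuses named in this issue.
ACTIVE = {"running", "pending"}


@dataclass(frozen=True)
class Node:
    node_id: str   # hypothetical identifier
    branch: str    # hypothetical: which race branch the node belongs to
    status: str


def settle_race(nodes, winning_branch):
    """Return the nodes as they should look after the race has completed.

    Any node outside the winning branch that is still running or pending
    is moved to a terminal status; everything else is left untouched.
    """
    return [
        replace(n, status="cancelled")
        if n.branch != winning_branch and n.status in ACTIVE
        else n
        for n in nodes
    ]


nodes = [
    Node("fast_sql", "a", "completed"),
    Node("wait_signal", "b", "running"),
    Node("after_signal_sql", "b", "pending"),
]
settled = settle_race(nodes, winning_branch="a")
for n in settled:
    print(n.node_id, n.status)
```

With this model, the signal wait and its downstream SQL node in the losing branch both end up `cancelled`, while the winner's node keeps its `completed` status.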

Actual behavior

df.status(instance_id) reports the race workflow as completed, while df.instance_nodes still shows losing-branch nodes such as signal waits or downstream SQL nodes as running or pending.

Repro shape

  1. Start a workflow using df.race(...) with one branch that wins and another branch that waits or has downstream nodes.
  2. Wait for the parent workflow to complete.
  3. Query df.instance_nodes for the instance.
  4. Observe losing-branch nodes still reported as running or pending.

Impact

Diagnostics and dashboards that use df.instance_nodes to find stuck work can report ghost in-flight work for workflows that have already completed.
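
Until this is fixed, a dashboard could cross-check node rows against the instance status before flagging stuck work. The sketch below assumes the rows have already been fetched and that each row is a dict with a `status` key; that row shape, the `ghost_nodes` helper, and the node names are assumptions for illustration, not the actual df.instance_nodes schema.

```python
# Node statuses this issue reports as misleading after completion.
ACTIVE_NODE_STATUSES = {"running", "pending"}


def ghost_nodes(instance_status, node_rows, terminal_statuses=("completed",)):
    """Return node rows that look active although the instance is finished.

    instance_status: the value reported for the workflow (assumed to be a string).
    node_rows: iterable of dicts with at least a "status" key (assumed shape).
    """
    if instance_status not in terminal_statuses:
        # The workflow is still in flight, so active nodes are legitimate.
        return []
    return [row for row in node_rows if row["status"] in ACTIVE_NODE_STATUSES]


rows = [
    {"node": "fast_sql", "status": "completed"},
    {"node": "wait_signal", "status": "running"},
    {"node": "after_signal_sql", "status": "pending"},
]
ghosts = ghost_nodes("completed", rows)
print([row["node"] for row in ghosts])
```

Rows returned here can be excluded from "stuck work" alerts, since the parent workflow has already completed.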
