fix: workflow incorrectly marked as completed while nodes are still executing by tomerqodo · Pull Request #26 · qodo-benchmark/dify-combined-coderabbit

tomerqodo · 2026-01-22T03:36:21Z

Benchmark PR from qodo-benchmark#441

Summary by CodeRabbit

Release Notes

Bug Fixes
- Fixed skip propagation handling in workflow graphs to properly manage downstream node execution state.
Tests
- Added comprehensive test coverage for skip propagation behavior across various edge states and recursive propagation scenarios.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

…s node is skipped

coderabbitai · 2026-01-22T03:36:55Z

Walkthrough

This change modifies the skip propagation logic in the graph traversal engine to invoke start_execution for downstream nodes when incoming edges are TAKEN, and reorders operations in _propagate_skip_to_node to recursively propagate skips before marking edges as skipped. Comprehensive unit tests verify the propagation behavior across unknown, taken, and skipped edge states.

Changes

Cohort / File(s)	Summary
Core Skip Propagation Logic `api/core/workflow/graph_engine/graph_traversal/skip_propagator.py`	Modified `propagate_skip_from_edge` to invoke `start_execution` and enqueue downstream node when incoming edge is TAKEN; reordered `_propagate_skip_to_node` to call `propagate_skip_from_edge` recursively before marking edges as skipped.
Test Package & Suite `api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/__init__.py`, `api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/test_skip_propagator.py`	Added package docstring and new test class `TestSkipPropagator` with 6 test methods validating skip propagation across unknown edges, taken edges, all-skipped edges, recursive chains, and mixed edge states using mocked Graph and GraphStateManager instances.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hops through edges, skip by skip,
Reordering the traversal trip!
Taken paths spring to life with care,
While skipped ones propagate with flair. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is minimal and does not follow the required template structure. It lacks a detailed summary, context, motivation, issue link, and test/documentation checklist responses.	Expand the description to include a detailed summary of the fix, link the related issue using 'Fixes #', explain the root cause and solution, and address the checklist items from the template.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main change: fixing a workflow completion issue when nodes are still executing due to lost start_execution calls in skip scenarios.
Docstring Coverage	✅ Passed	Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@api/core/workflow/graph_engine/graph_traversal/skip_propagator.py`:
- Around line 83-86: The bug is that in _propagate_skip_to_node the outgoing
edge is propagated before it is marked skipped, causing propagate_skip_from_edge
to see the edge as UNKNOWN and abort; change the order so you call
self._state_manager.mark_edge_skipped(edge.id) before calling
self.propagate_skip_from_edge(edge.id), mirroring the correct ordering used in
skip_branch_paths (mark then propagate) to ensure downstream analyze_edge_states
sees the edge as SKIPPED.
- Around line 60-65: propagate_skip_from_edge currently calls
propagate_skip_from_edge before marking the edge skipped and may call
start_execution/enqueue_node multiple times while printing; fix by first calling
mark_edge_skipped for the edge (use the same edge identifier used in
propagate_skip_from_edge), then call propagate_skip_from_edge; replace the
print(f"Starting execution for node: {downstream_node_id}") with a logger call;
and guard duplicate enqueue by checking the node's ready/enqueued state via
_state_manager before invoking _state_manager.enqueue_node(downstream_node_id)
(or ensure the ready queue deduplicates) so
_state_manager.start_execution(downstream_node_id) and enqueue are only invoked
once per node transition.

🧹 Nitpick comments (1)

api/core/workflow/graph_engine/graph_traversal/skip_propagator.py (1)
61-63: Use logger for execution flow messaging to maintain consistency with codebase conventions.

The project consistently uses structured logging throughout graph_engine/ (all related modules follow logger = logging.getLogger(__name__)). The print() statement at line 62 is the only instance in the entire directory and bypasses log configuration. Replace with logger.debug() to match the established pattern.
♻️ Suggested change
+import logging
+
+logger = logging.getLogger(__name__)
@@
-            print(f"Starting execution for node: {downstream_node_id}")
+            logger.debug("Starting execution for node: %s", downstream_node_id)

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b1febb2 and 3a2bbf9.

📒 Files selected for processing (3)

api/core/workflow/graph_engine/graph_traversal/skip_propagator.py
api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/__init__.py
api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/test_skip_propagator.py

🧰 Additional context used

🧬 Code graph analysis (1)

api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/test_skip_propagator.py (2)

api/core/workflow/graph/graph.py (2)

Graph (36-372)

get_incoming_edges (364-372)

api/core/workflow/graph_engine/graph_traversal/skip_propagator.py (4)

SkipPropagator (14-97)

propagate_skip_from_edge (37-69)

_propagate_skip_to_node (71-86)

skip_branch_paths (88-97)

🔇 Additional comments (4)

api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/__init__.py (1)

1-1: Nice touch adding the package docstring.
Keeps the test package self-describing.

api/tests/unit_tests/core/workflow/graph_engine/graph_traversal/test_skip_propagator.py (3)

1-114: Solid coverage for unknown/taken/all-skipped edge states.
The tests exercise the core branch conditions well and validate key state-manager interactions.

115-185: Good coverage for skip propagation and branch skipping.
These tests capture the expected calls for node/edge skipping and propagation paths.

186-307: Recursive and mixed-state scenarios are well exercised.
Nice to see multi-step propagation and state reset behavior covered.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-22T03:43:40Z

api/core/workflow/graph_engine/graph_traversal/skip_propagator.py

        if edge_states["has_taken"]:
-            # Enqueue node
+            # Start execution and enqueue node
+            print(f"Starting execution for node: {downstream_node_id}")
+            self._state_manager.start_execution(downstream_node_id)
            self._state_manager.enqueue_node(downstream_node_id)
            return


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n -P 'class GraphStateManager\b' -C 2

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 222

🏁 Script executed:

rg -n -P 'def (start_execution|enqueue_node|analyze_edge_states)\b' -C 3

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 222

🏁 Script executed:

rg -n -P 'def (is_.*enqueued|is_.*executing|has_.*execution)\b' -C 3

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 676

🏁 Script executed:

# Also find the skip_propagator file and examine the full context find . -name "skip_propagator.py" -type f

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 150

🏁 Script executed:

cat -n ./api/core/workflow/graph_engine/graph_traversal/skip_propagator.py

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 3859

🏁 Script executed:

cat -n ./api/core/workflow/graph_engine/graph_state_manager.py | head -100

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 3585

🏁 Script executed:

rg -n -A 20 "def start_execution\b" ./api/core/workflow/graph_engine/graph_state_manager.py

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 748

🏁 Script executed:

rg -n -A 20 "def enqueue_node\b" ./api/core/workflow/graph_engine/graph_state_manager.py

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 793

Fix edge-state ordering and guard against duplicate node start/enqueue calls.

Line 85–86: propagate_skip_from_edge is called before mark_edge_skipped, leaving the edge in UNKNOWN state. When the downstream node analyzes its incoming edges, it encounters this UNKNOWN edge and returns early (line 56–57), blocking skip propagation. Reverse the order: mark the edge as skipped first.

Line 60–65: If a node has multiple incoming edges and propagate_skip_from_edge is called for each, the condition has_taken=True will trigger start_execution + enqueue_node multiple times. Although start_execution uses an idempotent set, enqueue_node calls _ready_queue.put() without deduplication, risking duplicate queue entries. Either guard by checking node state before enqueue, or verify the queue deduplicates on put.

Line 62: Replace print() with logging.

🤖 Prompt for AI Agents

In `@api/core/workflow/graph_engine/graph_traversal/skip_propagator.py` around lines 60 - 65, propagate_skip_from_edge currently calls propagate_skip_from_edge before marking the edge skipped and may call start_execution/enqueue_node multiple times while printing; fix by first calling mark_edge_skipped for the edge (use the same edge identifier used in propagate_skip_from_edge), then call propagate_skip_from_edge; replace the print(f"Starting execution for node: {downstream_node_id}") with a logger call; and guard duplicate enqueue by checking the node's ready/enqueued state via _state_manager before invoking _state_manager.enqueue_node(downstream_node_id) (or ensure the ready queue deduplicates) so _state_manager.start_execution(downstream_node_id) and enqueue are only invoked once per node transition.

coderabbitai · 2026-01-22T03:43:40Z

api/core/workflow/graph_engine/graph_traversal/skip_propagator.py

        for edge in outgoing_edges:
-            self._state_manager.mark_edge_skipped(edge.id)
            # Recursively propagate skip
            self.propagate_skip_from_edge(edge.id)
+            self._state_manager.mark_edge_skipped(edge.id)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # First, let's find and examine the file and the specific lines find . -name "skip_propagator.py" -type f | head -5

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 222

🏁 Script executed:

#!/bin/bash # Read the file to understand the context around lines 83-86 cat -n api/core/workflow/graph_engine/graph_traversal/skip_propagator.py | sed -n '70,100p'

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 1268

🏁 Script executed:

#!/bin/bash # Search for the propagate_skip_from_edge method implementation rg -n "def propagate_skip_from_edge" -A 30 api/core/workflow/graph_engine/graph_traversal/skip_propagator.py

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 1418

🏁 Script executed:

#!/bin/bash # Search for analyze_edge_states to understand how it treats unset edge states rg -n "def analyze_edge_states" -A 20

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 1972

🏁 Script executed:

#!/bin/bash # Check how mark_edge_skipped works rg -n "def mark_edge_skipped" -A 10

Repository: qodo-benchmark/dify-combined-coderabbit

Length of output: 1061

Mark edge skipped before recursing into downstream propagation.

In _propagate_skip_to_node, calling propagate_skip_from_edge before mark_edge_skipped causes the edge to remain in UNKNOWN state during downstream analysis. When propagate_skip_from_edge retrieves the downstream node's incoming edges and calls analyze_edge_states, it detects the current edge as UNKNOWN, triggering an early return (line 56-57 in propagate_skip_from_edge). This blocks deeper skip propagation—a correctness regression.

The correct pattern is already implemented in skip_branch_paths (lines 95-97), where the edge is marked before propagation. Apply the same order here.

Suggested fix

for edge in outgoing_edges: - # Recursively propagate skip - self.propagate_skip_from_edge(edge.id) - self._state_manager.mark_edge_skipped(edge.id) + self._state_manager.mark_edge_skipped(edge.id) + # Recursively propagate skip + self.propagate_skip_from_edge(edge.id)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for edge in outgoing_edges:

self._state_manager.mark_edge_skipped(edge.id)

# Recursively propagate skip

self.propagate_skip_from_edge(edge.id)

self._state_manager.mark_edge_skipped(edge.id)

for edge in outgoing_edges:

self._state_manager.mark_edge_skipped(edge.id)

# Recursively propagate skip

self.propagate_skip_from_edge(edge.id)

🤖 Prompt for AI Agents

In `@api/core/workflow/graph_engine/graph_traversal/skip_propagator.py` around lines 83 - 86, The bug is that in _propagate_skip_to_node the outgoing edge is propagated before it is marked skipped, causing propagate_skip_from_edge to see the edge as UNKNOWN and abort; change the order so you call self._state_manager.mark_edge_skipped(edge.id) before calling self.propagate_skip_from_edge(edge.id), mirroring the correct ordering used in skip_branch_paths (mark then propagate) to ensure downstream analyze_edge_states sees the edge as SKIPPED.

sai and others added 4 commits January 21, 2026 15:55

Fix lost start_execution after enqueue_node when the last previou…

dd89885

…s node is skipped

Add unit test

52481e7

[autofix.ci] apply automated fixes

305341a

update pr

3a2bbf9

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: workflow incorrectly marked as completed while nodes are still executing#26

tomerqodo commented Jan 22, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 22, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 22, 2026

Uh oh!

coderabbitai bot Jan 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tomerqodo commented Jan 22, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tomerqodo commented Jan 22, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 22, 2026 •

edited

Loading