Skip to content

Add tests for TryNode wrapping non-streaming InlinePromptNode with PROVIDER_ERROR#3738

Closed
devin-ai-integration[bot] wants to merge 2 commits intomainfrom
devin/1775486812-try-node-provider-error-hang-repro
Closed

Add tests for TryNode wrapping non-streaming InlinePromptNode with PROVIDER_ERROR#3738
devin-ai-integration[bot] wants to merge 2 commits intomainfrom
devin/1775486812-try-node-provider-error-hang-repro

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented Apr 6, 2026

Summary

Adds two tests for TryNode wrapping a non-streaming (stream_enabled=False) InlinePromptNode that encounters a PROVIDER_ERROR. Both tests use @pytest.mark.timeout(5) to detect hangs.

  1. test_try_node__prompt_node_non_streaming_provider_error__workflow_completes — Mocks adhoc_execute_prompt to return a RejectedAdHocExecutePromptEvent with PROVIDER_ERROR. Asserts the workflow fulfills with the error captured in TryNode.Outputs.error (including raw_data={}).

  2. test_try_node__prompt_node_non_streaming_connection_error__workflow_completes — Mocks adhoc_execute_prompt to raise a raw ConnectionResetError. Asserts the workflow fulfills, verifies adhoc_execute_prompt was called, and checks the error code (NODE_EXECUTION) and message content.

Both tests also assert that vellum_adhoc_prompt_client.adhoc_execute_prompt was called exactly once, confirming the non-streaming API path was exercised.

⚠️ Important: Both tests currently PASS on main. The reported hang has not been reproduced yet. Multiple variations were attempted (streaming vs non-streaming, REJECTED event vs raw exception, with and without on_error_code) and all pass. More details about the exact production conditions may be needed to reproduce the hang.

Updates since last revision

  • Fixed raw_data={} assertion mismatch that caused CI failure (main's WorkflowError now includes raw_data and stacktrace fields)
  • Corrected connection error test to assert NODE_EXECUTION error code (not INTERNAL_ERROR) and verify message content
  • Added assert_called_once() on adhoc_execute_prompt in both tests to confirm the non-streaming path was actually invoked (per review feedback)
  • Rebased on latest main

Review & Testing Checklist for Human

  • These tests pass, not fail — the user asked for tests that fail on main to prove the bug is reproduced. This PR does NOT achieve that. Determine whether these tests still capture the right scenario or if the reproduction conditions are different.
  • Verify whether the mock setup (adhoc_execute_prompt.return_value = RejectedAdHocExecutePromptEvent(...)) accurately reflects what happens in production when a non-streaming prompt gets a PROVIDER_ERROR. Could the real failure path involve the streaming endpoint being called despite stream_enabled=False?
  • Consider whether the hang might require a downstream node consuming TryNode's results output, or whether it only manifests when the workflow is invoked via Vembda rather than the SDK's WorkflowRunner.

Notes

  • StateMeta.__deepcopy__ preserves Queue references without deep-copying them. If a streaming output Queue never receives the undefined sentinel (e.g., due to an exception mid-stream), and something later resolves that Queue, it would block forever. However, in all tested scenarios the cleanup path handles this correctly.
  • The non-streaming code path in BaseInlinePromptNode._process_prompt_event_stream skips the next(prompt_event_stream) call (INITIATED event skip) when stream_enabled=False, but this doesn't appear to cause issues in the tested scenarios.

Link to Devin session: https://app.devin.ai/sessions/ee4bcf35c06e4c85b305580388d3d35d
Requested by: @dvargas92495

@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e6adffce4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

assert terminal_event.name == "workflow.execution.fulfilled"

# AND the error output is present
assert terminal_event.outputs.error is not None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Assert non-streaming call in connection-reset test

This assertion is too weak to prove the intended regression path: TryNode.wrap() (without on_error_code) will fulfill the workflow with some error for any subworkflow rejection, so the test can pass even if InlinePromptNode never calls adhoc_execute_prompt (for example, if it accidentally uses the streaming path or raises a different internal error). To make this test actionable for future regressions, assert that vellum_adhoc_prompt_client.adhoc_execute_prompt was called and validate the captured error details/code, not just non-nullity.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I've added vellum_adhoc_prompt_client.adhoc_execute_prompt.assert_called_once() to both tests, and strengthened the connection-reset test to also assert the error code (NODE_EXECUTION) and that the message contains "Connection reset by peer".

devin-ai-integration Bot and others added 2 commits April 6, 2026 15:54
…OVIDER_ERROR

Tests cover two scenarios:
1. Non-streaming API returns REJECTED event with PROVIDER_ERROR
2. Non-streaming API raises raw ConnectionResetError

Both tests use pytest.mark.timeout(5) to detect hangs.

Co-Authored-By: vargas@vellum.ai <vargas@vellum.ai>
…as called

Co-Authored-By: vargas@vellum.ai <vargas@vellum.ai>
@devin-ai-integration devin-ai-integration Bot force-pushed the devin/1775486812-try-node-provider-error-hang-repro branch from 5e6adff to 5bea144 Compare April 6, 2026 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant