fix: use ambient tokio runtime in ParquetObjectReader::spawn (#8231)#10168
Open
sandy-sachin7 wants to merge 1 commit into
Open
fix: use ambient tokio runtime in ParquetObjectReader::spawn (#8231)#10168sandy-sachin7 wants to merge 1 commit into
sandy-sachin7 wants to merge 1 commit into
Conversation
…8231) When ParquetObjectReader is constructed without an explicit runtime, spawn() would run async closures inline — this breaks object store implementations (S3, HTTP) that rely on tokio::spawn for connection pooling and DNS resolution. The fix tries Handle::try_current() to discover an ambient tokio runtime before falling back to inline execution.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #8231.
Rationale for this change
When
ParquetObjectReaderis constructed without an explicitruntime(i.e., viaParquetRecordBatchStreamBuilder::new(reader)), thespawnmethod falls through to theNonebranch and runs async closures inline. For object store implementations (S3, HTTP, GCS viaobject_storecrate), this breaks connection pooling and DNS resolution because the underlyingreqwestclient relies ontokio::spawnto propagate the runtime context.The result is a panic: "there is no reactor running, must be called from the context of a Tokio 1.x runtime"
What changes are included in this PR?
Changed
ParquetObjectReader::spawnto first checkHandle::try_current()whenself.runtimeisNone, discovering an ambient tokio runtime if one exists. Only falls back to inline execution when no tokio runtime context is available at all.Before:
After:
Are these changes tested?
1166 parquet lib tests pass. No new tests added as the existing test infrastructure doesn't exercise the
spawn()method with async object store implementations (those are integration-level tests).Are there any user-facing changes?
ParquetObjectReaderwill now correctly discover a tokio runtime when used inside#[tokio::main]or any other tokio runtime context, even without an explicit runtime being configured.