-
Notifications
You must be signed in to change notification settings - Fork 276
fix: Prevent native write when input is not Arrow format #3227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This test verifies that when native_comet scan doesn't support complex types, the native write correctly falls back to Spark instead of failing at runtime with "Comet execution only takes Arrow Arrays" error. Also refactors the test helper methods to reduce code duplication. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3227 +/- ##
============================================
+ Coverage 56.12% 60.05% +3.92%
- Complexity 976 1442 +466
============================================
Files 119 172 +53
Lines 11743 15941 +4198
Branches 2251 2633 +382
============================================
+ Hits 6591 9573 +2982
- Misses 4012 5038 +1026
- Partials 1140 1330 +190 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
|
||
| case op: DataWritingCommandExec => | ||
| convertToComet(op, CometDataWritingCommand).getOrElse(op) | ||
| // Get the actual data source child that will feed data to the native writer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more generic to do this in CometExecRule.convertToComet -
if (op.children.nonEmpty && !op.children.forall(_.isInstanceOf[CometNativeExec])) {
// For operators like writes, require all children to be native
if (requiresNativeChildren(handler)) {
return None // Fallback to Spark
}
...
}
Move the native children check from the DataWritingCommandExec case into convertToComet method using a requiresNativeChildren flag on CometOperatorSerde. This makes the check more generic so other operators can use it by overriding the flag. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Rationale
Closes #2944
Closes #3215
Summary
Test plan
"native write falls back when scan produces non-Arrow data"in CometParquetWriterSuitenative_cometscan doesn't support complex types, the native write falls back to Spark instead of failing at runtime🤖 Generated with Claude Code