Open
Labels
enhancement (New feature or request)
Description
Currently the metadata validation config (spark.gluten.sql.fallbackUnexpectedMetadataParquet) defaults to false. When it is set to true, we check the files under each root path, up to the file limit (spark.gluten.sql.fallbackUnexpectedMetadataParquet.limit). If there are too many partitions, this validation becomes expensive.
A possible solution is to sample the root paths and validate only a subset of the files.
The number of files to sample should be determined by two values: the total file limit, and the total number of files in the root paths, where the latter is scaled by a sampling percentage.
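
To make the proposal concrete, here is a rough sketch of one possible reading of it, written in Python for brevity rather than as Gluten code. The names `sample_size`, `sample_files`, `files_by_root`, and `sample_ratio` are hypothetical and do not exist in Gluten. It also assumes the file lists per root path are already known, which the real implementation may want to avoid.

```python
import math
import random


def sample_size(total_files, file_limit, sample_ratio):
    """Number of files to validate: a ratio of the total, capped by the limit.

    Assumption: "decided by the percentage" means total_files * sample_ratio.
    """
    if total_files <= 0 or file_limit <= 0 or sample_ratio <= 0:
        return 0
    by_ratio = math.ceil(total_files * sample_ratio)
    return min(file_limit, by_ratio, total_files)


def sample_files(files_by_root, file_limit, sample_ratio, seed=42):
    """Visit root paths in a seeded random order and collect files
    until the computed sample size is reached."""
    total = sum(len(files) for files in files_by_root.values())
    target = sample_size(total, file_limit, sample_ratio)

    rng = random.Random(seed)      # seeded so that the choice is reproducible
    roots = sorted(files_by_root)  # stable order before shuffling
    rng.shuffle(roots)

    picked = []
    for root in roots:
        for path in files_by_root[root]:
            if len(picked) >= target:
                return picked
            picked.append(path)
    return picked
```

With a limit of 10 and a ratio of 0.5, a table with 1000 files would have 10 files validated (the limit wins), while a table with 6 files would have 3 validated (the ratio wins).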
Gluten version
None