Core, API, Spec: Extend partition statistics support to unpartitioned tables#15896
Core, API, Spec: Extend partition statistics support to unpartitioned tables#15896ebyhr wants to merge 2 commits intoapache:mainfrom
Conversation
9bca81e to
0dd88d5
Compare
0dd88d5 to
da57f42
Compare
da57f42 to
1cda0e7
Compare
format/spec.md
Outdated
| | v1 | v2 | v3 | Field id, name | Type | Description | | ||
| |----|----|----|----------------|------|-------------| | ||
| | _required_ | _required_ | _required_ | **`1 partition`** | `struct<..>` | Partition data tuple, schema based on the unified partition type considering all specs in a table | | ||
| | _required_ | _required_ | _required_ | **`1 partition`** | `struct<..>` | Partition data tuple, schema based on the unified partition type considering all specs in a table, empty for unpartitioned tables | |
There was a problem hiding this comment.
I'm happy to post this change to the mailing list if needed.
There was a problem hiding this comment.
yes. any spec change has to go for vote.
And whats the use case? You want to know data files, delete files, file size for unpartitioned table? We can use metadata tables for that Or maybe the new v4 manifests can solve it (with less I/O)?
Also, since the name itself is partitions stats. It makes odd to generate the file for unpartitioned table.
There was a problem hiding this comment.
@ajantha-bhat Thanks, I'll post after exploring another approach, since Parquet writer can't write an empty struct.
We want to obtain the record count more quickly and efficiently for CBO. As mentioned in the PR description, Trino currently needs to read manifest files to estimate table size.
Even if v4 manifests address this, migration would take a long time. We need a solution that works for tables with format versions < v4 as well.
There was a problem hiding this comment.
It is still a behavior change for existing users.
Partition stats was computed only for partition tables. Now you are enabling for non-partition tables with NULL partition. Plus like I mentioned, it is still odd for non partition table to write this file as it is via compute_partition_stats method. Why does user has to call compute_partition_stats on non-partition table?
I can understand that you need a quick table level stats for CBO in Trino without doing I/O of all the manifests.
a) Can we introduce a new table level stats (along with NDV puffin file) in compute_table_stats? People can still refer it for both partition table and non-partition table if they need whole table level info?
b) Or Can we check if snapshot summary already has this table level stats you are looking for? (we don't have to do multiple IO of the files in that case). If not, can we enhance snapshot summary to include it?
1cda0e7 to
1a0c198
Compare
|
This PR ran into a known issue in the Parquet writer, which cannot handle writing an empty struct. https://issues.apache.org/jira/browse/SPARK-20593 I'm going to change the value to NULL. |
e6578a4 to
638ab9f
Compare
ajantha-bhat
left a comment
There was a problem hiding this comment.
I think we should explore other options as It is still a behavior change for existing users.
Partition stats was computed only for partition tables. Now you are enabling for non-partition tables with NULL partition. Plus like I mentioned, it is still odd for non partition table to write this file as it is via 'compute_partition_stats' method. Why does user has to call compute_partition_stats on non-partition table?
I can understand that you need a quick table level stats for CBO in Trino without doing I/O of all the manifests.
a) Can we introduce a new table level stats (along with NDV puffin file) in compute_table_stats? People can still refer it for both partition table and non-partition table if they need whole table level info?
b) Or Can we check if snapshot summary already has this table level stats you are looking for? (we don't have to do multiple IO of the files in that case). If not, can we enhance snapshot summary to include it?
Proposed Change
Currently, the partition statistics file is only supported for partitioned tables.
iceberg/core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java
Line 207 in b987e60
While this design makes sense since it’s conceptually a “partition” statistics file, the information it contains is also valuable for unpartitioned tables.
For example, the Trino Iceberg connector currently needs to read manifest files (TableStatisticsReader) to build internal table statistics. If Iceberg provided a statistics file for unpartitioned tables as well, Trino (and other query engines) could leverage it directly, improving planning performance by avoiding expensive manifest reads.