17 changes: 17 additions & 0 deletions docs/docs/flink-ddl.md
@@ -45,6 +45,23 @@ The following properties can be set if using the Hive catalog:
* `hive-conf-dir`: Path to a directory containing a `hive-site.xml` configuration file which will be used to provide custom Hive configuration values. The value of `hive.metastore.warehouse.dir` from `<hive-conf-dir>/hive-site.xml` (or hive configure file from classpath) will be overwritten with the `warehouse` value if setting both `hive-conf-dir` and `warehouse` when creating iceberg catalog.
* `hadoop-conf-dir`: Path to a directory containing `core-site.xml` and `hdfs-site.xml` configuration files which will be used to provide custom Hadoop configuration values.
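
For illustration, these options can be combined in a catalog definition like the following sketch (the host names and paths are placeholders):

```sql
CREATE CATALOG hive_catalog WITH (
    'type'='iceberg',
    'catalog-type'='hive',
    'uri'='thrift://metastore-host:9083',
    'warehouse'='hdfs://nn:8020/warehouse/path',
    'hive-conf-dir'='/etc/hive/conf'
);
```

Because both `warehouse` and `hive-conf-dir` are set here, the `warehouse` value overrides `hive.metastore.warehouse.dir` from `/etc/hive/conf/hive-site.xml`, as described above.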

!!! warning "Hive Catalog Limitation"
    The Hive Metastore validates schema changes by comparing column types **positionally**
    (`hive.metastore.disallow.incompatible.col.type.changes`, default `true`). When using a Hive catalog,
    schema evolution operations that change column positions, such as dropping a non-last column or
    reordering columns, may fail regardless of which engine performs the change (Spark, the Flink Java API, etc.).

    To work around this, disable the Hive Metastore (HMS) schema compatibility check by setting
    `hive.metastore.disallow.incompatible.col.type.changes=false`:

    - **Remote HMS:** Set this property in the HMS server's `hive-site.xml`.
    - **Embedded HMS:** Add the equivalent property to the Hive catalog configuration.
**Copilot AI** commented on Apr 4, 2026:

> The "Embedded HMS" workaround is unclear in the context of this Flink Hive catalog section (where `uri` is required and configuration is typically provided via `hive-conf-dir` or the classpath). Consider rephrasing this to explicitly instruct users to set the property in a `hive-site.xml` picked up via `hive-conf-dir` (or the classpath), rather than suggesting an "embedded" metastore configuration path that may not apply.
>
> Suggested change:
>
> ```diff
> - - **Embedded HMS:** Add the equivalent property to the Hive catalog configuration.
> + - **When configuring the Hive catalog in Flink:** Set this property in a `hive-site.xml`
> +   that Flink picks up via `hive-conf-dir` or from the classpath.
> ```

**Contributor Author** replied:

> "Embedded HMS" here means using HMS with Derby, which is launched in the Flink JobManager (JM).

    **Trade-off:** After disabling this check, the Hive engine may no longer be able to read the table
    correctly, due to the schema mismatch in the Hive Metastore. Iceberg-aware engines (Spark, Flink,
    Trino, etc.) will continue to work correctly, as they read the schema from Iceberg metadata rather
    than from the Hive Metastore.
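
As a concrete sketch of the remote-HMS workaround, the property would be added to the metastore server's `hive-site.xml` like this:

```xml
<property>
  <name>hive.metastore.disallow.incompatible.col.type.changes</name>
  <value>false</value>
</property>
```

The metastore service must be restarted for the change to take effect.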

#### Hadoop catalog

Iceberg also supports a directory-based catalog in HDFS that can be configured using `'catalog-type'='hadoop'`:
35 changes: 35 additions & 0 deletions docs/docs/spark-ddl.md
@@ -170,6 +170,27 @@ Iceberg has full `ALTER TABLE` support in Spark 3, including:

In addition, [SQL extensions](spark-configuration.md#sql-extensions) can be used to add support for partition evolution and setting a table's write order.

!!! warning "Hive Catalog Limitation"
    The Hive Metastore (HMS) validates schema changes by comparing column types **positionally**
    (`hive.metastore.disallow.incompatible.col.type.changes`, default `true`). Any schema evolution
    operation that shifts column positions will fail when using a Hive catalog. Affected operations
    include:

    - `ADD COLUMN` with a `FIRST` or `AFTER` clause
    - `ALTER COLUMN` with a `FIRST` or `AFTER` clause (reordering)
    - `DROP COLUMN` on a non-last column

    To work around this, disable the HMS schema compatibility check by setting
    `hive.metastore.disallow.incompatible.col.type.changes=false`:

    - **Remote HMS:** Set this property in the HMS server's `hive-site.xml`.
    - **Embedded HMS:** Pass `--conf spark.hadoop.hive.metastore.disallow.incompatible.col.type.changes=false` when starting Spark.

    **Trade-off:** After disabling this check, the Hive engine may no longer be able to read the table
    correctly, due to the schema mismatch in the Hive Metastore. Iceberg-aware engines (Spark, Flink,
    Trino, etc.) will continue to work correctly, as they read the schema from Iceberg metadata rather
    than from the HMS.
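
For example, with an embedded HMS the flag from the bullet above can be passed when launching a Spark SQL shell (a sketch; adapt to whichever launcher you use):

```
# Disable the HMS positional schema check for this Spark session (embedded HMS)
spark-sql --conf spark.hadoop.hive.metastore.disallow.incompatible.col.type.changes=false
```

The `spark.hadoop.` prefix forwards the property into the Hadoop/Hive configuration used by the embedded metastore.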

### `ALTER TABLE ... RENAME TO`

```sql
@@ -259,6 +280,11 @@ ALTER TABLE prod.db.sample
ADD COLUMN nested.new_column bigint FIRST;
```

!!! warning "Hive Catalog Limitation"
    When using a Hive catalog, adding a column with `FIRST` or `AFTER` may fail due to HMS positional
    schema validation. See the Hive Catalog Limitation warning above for details and the workaround.

### `ALTER TABLE ... RENAME COLUMN`

Iceberg allows any field to be renamed. To rename a field, use `RENAME COLUMN`:
@@ -302,6 +328,10 @@ ALTER TABLE prod.db.sample ALTER COLUMN col FIRST;
ALTER TABLE prod.db.sample ALTER COLUMN nested.col AFTER other_col;
```

!!! warning "Hive Catalog Limitation"
    When using a Hive catalog, reordering columns may fail due to HMS positional schema validation.
    See the Hive Catalog Limitation warning above for details and the workaround.

Nullability for a non-nullable column can be changed using `DROP NOT NULL`:

```sql
@@ -323,6 +353,11 @@ ALTER TABLE prod.db.sample DROP COLUMN id;
ALTER TABLE prod.db.sample DROP COLUMN point.z;
```

!!! warning "Hive Catalog Limitation"
    When using a Hive catalog, dropping a non-last column may fail due to HMS positional schema
    validation. See the Hive Catalog Limitation warning above for details and the workaround.

## `ALTER TABLE` SQL extensions

These commands are available in Spark 3 when using Iceberg [SQL extensions](spark-configuration.md#sql-extensions).