Skip to content

[#11059] feat(iceberg-rest): Support overwrite on registerTable endpoint#11809

Merged
roryqi merged 7 commits into
apache:mainfrom
lasdf1234:fix/irc-register-table-overwrite
Jul 1, 2026
Merged

[#11059] feat(iceberg-rest): Support overwrite on registerTable endpoint#11809
roryqi merged 7 commits into
apache:mainfrom
lasdf1234:fix/irc-register-table-overwrite

Conversation

@lasdf1234

@lasdf1234 lasdf1234 commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

What changes were proposed in this pull request?

Implement Iceberg REST POST .../namespaces/{ns}/register support for RegisterTableRequest.overwrite=true:

  • Add RegisterTableOverwrite with CAS metastore pointer swap for JDBC, memory, and Hive backends (does not write a new metadata file via TableOperations.commit).
  • Wire 3-arg registerTable(..., overwrite) through JdbcCatalogWithMetadataLocationSupport, MemoryCatalogWithMetadataLocationSupport, and HiveCatalogWithMetadataLocationSupport.
  • Pass request.overwrite() through FederatedCatalogWrapper (federation path previously ignored the flag).
  • In IcebergNamespaceHookDispatcher, skip Gravitino entity import and ownership assignment when overwrite=true so existing table_id and role/owner/tag/policy bindings are preserved.

Why are the changes needed?

Fix: #11059

Without overwrite=true, re-registering an existing table requires drop + register, which is non-atomic and can mint a new Gravitino table entity, orphaning bindings keyed on the old table_id.

Does this PR introduce any user-facing change?

Yes. IRC clients can now pass overwrite=true on registerTable to atomically repoint an existing table registration at a new metadata.json location.

How was this patch tested?

  • ./gradlew :iceberg:iceberg-common:test --tests org.apache.iceberg.jdbc.TestJdbcCatalogWithMetadataLocationSupport --tests org.apache.iceberg.memory.TestMemoryCatalogWithMetadataLocationSupport -PskipITs
  • ./gradlew :iceberg:iceberg-rest-server:test --tests org.apache.gravitino.iceberg.service.TestCatalogWrapperForREST --tests org.apache.gravitino.iceberg.service.dispatcher.TestIcebergNamespaceHookDispatcher -PskipITs

… endpoint

Implement registerTable(..., overwrite=true) for JDBC, memory, and Hive
backends via metastore pointer swap, pass overwrite through federation,
and skip Gravitino entity import/ownership on overwrite to preserve
table_id and existing bindings.

Co-authored-by: Cursor <cursoragent@cursor.com>
@lasdf1234 lasdf1234 self-assigned this Jun 26, 2026
…write into MetastoreRegisterTableUtils

Replace RegisterTableOverwrite and per-backend PointerSwap classes with shared
MetastoreRegisterTableUtils and backend-specific overwriteMetadataLocation hooks.

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown

Code Coverage Report

Overall Project 67.23% -0.12% 🟢
Files changed 60.14% 🟢

Module Coverage
aliyun 1.72% 🔴
api 46.82% 🟢
authorization-common 85.96% 🟢
aws 26.5% 🔴
azure 2.47% 🔴
catalog-common 10.4% 🔴
catalog-fileset 80.23% 🟢
catalog-glue 66.91% 🟢
catalog-hive 79.42% 🟢
catalog-jdbc-clickhouse 80.02% +4.04% 🟢
catalog-jdbc-common 44.22% 🟢
catalog-jdbc-doris 80.28% 🟢
catalog-jdbc-hologres 54.03% 🟢
catalog-jdbc-mysql 79.23% 🟢
catalog-jdbc-oceanbase 80.91% 🟢
catalog-jdbc-postgresql 82.29% 🟢
catalog-jdbc-starrocks 78.51% 🟢
catalog-kafka 77.01% 🟢
catalog-lakehouse-generic 58.53% 🟢
catalog-lakehouse-hudi 79.1% 🟢
catalog-lakehouse-iceberg 85.86% 🟢
catalog-lakehouse-paimon 82.14% 🟢
catalog-model 77.72% 🟢
cli 44.51% 🟢
client-java 78.01% 🟢
common 50.17% 🟢
core 82.59% 🟢
filesystem-hadoop3 77.3% 🟢
flink 0.0% 🔴
flink-common 47.12% 🟢
flink-runtime 0.0% 🔴
gcp 14.12% 🔴
hadoop-common 10.88% 🔴
hive-metastore-common 53.77% 🟢
iceberg-common 56.31% -1.7% 🟢
iceberg-rest-server 73.94% +1.08% 🟢
idp-basic 85.71% 🟢
integration-test-common 0.0% 🔴
jobs 66.17% 🟢
lance-common 20.81% 🔴
lance-rest-server 64.84% 🟢
lineage 53.02% 🟢
optimizer 83.17% 🟢
optimizer-api 21.95% 🔴
server 85.96% 🟢
server-common 74.18% 🟢
spark 28.57% 🔴
spark-common 41.66% -9.31% 🟢
trino-connector 40.29% 🟢
Files
Module File Coverage
catalog-jdbc-clickhouse ClickHouseTableOperations.java 83.57% 🟢
iceberg-common MetastoreRegisterTableUtils.java 95.24% 🟢
MemoryCatalogWithMetadataLocationSupport.java 81.25% 🟢
JdbcCatalogWithMetadataLocationSupport.java 73.68% 🟢
HiveCatalogWithMetadataLocationSupport.java 11.25% 🔴
iceberg-rest-server IcebergNamespaceHookDispatcher.java 90.54% 🟢
FederatedCatalogWrapper.java 87.02% 🟢
spark-common GravitinoDriverPlugin.java 15.15% 🔴
BaseCatalog.java 2.73% 🔴

@lasdf1234 lasdf1234 marked this pull request as ready for review June 28, 2026 12:49
@lasdf1234 lasdf1234 marked this pull request as draft June 28, 2026 12:49
Align metastore overwrite helpers with Iceberg naming, inline Hive HMS overwrite logic, and add IcebergRESTHiveCatalogIT for overwrite on Hive backend.

Co-authored-by: Cursor <cursoragent@cursor.com>
@lasdf1234 lasdf1234 marked this pull request as ready for review June 28, 2026 14:54
@lasdf1234 lasdf1234 requested a review from roryqi June 29, 2026 08:22
@lasdf1234

Copy link
Copy Markdown
Collaborator Author

@roryqi Could help me to review this code? Thans.

@roryqi

roryqi commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Maybe we can use the CatalogHandlers.registerTable(overwrite = true) directly.

@lasdf1234

Copy link
Copy Markdown
Collaborator Author

Maybe we can use the CatalogHandlers.registerTable(overwrite = true) directly.

The CatalogWrapperForREST meets the requirements. The FederatedCatalogWrapper has already modified the code.

@roryqi

roryqi commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Maybe we can use the CatalogHandlers.registerTable(overwrite = true) directly.

The CatalogWrapperForREST meets the requirements. The FederatedCatalogWrapper has already modified the code.

You can only change the FederatedCatalogWrapper code.

Use CatalogHandlers.registerTable for overwrite propagation and request validation, then loadTable to build the federation FileIO response.

Co-authored-by: Cursor <cursoragent@cursor.com>
@lasdf1234

Copy link
Copy Markdown
Collaborator Author

Maybe we can use the CatalogHandlers.registerTable(overwrite = true) directly.

The CatalogWrapperForREST meets the requirements. The FederatedCatalogWrapper has already modified the code.

You can only change the FederatedCatalogWrapper code.

Got code has been modified.

@lasdf1234

Copy link
Copy Markdown
Collaborator Author

@the code has been revised. If you have time, please review it. Thank you.

@lasdf1234 lasdf1234 added the branch-1.3 Automatically cherry-pick commit to branch-1.3 label Jun 30, 2026
if (registerTableRequest.overwrite()) {
// overwrite=true updates the Iceberg metadata pointer in place. The Gravitino table entity,
// table_id, and existing role/owner/tag/policy bindings must be preserved.
return response;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to handle this case if table doesn't exist and overwrite is true?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got resolved.

ops instanceof BaseMetastoreTableOperations,
"Cannot register table %s: table operations are not metastore-backed",
identifier);
BaseMetastoreTableOperations metastoreOps = (BaseMetastoreTableOperations) ops;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should u use TableOperations to commit new location here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the commit method will generate a new metadata.json file.
Overwrite should be avoided so that no new metadata.json file is created.

@roryqi

roryqi commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Why do u add the label 1.3?

@lasdf1234 lasdf1234 removed the branch-1.3 Automatically cherry-pick commit to branch-1.3 label Jun 30, 2026
… when missing

Only skip Gravitino hooks on overwrite when the table entity already exists,
so Iceberg-only tables still get imported and assigned an owner.

Co-authored-by: Cursor <cursoragent@cursor.com>
@lasdf1234 lasdf1234 requested a review from roryqi June 30, 2026 08:39
@lasdf1234

Copy link
Copy Markdown
Collaborator Author

Why do u add the label 1.3?

Got remove it


if (registerTableRequest.overwrite()) {
TableDispatcher tableDispatcher = GravitinoEnv.getInstance().internalTableDispatcher();
if (tableDispatcher != null

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could u avoid using tableDispatcher? You can use Iceberg related classes.
exists will always return true.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got. Use EntityStore to check if table exists.

…y exists

Use EntityStore to detect an existing table entity instead of tableDispatcher
or Iceberg backend tableExists, and skip import/owner only when Gravitino
already tracks the table regardless of overwrite flag.

Co-authored-by: Cursor <cursoragent@cursor.com>
@lasdf1234 lasdf1234 requested a review from roryqi June 30, 2026 11:14

@roryqi roryqi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@roryqi roryqi merged commit 6ce2b61 into apache:main Jul 1, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Improvement] Support overwrite on Iceberg REST registerTable endpoint

2 participants