[#11059] feat(iceberg-rest): Support overwrite on registerTable endpoint#11809
Conversation
… endpoint Implement registerTable(..., overwrite=true) for JDBC, memory, and Hive backends via metastore pointer swap, pass overwrite through federation, and skip Gravitino entity import/ownership on overwrite to preserve table_id and existing bindings. Co-authored-by: Cursor <cursoragent@cursor.com>
…write into MetastoreRegisterTableUtils Replace RegisterTableOverwrite and per-backend PointerSwap classes with shared MetastoreRegisterTableUtils and backend-specific overwriteMetadataLocation hooks. Co-authored-by: Cursor <cursoragent@cursor.com>
Code Coverage Report
Files
|
Align metastore overwrite helpers with Iceberg naming, inline Hive HMS overwrite logic, and add IcebergRESTHiveCatalogIT for overwrite on Hive backend. Co-authored-by: Cursor <cursoragent@cursor.com>
|
@roryqi Could help me to review this code? Thans. |
|
Maybe we can use the CatalogHandlers.registerTable(overwrite = true) directly. |
The CatalogWrapperForREST meets the requirements. The FederatedCatalogWrapper has already modified the code. |
You can only change the FederatedCatalogWrapper code. |
Use CatalogHandlers.registerTable for overwrite propagation and request validation, then loadTable to build the federation FileIO response. Co-authored-by: Cursor <cursoragent@cursor.com>
Got code has been modified. |
|
@the code has been revised. If you have time, please review it. Thank you. |
| if (registerTableRequest.overwrite()) { | ||
| // overwrite=true updates the Iceberg metadata pointer in place. The Gravitino table entity, | ||
| // table_id, and existing role/owner/tag/policy bindings must be preserved. | ||
| return response; |
There was a problem hiding this comment.
How to handle this case if table doesn't exist and overwrite is true?
| ops instanceof BaseMetastoreTableOperations, | ||
| "Cannot register table %s: table operations are not metastore-backed", | ||
| identifier); | ||
| BaseMetastoreTableOperations metastoreOps = (BaseMetastoreTableOperations) ops; |
There was a problem hiding this comment.
Should u use TableOperations to commit new location here?
There was a problem hiding this comment.
Using the commit method will generate a new metadata.json file.
Overwrite should be avoided so that no new metadata.json file is created.
|
Why do u add the label |
… when missing Only skip Gravitino hooks on overwrite when the table entity already exists, so Iceberg-only tables still get imported and assigned an owner. Co-authored-by: Cursor <cursoragent@cursor.com>
Got remove it |
|
|
||
| if (registerTableRequest.overwrite()) { | ||
| TableDispatcher tableDispatcher = GravitinoEnv.getInstance().internalTableDispatcher(); | ||
| if (tableDispatcher != null |
There was a problem hiding this comment.
Could u avoid using tableDispatcher? You can use Iceberg related classes.
exists will always return true.
There was a problem hiding this comment.
Got. Use EntityStore to check if table exists.
…y exists Use EntityStore to detect an existing table entity instead of tableDispatcher or Iceberg backend tableExists, and skip import/owner only when Gravitino already tracks the table regardless of overwrite flag. Co-authored-by: Cursor <cursoragent@cursor.com>
What changes were proposed in this pull request?
Implement Iceberg REST
POST .../namespaces/{ns}/registersupport forRegisterTableRequest.overwrite=true:RegisterTableOverwritewith CAS metastore pointer swap for JDBC, memory, and Hive backends (does not write a new metadata file viaTableOperations.commit).registerTable(..., overwrite)throughJdbcCatalogWithMetadataLocationSupport,MemoryCatalogWithMetadataLocationSupport, andHiveCatalogWithMetadataLocationSupport.request.overwrite()throughFederatedCatalogWrapper(federation path previously ignored the flag).IcebergNamespaceHookDispatcher, skip Gravitino entity import and ownership assignment whenoverwrite=trueso existingtable_idand role/owner/tag/policy bindings are preserved.Why are the changes needed?
Fix: #11059
Without
overwrite=true, re-registering an existing table requires drop + register, which is non-atomic and can mint a new Gravitino table entity, orphaning bindings keyed on the oldtable_id.Does this PR introduce any user-facing change?
Yes. IRC clients can now pass
overwrite=trueon registerTable to atomically repoint an existing table registration at a new metadata.json location.How was this patch tested?
./gradlew :iceberg:iceberg-common:test --tests org.apache.iceberg.jdbc.TestJdbcCatalogWithMetadataLocationSupport --tests org.apache.iceberg.memory.TestMemoryCatalogWithMetadataLocationSupport -PskipITs./gradlew :iceberg:iceberg-rest-server:test --tests org.apache.gravitino.iceberg.service.TestCatalogWrapperForREST --tests org.apache.gravitino.iceberg.service.dispatcher.TestIcebergNamespaceHookDispatcher -PskipITs