-
Notifications
You must be signed in to change notification settings - Fork 593
HDDS-10633. Support Content-MD5 header for checking object integrity when uploading object #9612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Note:
|
|
Thanks @jojochuang for pointing this out. I agree that the final ETag may not always be an MD5 (especially for multipart or encrypted objects). However, this patch focuses on in-transit integrity checking via the During the upload, S3G calculates the MD5 of the incoming data stream to verify it against the client-provided |
# Conflicts: # hadoop-ozone/integration-test-s3/src/test/java/org/apache/hadoop/ozone/s3/awssdk/v2/AbstractS3SDKV2Tests.java # hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java
ivandika3
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hevinhsu for the patch and the extensive tests. Overall LGTM. Just left a few comments.
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java
Outdated
Show resolved
Hide resolved
...gration-test-s3/src/test/java/org/apache/hadoop/ozone/s3/awssdk/v2/AbstractS3SDKV2Tests.java
Show resolved
Hide resolved
Yes, ETag can technically be any hash that based on the object data so it doesn't need to be MD5. Moreover, object uploaded using multipart upload ETag is a hash of all the ETag of all the parts with the "-<num_parts>" prefix, see If we decide in the future to change ETag to not be md5 hash, then we need to setup a new MD5 |
|
Thanks @ivandika3 for the reviews. I’ve fixed the merge-related issue you pointed out and addressed the nits, including renaming I agree with the rename. As you pointed out, ETag is not guaranteed to be an MD5 hash (especially for multipart uploads), so using |
| String md5Hash = keyDetails.getMetadata().get(OzoneConsts.ETAG); | ||
| if (md5Hash != null) { | ||
| responseBuilder.header(HttpHeaders.ETAG, wrapInQuotes(md5Hash)); | ||
| String partsCount = extractPartsCount(md5Hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This should still be ETag since we are actually getting the key ETag which may not be MD5. We should only use md5Hash if we are actually calculating MD5.
What changes were proposed in this pull request?
Add support for the
Content-MD5header to verify object integrity during object uploads.Please describe your PR in detail:
Content-MD5header using pre-commit hooks before the key is committed.Content-MD5Content-MD5KeyDataStreamOutputto simplify unit test setup.What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10633
How was this patch tested?
https://github.com/hevinhsu/ozone/actions/runs/20842334193