hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15267) S3A fails to store my data when multipart size is set ot 5 Mb and SSE-C encryption is enabled
Date Wed, 28 Feb 2018 14:38:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380386#comment-16380386 ]

Steve Loughran commented on HADOOP-15267:

Thanks for finding this. I'll see if I can replicate it.

Regarding your patch:

# It'd be good to follow our naming convention of HADOOP-1234-001.patch, with the number of each
new patch going up by 1; Yetus prefers this.
# I'd prefer to see this in {{WriteOperationsHelper.newUploadPartRequest()}}, as that builds
up the request. You can make {{generateSSECustomerKey()}} package-private to do this.
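
To illustrate the second point, here is a minimal, self-contained sketch of the idea: the code that builds each part request also attaches the SSE-C key, so no caller can forget it. This is only a model under stated assumptions. {{CustomerKey}}, {{PartRequest}} and {{RequestFactory}} are hypothetical stand-ins, not the AWS SDK or hadoop-aws classes, and the method bodies are not the real implementations; only the method names echo the ones mentioned above.

```java
import java.util.Optional;

/** Stand-alone model; none of these types are AWS SDK or hadoop-aws classes. */
public class SsecPartRequestSketch {

  /** Hypothetical stand-in for an SSE-C customer-provided key. */
  static final class CustomerKey {
    final String base64Key;

    CustomerKey(String base64Key) {
      this.base64Key = base64Key;
    }
  }

  /** Hypothetical stand-in for a single "upload part" request. */
  static final class PartRequest {
    final String uploadId;
    final int partNumber;
    // null means no SSE-C information would be sent with this part.
    CustomerKey sseCustomerKey;

    PartRequest(String uploadId, int partNumber) {
      this.uploadId = uploadId;
      this.partNumber = partNumber;
    }
  }

  /** Hypothetical factory playing the role of the helper that builds requests. */
  static final class RequestFactory {
    private final Optional<CustomerKey> key;

    RequestFactory(String configuredKey) {
      // An unset or empty key models "SSE-C not enabled".
      this.key = (configuredKey == null || configuredKey.isEmpty())
          ? Optional.empty()
          : Optional.of(new CustomerKey(configuredKey));
    }

    /** Package-private so that code in the same package can reuse it. */
    Optional<CustomerKey> generateSSECustomerKey() {
      return key;
    }

    /** Every part request built here carries the key whenever one is configured. */
    PartRequest newUploadPartRequest(String uploadId, int partNumber) {
      PartRequest request = new PartRequest(uploadId, partNumber);
      generateSSECustomerKey().ifPresent(k -> request.sseCustomerKey = k);
      return request;
    }
  }

  public static void main(String[] args) {
    RequestFactory encrypted = new RequestFactory("example-base64-key");
    PartRequest part = encrypted.newUploadPartRequest("upload-1", 2);
    System.out.println("SSE-C key attached: " + (part.sseCustomerKey != null));

    RequestFactory plain = new RequestFactory(null);
    PartRequest plainPart = plain.newUploadPartRequest("upload-2", 1);
    System.out.println("SSE-C key attached: " + (plainPart.sseCustomerKey != null));
  }
}
```

Putting the key handling in the one place that constructs part requests means every multipart upload path picks it up, instead of each caller having to remember to set it.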

We need tests to prevent regressions.

I propose
* a subclass of ITestS3AHugeFilesDiskBlocks, preferably {{org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesArrayBlocks.ITestS3AHugeFilesSSECDiskBlocks}},
* whose configuration setup sets SSE-C and the key, as done in {{ITestS3AEncryptionSSEC}},
* and which, in {{setup()}}, after calling {{super.setup()}}, calls {{S3ATestUtils.skipIfEncryptionTestsDisabled(getConfiguration());}}

Then, if you run the hadoop-aws test suite with the scale tests turned on, this test should
run. If you add the test before adding the fix, its failure will show that the test works;
once the fix goes in, we can see that the fix takes.

Thanks for starting this. If we can turn this around quickly, it can go into 3.1.

> S3A fails to store my data when multipart size is set ot 5 Mb and SSE-C encryption is
> ---------------------------------------------------------------------------------------------
>                 Key: HADOOP-15267
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15267
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.1.0
>         Environment: Hadoop 3.1 Snapshot
>            Reporter: Anis Elleuch
>            Priority: Critical
>         Attachments: hadoop-fix.patch
> When I enable SSE-C encryption in Hadoop 3.1 and set fs.s3a.multipart.size to 5 MB,
storing data in AWS doesn't work anymore. For example, running the following code:
> {code}
> >>> df1 = spark.read.json('/home/user/people.json')
> >>> df1.write.mode("overwrite").json("s3a://testbucket/people.json")
> {code}
> shows the following exception:
> {code:java}
> com.amazonaws.services.s3.model.AmazonS3Exception: The multipart upload initiate requested
encryption. Subsequent part requests must include the appropriate encryption parameters.
> {code}
> After some investigation, I discovered that hadoop-aws doesn't send the SSE-C headers in
Put Object Part requests, as required by the AWS specification: [https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html]
> {code:java}
> If you requested server-side encryption using a customer-provided encryption key in your
initiate multipart upload request, you must provide identical encryption information in each
part upload using the following headers.
> {code}
> A patch that illustrates the problem is attached to this issue.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org