hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15267) S3A multipart upload fails when SSE-C encryption is enabled
Date Mon, 05 Mar 2018 11:57:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385986#comment-16385986 ]

Steve Loughran commented on HADOOP-15267:
-----------------------------------------

Checkstyle is complaining about a line width of 82 characters; need to look at that line to
see whether the code reads better as it is, in which case we can ignore the warning.

The test failures are legitimate NPEs in the new code:

{code}
[ERROR] testTaskMultiFileUploadFailure[0](org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter)
 Time elapsed: 0.14 s  <<< ERROR!
java.lang.NullPointerException
	at org.apache.hadoop.fs.s3a.S3AFileSystem.setOptionalUploadPartRequestParameters(S3AFileSystem.java:2610)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1567)
	at org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
	at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:314)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:231)
	at org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:123)
	at org.apache.hadoop.fs.s3a.WriteOperationHelper.uploadPart(WriteOperationHelper.java:471)
	at org.apache.hadoop.fs.s3a.commit.CommitOperations.uploadFileToPendingCommit(CommitOperations.java:477)
	at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.lambda$commitTaskInternal$4(StagingCommitter.java:698)
	at org.apache.hadoop.fs.s3a.commit.Tasks$Builder.runSingleThreaded(Tasks.java:165)
	at org.apache.hadoop.fs.s3a.commit.Tasks$Builder.run(Tasks.java:150)
	at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.commitTaskInternal(StagingCommitter.java:690)
	at org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter.commitTask(StagingCommitter.java:635)
	at org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter.lambda$testTaskMultiFileUploadFailure$3(TestStagingCommitter.java:427)
	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:491)
	at org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter.testTaskMultiFileUploadFailure(TestStagingCommitter.java:423)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}

[~vadmeste]: I need to draw your attention to the hadoop-aws [patch submission policy|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md].
Nobody's patch gets reviewed until the submitter declares which S3 endpoint they've run all
the hadoop-aws integration tests against; Jenkins only runs the unit tests.

Here it's one of the mock tests that fails, so while the patch may work in production, the
mock filesystem needs some tweaks to handle the new setup: {{MockS3AFileSystem}} is going to
need some attention.

This is what I suggest:
# Ignore my recommendation to move the change into {{WriteOperationHelper.newUploadPartRequest()}},
as that code runs outside the FS...you'd need to add more entry points into {{S3AFileSystem}}
and wire them up.
# Make {{setOptionalUploadPartRequestParameters}} protected, add javadocs, etc.
# In {{MockS3AFileSystem}}, override it as a no-op (see the sketch after this list).
# ...after that the failing tests should pass...
# Then it's time to worry about the integration tests.
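
Roughly what I have in mind for steps 2 and 3 above; this is a minimal sketch, not the attached
patch: the method signature is inferred from the stack trace above, and {{generateSSECustomerKey()}}
is an assumed helper name.

{code:java}
// Sketch only. UploadPartRequest is the AWS SDK class
// com.amazonaws.services.s3.model.UploadPartRequest.

// In S3AFileSystem: widen visibility from private to protected so that
// test subclasses can override it.
/**
 * Set the optional SSE-C parameters on a part upload request, so that every
 * part carries the same encryption headers as the multipart initiate request.
 * @param request part upload request to modify.
 */
protected void setOptionalUploadPartRequestParameters(
    UploadPartRequest request) {
  // attach the SSE-C customer key; generateSSECustomerKey() is assumed here,
  // use whatever helper the patch actually adds.
  generateSSECustomerKey().ifPresent(request::setSSECustomerKey);
}

// In MockS3AFileSystem: no-op override, as the mock never talks to a real
// S3 endpoint and has no encryption settings to apply.
@Override
protected void setOptionalUploadPartRequestParameters(
    UploadPartRequest request) {
  // deliberately empty
}
{code}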

This is an important patch; it's ready to go in apart from those tests. But yes, we need the
test fixup & something new to verify the problem is not only fixed, but never going to come
back.
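
For the "never coming back" part, something along these lines would do it. This is a
hypothetical sketch only: the class name and the way the SSE-C key is supplied are my
assumptions, and it would sit alongside the existing hadoop-aws integration tests.

{code:java}
package org.apache.hadoop.fs.s3a;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.contract.ContractTestUtils;
import org.junit.Test;

/**
 * Hypothetical regression test: SSE-C enabled, part size forced down to the
 * 5 MB minimum, and more than one part's worth of data written, so a part
 * request missing the SSE-C headers fails the whole upload.
 */
public class ITestS3AMultipartSSEC extends AbstractS3ATestBase {

  @Override
  protected Configuration createConfiguration() {
    Configuration conf = super.createConfiguration();
    conf.set("fs.s3a.server-side-encryption-algorithm", "SSE-C");
    // a base64-encoded 256-bit key; in practice this comes from the
    // tester's own configuration rather than being hard coded
    conf.set("fs.s3a.server-side-encryption.key", "<base64 key>");
    conf.set("fs.s3a.multipart.size", "5M");
    conf.set("fs.s3a.multipart.threshold", "5M");
    return conf;
  }

  @Test
  public void testMultipartUploadWithSSEC() throws Throwable {
    Path path = path("testMultipartUploadWithSSEC");
    // 6 MB of data: bigger than one part, so the multipart path is taken
    byte[] data = ContractTestUtils.dataset(6 * 1024 * 1024, 'a', 'z' - 'a');
    ContractTestUtils.createFile(getFileSystem(), path, true, data);
    ContractTestUtils.verifyFileContents(getFileSystem(), path, data);
  }
}
{code}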




> S3A multipart upload fails when SSE-C encryption is enabled
> -----------------------------------------------------------
>
>                 Key: HADOOP-15267
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15267
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>         Environment: Hadoop 3.1 Snapshot
>            Reporter: Anis Elleuch
>            Assignee: Anis Elleuch
>            Priority: Critical
>         Attachments: hadoop-fix.patch
>
>
> When I enable SSE-C encryption in Hadoop 3.1 and set fs.s3a.multipart.size to 5 MB,
> storing data in AWS doesn't work anymore. For example, running the following code:
> {code}
> >>> df1 = spark.read.json('/home/user/people.json')
> >>> df1.write.mode("overwrite").json("s3a://testbucket/people.json")
> {code}
> shows the following exception:
> {code:java}
> com.amazonaws.services.s3.model.AmazonS3Exception: The multipart upload initiate requested
> encryption. Subsequent part requests must include the appropriate encryption parameters.
> {code}
> After some investigation, I discovered that hadoop-aws doesn't send the SSE-C headers in the
> Put Object Part request, as required by the AWS documentation: [https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html]
> {code:java}
> If you requested server-side encryption using a customer-provided encryption key in your
> initiate multipart upload request, you must provide identical encryption information in each
> part upload using the following headers.
> {code}
>  
> You can find a patch attached to this issue that clarifies the problem and its fix.



