hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints
Date Fri, 03 Feb 2017 17:31:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851781#comment-15851781 ]

Steve Loughran commented on HADOOP-13786:
-----------------------------------------

About to add a new patch, which does more in terms of testing, though there's some ambiguity
about the semantics of commit and abort that I need to clarify with the MR Team.

h2. This code works without s3guard, but fails (differently) with s3guard local and s3guard dynamo

s3guard DynamoDB
{code}

Running org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter
Tests run: 9, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 94.915 sec <<< FAILURE!
- in org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter
testAbort(org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter)  Time elapsed: 9.292 sec
 <<< FAILURE!
java.lang.AssertionError: Output directory not empty ls s3a://hwdev-steve-ireland-new/test/testAbort
[00] S3AFileStatus{path=s3a://hwdev-steve-ireland-new/test/testAbort/part-m-00000; isDirectory=false;
length=40; replication=1; blocksize=33554432; modification_time=1486142401246; access_time=0;
owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false} isEmptyDirectory=false
: array lengths differed, expected.length=0 actual.length=1
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.internal.ComparisonCriteria.assertArraysAreSameLength(ComparisonCriteria.java:71)
        at org.junit.internal.ComparisonCriteria.arrayEquals(ComparisonCriteria.java:32)
        at org.junit.Assert.internalArrayEquals(Assert.java:473)
        at org.junit.Assert.assertArrayEquals(Assert.java:265)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.testAbort(ITestS3AOutputCommitter.java:561)

testFailAbort(org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter)  Time elapsed: 9.453
sec  <<< FAILURE!
java.lang.AssertionError: expected output file: unexpectedly found s3a://hwdev-steve-ireland-new/test/testFailAbort/part-m-00000
as  S3AFileStatus{path=s3a://hwdev-steve-ireland-new/test/testFailAbort/part-m-00000; isDirectory=false;
length=40; replication=1; blocksize=33554432; modification_time=1486142441390; access_time=0;
owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false} isEmptyDirectory=false
        at org.junit.Assert.fail(Assert.java:88)
        at org.apache.hadoop.fs.contract.ContractTestUtils.assertPathDoesNotExist(ContractTestUtils.java:796)
        at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.assertPathDoesNotExist(AbstractFSContractTestBase.java:305)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.testFailAbort(ITestS3AOutputCommitter.java:587)


Results :

Failed tests: 
  ITestS3AOutputCommitter.testAbort:561->Assert.assertArrayEquals:265->Assert.internalArrayEquals:473->Assert.fail:88
Output directory not empty ls s3a://hwdev-steve-ireland-new/test/testAbort [00] S3AFileStatus{path=s3a://hwdev-steve-ireland-new/test/testAbort/part-m-00000;
isDirectory=false; length=40; replication=1; blocksize=33554432; modification_time=1486142401246;
access_time=0; owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false} isEmptyDirectory=false
: array lengths differed, expected.length=0 actual.length=1
  ITestS3AOutputCommitter.testFailAbort:587->AbstractFSContractTestBase.assertPathDoesNotExist:305->Assert.fail:88
expected output file: unexpectedly found s3a://hwdev-steve-ireland-new/test/testFailAbort/part-m-00000
as  S3AFileStatus{path=s3a://hwdev-steve-ireland-new/test/testFailAbort/part-m-00000; isDirectory=false;
length=40; replication=1; blocksize=33554432; modification_time=1486142441390; access_time=0;
owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false} isEmptyDirectory=false

Tests run: 9, Failures: 2, Errors: 0, Skipped: 0


{code}


s3guard local DB

{code}
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter
Tests run: 9, Failures: 2, Errors: 2, Skipped: 0, Time elapsed: 53.226 sec <<< FAILURE!
- in org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter
testMapFileOutputCommitter(org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter)  Time
elapsed: 9.394 sec  <<< FAILURE!
java.lang.AssertionError: Number of MapFile.Reader entries in s3a://hwdev-steve-ireland-new/test/testMapFileOutputCommitter
: ls s3a://hwdev-steve-ireland-new/test/testMapFileOutputCommitter [00] S3AFileStatus{path=s3a://hwdev-steve-ireland-new/test/testMapFileOutputCommitter/_SUCCESS;
isDirectory=false; length=0; replication=1; blocksize=33554432; modification_time=1486142538000;
access_time=0; owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false} isEmptyDirectory=false
 expected:<1> but was:<0>
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:743)
        at org.junit.Assert.assertEquals(Assert.java:118)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.testMapFileOutputCommitter(ITestS3AOutputCommitter.java:500)

testAbort(org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter)  Time elapsed: 4.307 sec
 <<< ERROR!
java.io.FileNotFoundException: No such file or directory: s3a://hwdev-steve-ireland-new/test/testAbort
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1765)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1480)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1456)
        at org.apache.hadoop.fs.contract.ContractTestUtils.listChildren(ContractTestUtils.java:427)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.testAbort(ITestS3AOutputCommitter.java:560)

testCommitterWithFailure(org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter)  Time elapsed:
5.375 sec  <<< FAILURE!
java.lang.AssertionError: Expected an exception
        at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:374)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.expectFNFEonJobCommit(ITestS3AOutputCommitter.java:396)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.testCommitterWithFailure(ITestS3AOutputCommitter.java:390)

testFailAbort(org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter)  Time elapsed: 5.016
sec  <<< ERROR!
java.io.FileNotFoundException: expected output dir: not found s3a://hwdev-steve-ireland-new/test/testFailAbort
in s3a://hwdev-steve-ireland-new/test
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1765)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:125)
        at org.apache.hadoop.fs.contract.ContractTestUtils.verifyPathExists(ContractTestUtils.java:773)
        at org.apache.hadoop.fs.contract.ContractTestUtils.assertPathExists(ContractTestUtils.java:757)
        at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.assertPathExists(AbstractFSContractTestBase.java:294)
        at org.apache.hadoop.fs.s3a.commit.ITestS3AOutputCommitter.testFailAbort(ITestS3AOutputCommitter.java:586)

{code}

I don't know what's up here, but I do know that we're doing too many deletes during the
commit process; more specifically, we're creating too many mock empty dirs. That can be optimised
with a "delete, don't care about the parent dir" method in writeOperationHelper.
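
To make the cost concrete, here is a toy model, not S3A code. It assumes that a normal delete removes the key, lists the parent, and writes an empty-directory marker if nothing is left there, while the proposed variant only removes the key. Every name in it ({{ToyStore}}, {{deleteKeepingParent}}, {{deleteIgnoringParent}}) is hypothetical, and the request counts describe the model only, not measurements against S3.

```java
import java.util.TreeSet;

/** Toy model only: none of these names exist in S3A. */
final class ParentAgnosticDeleteSketch {

    /** In-memory stand-in for a blob store; counts every simulated request. */
    static final class ToyStore {
        final TreeSet<String> keys = new TreeSet<>();
        int requests;

        void put(String key) { requests++; keys.add(key); }

        void remove(String key) { requests++; keys.remove(key); }

        /** Simulated LIST: true if any key sits strictly under the directory prefix. */
        boolean hasChildren(String dirPrefix) {
            requests++;
            String next = keys.higher(dirPrefix);
            return next != null && next.startsWith(dirPrefix);
        }
    }

    /** Delete, then recreate an empty-directory marker if the parent is now empty. */
    static void deleteKeepingParent(ToyStore store, String key) {
        store.remove(key);
        String parent = key.substring(0, key.lastIndexOf('/') + 1);
        if (!parent.isEmpty() && !store.hasChildren(parent)) {
            store.put(parent); // the "mock empty dir"
        }
    }

    /** Hypothetical "don't care about the parent dir" variant: one request, no marker. */
    static void deleteIgnoringParent(ToyStore store, String key) {
        store.remove(key);
    }

    /** Returns {requests issued by the deletes, keys left behind}. */
    static int[] run(boolean keepParent) {
        ToyStore store = new ToyStore();
        String[] pending = {"out/task1/part-0", "out/task2/part-0", "out/task3/part-0"};
        for (String key : pending) {
            store.put(key);
        }
        store.requests = 0; // count only the cleanup phase
        for (String key : pending) {
            if (keepParent) {
                deleteKeepingParent(store, key);
            } else {
                deleteIgnoringParent(store, key);
            }
        }
        return new int[] {store.requests, store.keys.size()};
    }

    public static void main(String[] args) {
        int[] keep = run(true);
        int[] ignore = run(false);
        System.out.println("keeping parent:  requests=" + keep[0] + " markers=" + keep[1]);
        System.out.println("ignoring parent: requests=" + ignore[0] + " markers=" + ignore[1]);
    }
}
```

In this model, cleaning up three single-file directories costs 9 simulated requests and leaves 3 markers behind when the parent is maintained, against 3 requests and no markers when it is ignored. Whether the real saving is of that order depends on how the commit process actually issues its deletes.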

In the first s3guard local test, {{testMapFileOutputCommitter}}, all is well in the FS used
by the job, but when a new FS instance is created in the same process, the second one isn't
seeing the listing.

I've not looked at the other failures in any detail.

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch, HADOOP-13786-HADOOP-13345-002.patch,
>                      HADOOP-13786-HADOOP-13345-003.patch, HADOOP-13786-HADOOP-13345-004.patch
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the presence of failures".
> Implement it, including whatever is needed to demonstrate the correctness of the algorithm
> (that is, assuming that s3guard provides a consistent view of the presence/absence of blobs,
> show that we can commit directly).
> I consider us free to expose the blobstore-ness of the s3 output streams (i.e.
> not visible until the close()), if we need to use that to allow us to abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

