hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13449) S3Guard: Implement DynamoDBMetadataStore.
Date Wed, 07 Dec 2016 00:45:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15727226#comment-15727226 ]

Aaron Fabbri commented on HADOOP-13449:

This looks pretty good [~liuml07], thank you.  A couple of comments:

1. For put(), in your loop which walks the path to the root, should we break as soon
as we find a non-null ancestor?  I see you have a comment suggesting that.

This should be correct, by induction, since the invariant for the DDB MetadataStore (MS) is:

- For any path P in MS, all ancestors p_i of P are in MS.

Thus, when we encounter an ancestor p_i on the way to the root, we know that all its ancestors
must already exist.
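To make the early-break idea concrete, here is a minimal sketch of that ancestor walk. This is illustrative only, not the actual DynamoDBMetadataStore API: the {{store}} map stands in for the DDB table, and paths are plain strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of put() walking ancestors toward the root.
// Names (store, parent, put) are illustrative, not the real API.
class AncestorWalkSketch {
    static final Map<String, Boolean> store = new HashMap<>();

    static String parent(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    static void put(String path) {
        store.put(path, true);
        // Walk toward the root, creating missing ancestors.  By the
        // invariant, the first ancestor already present implies that all
        // of *its* ancestors are present too, so we can stop there.
        for (String p = parent(path); !store.containsKey(p); p = parent(p)) {
            store.put(p, true);
            if (p.equals("/")) {
                break;  // reached the root
            }
        }
    }
}
```

The break makes each put() O(depth of missing suffix) rather than always walking to the root.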

2. The isEmpty bit on directory FileStatus objects.

  public PathMetadata get(Path path) throws IOException {
  ... (snip) ...
        // for directory, we query its direct children to determine isEmpty bit

I think this misses the case where we have not seen all the children yet.  For example, we
have an existing bucket with some directory {{/a/b/}} and two existing files {{/a/b/file1}},
{{/a/b/file2}}.  We create a new MetadataStore and put(/a/b/), then get(/a/b).
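The scenario above can be sketched as follows. This is a toy model, not the real store: {{msPaths}} stands in for what has been put() into the MetadataStore, {{s3Paths}} for what actually exists in the bucket, and the method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of why querying direct children of the
// MetadataStore cannot prove a directory is empty: the MS only knows
// about entries that were put() into it.
class IsEmptySketch {
    static final Set<String> msPaths = new HashSet<>();   // MetadataStore contents
    static final Set<String> s3Paths = new HashSet<>(     // actual bucket contents
            Arrays.asList("/a/b/file1", "/a/b/file2"));

    // isEmpty as inferred from the MS alone: no known direct children.
    static boolean msSaysEmpty(String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        return msPaths.stream().noneMatch(p -> p.startsWith(prefix));
    }

    static boolean s3SaysEmpty(String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        return s3Paths.stream().noneMatch(p -> p.startsWith(prefix));
    }
}
```

After put("/a/b") alone, the MS sees no children of /a/b and would report it empty, while the bucket still holds file1 and file2: the MS answer is only authoritative once it is known to have seen a full listing of the directory.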

That said, we should consider merging this patch and creating follow-up JIRAs since it is
good to split up the work and keep the code integrated.

Also, we should consider reworking the isEmptyDirectory logic around S3AFileStatus as we discussed
before.  Meanwhile, I'd be happy to implement an algorithm similar to the one I used for LocalMetadataStore
if that would be helpful.

Testing I did on this patch:
mvn clean verify -Dit.test="ITestS3A*"
Failed tests:
  ITestS3ACredentialsInURL.testInvalidCredentialsFail:130->Assert.fail:88 Expected an AccessDeniedException,
got S3AFileStatus{path=s3a://fabbri-dev/; isDirectory=true; modification_time=0; access_time=0;
owner=fabbri; group=fabbri; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=false
after rename(srcFilePath, destFilePath): directories_created expected:<1> but was:<0>

Tests in error:
  ITestS3AAWSCredentialsProvider.testAnonymousProvider:133 » AWSServiceIO initia...
  ITestS3AAWSCredentialsProvider.testBadCredentials:102->createFailingFS:76 » AWSServiceIO
  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » AWSClientIO initializing ...
» AWSServiceIO

Tests run: 321, Failures: 2, Errors: 4, Skipped: 42

To summarize:
- isEmptyDirectory logic for DynamoDBMetadataStore (needs JIRA, unless I'm missing something)
- Anonymous credentials issue (needs JIRA)
- Issue with invalid creds in URI.  Creds in URI in general may not be honored by DDB MetadataStore?
(needs JIRA & investigation)
- Stats not showing directory creation from rename():   ITestS3AFileOperationCost.testFakeDirectoryDeletion
(needs investigation)

> S3Guard: Implement DynamoDBMetadataStore.
> -----------------------------------------
>                 Key: HADOOP-13449
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13449
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13449-HADOOP-13345.000.patch, HADOOP-13449-HADOOP-13345.001.patch,
HADOOP-13449-HADOOP-13345.002.patch, HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch,
HADOOP-13449-HADOOP-13345.005.patch, HADOOP-13449-HADOOP-13345.006.patch, HADOOP-13449-HADOOP-13345.007.patch,
HADOOP-13449-HADOOP-13345.008.patch, HADOOP-13449-HADOOP-13345.009.patch, HADOOP-13449-HADOOP-13345.010.patch,
> Provide an implementation of the metadata store backed by DynamoDB.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
