hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14041) CLI command to prune old metadata
Date Thu, 16 Feb 2017 21:06:41 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870696#comment-15870696 ]

Aaron Fabbri commented on HADOOP-14041:

Just recording results from my test runs last night:
mvn clean verify -Ds3guard -Ddynamo -Dscale
Tests run: 366, Failures: 3, Errors: 2, Skipped: 70

Failed tests:
(1)  ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRecursiveRootListing:222->Assert.assertTrue:41->Assert.fail:88
files mismatch:   "s3a://fabbri-dev/user/fabbri/test/file"  "s3a://fabbri-dev/user/fabbri/test/parentdir/child"
(2)  ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive:95->Assert.fail:88
After 1 attempts: listing after rm /* not empty
final [00] S3AFileStatus{path=s3a://fabbri-dev/Users; isDirectory=true; modification_time=0;
access_time=0; owner=fabbri; group=fabbri; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=false
(3)  ITestS3AContractRootDir.testListEmptyRootDirectory:63->AbstractContractRootDirectoryTest.testListEmptyRootDirectory:186->Assert.fail:88
Deleted file: unexpectedly found s3a://fabbri-dev/user as  S3AFileStatus{path=s3a://fabbri-dev/user;
isDirectory=true; modification_time=0; access_time=0; owner=fabbri; group=fabbri; permission=rwxrwxrwx;
isSymlink=false} isEmptyDirectory=false

Tests in error:
(4)  ITestS3ACredentialsInURL.testInstantiateFromURL:86 » InterruptedIO initTable: ...
(5)  ITestS3GuardToolDynamoDB.testDestroyDynamoDBMetadataStore:145 » IO S3Guard tab...

1-3 are root directory test failures, which have been flaky. One is caused by leftover files
from FileSystemContractBaseTest; the other two look like something is creating a user/
directory while the test is running?

4 is expected: S3Guard will not use URI credentials.  (We should skip this test, if the
pending patch doesn't already do that.)
5 is this: S3Guard table lacks version marker. Table: destroyDynamoDBMetadataStore-1546206104
        at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.verifyVersionCompatibility(DynamoDBMetadataStore.java:667)
        at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initTable(DynamoDBMetadataStore.java:630)
        at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:288)

I don't think any of these are related to this patch, except maybe the last one?

As for testing the prune command itself, the first thing I notice is that it behaves a bit
differently from, say, diff.  Diff appears to fall back to the bucket name as the table name
if none is set, but prune fails unless the table name is configured explicitly:

$ hadoop s3a prune -H 1 s3a://fabbri-bucket
No DynamoDB table name configured!
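
For illustration, the fallback that diff appears to apply could look roughly like the
sketch below. This is only a guess at the behavior, not the actual S3Guard code: the class
and method names (TableNameFallback, resolveTableName) are hypothetical, and how the
configured name is read is left out.

```java
import java.net.URI;

/** Hypothetical helper, not part of the S3Guard code base. */
final class TableNameFallback {
    private TableNameFallback() {}

    /**
     * Returns the configured table name if one is set; otherwise falls back
     * to the bucket name, taken here from the host part of the filesystem URI.
     */
    static String resolveTableName(String configuredTable, URI fsUri) {
        if (configuredTable != null && !configuredTable.trim().isEmpty()) {
            return configuredTable.trim();
        }
        String bucket = fsUri.getHost();
        if (bucket == null || bucket.isEmpty()) {
            throw new IllegalArgumentException(
                "No table name configured and no bucket in URI: " + fsUri);
        }
        return bucket;
    }

    public static void main(String[] args) {
        // No table configured, so the bucket name is used as the table name.
        System.out.println(resolveTableName(null, URI.create("s3a://fabbri-bucket")));
    }
}
```

If prune shared a helper like this with diff, the command above would pick up the bucket
name instead of failing.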

> CLI command to prune old metadata
> ---------------------------------
>                 Key: HADOOP-14041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14041
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14041-HADOOP-13345.001.patch, HADOOP-14041-HADOOP-13345.002.patch,
HADOOP-14041-HADOOP-13345.003.patch, HADOOP-14041-HADOOP-13345.004.patch, HADOOP-14041-HADOOP-13345.005.patch
> Add a CLI command that allows users to specify an age at which to prune metadata that
hasn't been modified for an extended period of time. Since the primary use-case targeted at
the moment is list consistency, it would make sense (especially when authoritative=false)
to prune metadata that is expected to have become consistent a long time ago.
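
The age-based pruning described above can be sketched as follows. This is a minimal
illustration only: the in-memory map stands in for the metadata store, and PruneSketch and
pruneOlderThan are hypothetical names, not the patch's implementation.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of age-based pruning over an in-memory map. */
final class PruneSketch {
    private PruneSketch() {}

    /**
     * Removes every entry whose modification time is older than maxAgeMillis
     * relative to nowMillis, and returns the number of entries removed.
     */
    static int pruneOlderThan(Map<String, Long> modTimes, long nowMillis, long maxAgeMillis) {
        long cutoff = nowMillis - maxAgeMillis;
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = modTimes.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < cutoff) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Map<String, Long> modTimes = new HashMap<>();
        modTimes.put("/old/file", now - TimeUnit.HOURS.toMillis(3));
        modTimes.put("/new/file", now - TimeUnit.MINUTES.toMillis(5));
        // Prune everything not modified within the last hour.
        int removed = pruneOlderThan(modTimes, now, TimeUnit.HOURS.toMillis(1));
        System.out.println("pruned " + removed + ", remaining " + modTimes.keySet());
    }
}
```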
