hadoop-common-issues mailing list archives

From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14107) ITestS3GuardListConsistency fails intermittently
Date Tue, 04 Apr 2017 23:07:41 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956028#comment-15956028 ]

Mingliang Liu commented on HADOOP-14107:

[~stevel@apache.org], thanks for your input.

I reran the test but could not reproduce the failure. There are two ways this test can fail:
# In {{noWriteBack.listStatus(directory);}}, the directory that exists only on S3 may not be listed because
of eventual consistency in S3 listings. S3Guard is not guaranteed to discover new S3 objects
in a timely manner. This is possible, but I did not reproduce it in testing.
# If the test data was not cleaned up after the previous test run, this test will fail,
as shown by the stack trace in the description section.

#1 is a corner case, and we can simply sleep for 3~5 seconds. #2 can be addressed by
explicitly deleting the existing test directory. We may still need to sleep for a while until
delete tracking is supported. Meanwhile, per your comment, I changed the coding style, comments,
variable names, etc. in the v0 patch to make the purpose of the test clearer.
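
As an illustration of the wait for case #1, here is a minimal sketch of a bounded poll, which is a swapped-in alternative to the fixed 3~5 second sleep and not what the v0 patch does. The class name {{EventualConsistencyWait}}, the method {{awaitCondition}}, and the timeout values are hypothetical; in a real test the condition would be a check that the S3-only directory appears in the listing.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition until it holds or a timeout expires. */
public class EventualConsistencyWait {

    /**
     * Returns true as soon as the condition holds, or false if it still does
     * not hold once timeoutMillis has elapsed (or the thread is interrupted).
     */
    public static boolean awaitCondition(BooleanSupplier condition,
                                         long timeoutMillis,
                                         long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the S3-only directory has become visible in the listing".
        final long visibleAt = System.currentTimeMillis() + 200;
        boolean seen = awaitCondition(
            () -> System.currentTimeMillis() >= visibleAt, 5000, 50);
        System.out.println("visible within timeout: " + seen);
    }
}
```

Compared with a fixed sleep, a poll like this returns as soon as the object is visible and still fails within a known bound if it never appears.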

The main question is whether the test is doing the right thing. I think it is. It creates one directory
on S3 only (via noS3Guard) and another directory on both S3 and DDB, aka the metadata store (via noWriteBack).
Then, if we list the file system with S3Guard enabled, whether or not S3Guard writes back,
the listing should return both directories. The only exception is point #1 above: the data
that is only in S3 may not be visible yet. That said, both noWriteBack and yesWriteBack will discover
the S3-only object and union it with the DDB result. However, noWriteBack should not populate the
newly found S3-only object into DDB, while yesWriteBack should always populate it
into DDB. We assert that by querying DDB directly.
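
The expected behaviour can be sketched with a toy model that uses plain Java sets in place of S3 and DDB. This is not the S3A or S3Guard code; the class name {{ListingUnionSketch}} and the method signature are hypothetical. The assignment of {{/123}} as the S3-only directory and {{/XYZ}} as the one known to the metadata store is an assumption inferred from the assertion message in the description.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the listing semantics under test; not the S3A implementation. */
public class ListingUnionSketch {

    /**
     * Returns the union of the S3 listing and the metadata store listing.
     * Only when writeBack is true are the S3 entries added to the metadata store.
     */
    public static Set<String> listStatus(Set<String> s3Children,
                                         Set<String> metastoreChildren,
                                         boolean writeBack) {
        Set<String> result = new TreeSet<>(metastoreChildren);
        result.addAll(s3Children);
        if (writeBack) {
            metastoreChildren.addAll(s3Children);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed layout: /123 was created on S3 only, /XYZ on S3 and in DDB.
        Set<String> s3 = new TreeSet<>(Arrays.asList("/123", "/XYZ"));
        Set<String> ddb = new TreeSet<>(Arrays.asList("/XYZ"));

        // noWriteBack: the listing has both entries, DDB is left unchanged.
        System.out.println("noWriteBack listing:  " + listStatus(s3, ddb, false));
        System.out.println("DDB after noWriteBack:  " + ddb);

        // yesWriteBack: the listing has both entries, DDB now learns about /123.
        System.out.println("yesWriteBack listing: " + listStatus(s3, ddb, true));
        System.out.println("DDB after yesWriteBack: " + ddb);
    }
}
```

In this model the failure in the description corresponds to DDB already containing {{/123}} before the noWriteBack listing runs, for example because of leftover data from an earlier run.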

> ITestS3GuardListConsistency fails intermittently
> ------------------------------------------------
>                 Key: HADOOP-14107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14107
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-14107-HADOOP-13345.000.patch
> {code}
> mvn -Dit.test='ITestS3GuardListConsistency' -Dtest=none -Dscale -Ds3guard -Ddynamo -q clean verify
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.544 sec <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
> testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  Time elapsed: 3.147 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of results from metastore. Metastore should only know about /XYZ: DirListingMetadata{path=s3a://mliu-s3guard/test/ListStatusWriteBack, isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}, s3a://mliu-s3guard/test/ListStatusWriteBack/123=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/123; isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}}, isAuthoritative=false}
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.assertTrue(Assert.java:41)
> 	at org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testListStatusWriteBack(ITestS3GuardListConsistency.java:127)
> {code}
> See discussion on the parent JIRA [HADOOP-13345].
