hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-14457) create() does not notify metadataStore of parent directories or ensure they're not existing files
Date Fri, 07 Jul 2017 02:15:01 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sean Mackrory updated HADOOP-14457:
-----------------------------------
    Attachment: HADOOP-14457-HADOOP-13345.010.patch

So attaching a patch that follows the design discussed by [~fabbri] and I in our last round
of comments. This still requires that clients pass in the full list of directories that need
to be created, and still has DynamoDB enforce that itself by ensuring that that set also includes
the parents of everything in the set. Note that it does this in-memory, without regard for
what already may exist in the database. It then writes it all in one batch. This is safe for
now, since directories either exist or they don't and there isn't other metadata we store
with them, but if we do start storing metadata (like authoritative bits) on each directory,
I see no other option but to go back to doing it 1 level at a time, probably with a round-trip
each (unless we can somehow request the entire lineage at once, but I don't see how we could
do that either). That would still line up with [~stevel@apache.org]'s point about being a
single round-trip when the parent does exist, and only becoming excessively slow when it doesn't.

In addition to my usual S3N failures, I started seeing the following failures intermittently
today - I would see these perhaps 1-in-4 test runs. So far only with DynamoDB (with and without
authoritative mode), but with it happening rarely I can't be sure it's DynamoDB-specific.
I reverted my change but continue to see the failures:

{code}
testConsistentRenameAfterDelete(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  Time
elapsed: 11.925 sec  <<< FAILURE!
java.lang.AssertionError: Recently renamed dir should not be visible
        at org.junit.Assert.fail(Assert.java:88)
        at org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testConsistentRenameAfterDelete(ITestS3GuardListConsistency.java:237)

testConsistentListAfterDelete(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  Time
elapsed: 7.383 sec  <<< FAILURE!
java.lang.AssertionError: null
        at org.junit.Assert.fail(Assert.java:86)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at org.junit.Assert.assertFalse(Assert.java:64)
        at org.junit.Assert.assertFalse(Assert.java:74)
        at org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testConsistentListAfterDelete(ITestS3GuardListConsistency.java:191)
{code}

> create() does not notify metadataStore of parent directories or ensure they're not existing
files
> -------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14457
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14457
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14457-HADOOP-13345.001.patch, HADOOP-14457-HADOOP-13345.002.patch,
HADOOP-14457-HADOOP-13345.003.patch, HADOOP-14457-HADOOP-13345.004.patch, HADOOP-14457-HADOOP-13345.005.patch,
HADOOP-14457-HADOOP-13345.006.patch, HADOOP-14457-HADOOP-13345.007.patch, HADOOP-14457-HADOOP-13345.008.patch,
HADOOP-14457-HADOOP-13345.009.patch, HADOOP-14457-HADOOP-13345.010.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. LocalMetadataStore
will sometimes erroneously report that a directory is empty with isAuthoritative = true when
it *definitely* has children the metadatastore should know about. It doesn't appear to happen
if the children are just directory. The fact that it's returning an empty listing is concerning,
but the fact that it says it's authoritative *might* be a second bug.
> {code}
> diff --git a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>    }
>  
>    @VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>      return metadataStore;
>    }
>  
> diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws Throwable {
>      boolean rename = fs.rename(srcDir, destDir);
>      assertFalse("s3a doesn't support rename to non-empty directory", rename);
>    }
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +    final FileSystem fs = getFileSystem();
> +    final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +    final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +    try {
> +      fs.mkdirs(parent);
> +      final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +      byte[] srcDataset = dataset(256, 'a', 'z');
> +      writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +          1024, false);
> +
> +      DirListingMetadata list = ms.listChildren(parent);
> +      assertTrue("MetadataStore falsely reports authoritative empty list",
> +          list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +    } finally {
> +      fs.delete(parent, true);
> +    }
> +  }
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


Mime
View raw message