hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mingliang Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling
Date Sat, 04 Mar 2017 02:19:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895426#comment-15895426

Mingliang Liu commented on HADOOP-13914:

Thanks [~fabbri], this is really good improvement. I like the design & patch. Glad we
stay with the "improved approach" from v0 wip patch (aka _Treat isEmptyDirectory as ephemeral_
in the initial doc).

Some comments:
# In {{DynamoDBMetadataStore#get()}}, we should check {{wantEmptyDirectoryFlag}} value. One
candidate place is L337 {{if (meta != null) {}}
# {code:title=DirListingMetadata.java}
115	  /**
116	   * Is the underlying directory known to be empty?
117	   * @return FALSE if directory is known to have a child entry, TRUE if
118	   * directory is known to be empty, UNKNOWN otherwise.
119	   */
120	  public Tristate isEmpty() {
121	    if (getListing().isEmpty()) {
122	      if (isAuthoritative()) {
123	        return Tristate.TRUE;
124	      } else {
125	        // This listing is empty, but may not be full list of underlying dir.
126	        return Tristate.UNKNOWN;
127	      }
128	    } else { // not empty listing
129	      // There exists at least one child, dir not empty.
130	      return Tristate.FALSE;
131	    }
132	  }
The {{isAuthoritative}} is false and we have children, we confidently trust the child's existence
though it's not authoritative (fully cached). Seems reasonable.
# {code:title=DynamoDBMetadataStore}
325	        meta = new PathMetadata(new FileStatus(0, true, 1, 0, 0, 0, null,
326	            username, null, path));
a util makeFileStatus()...? I have a gut feeling - we'll reuse this.
# {code:title= MetadataStoreTestBase#testGetEmptyDir()}
368	    // 3. Check that either (a) the MS doesn't track whether or not it is
369	    // empty (which is allowed), or (b) the MS knows the dir is empty.
370	    if (!allowMissing() || meta != null) {
371	      assertNotNull("Get found dir", meta);
372	      assertNotEquals("Dir is empty or unknown", Tristate.FALSE,
373	          meta.isEmptyDirectory());
374	    }
To simplify the if clause as {{if (!allowMissing()) {}}. I'd prefer error message be something
clearer: "Get should find meta for path" / "Dir should be empty or unknown" Anyway it's nit.
Other test methods as well.

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> --------------------------------------------------------
>                 Key: HADOOP-13914
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13914
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13914-HADOOP-13345.000.patch, HADOOP-13914-HADOOP-13345.002.patch,
HADOOP-13914-HADOOP-13345.003.patch, HADOOP-13914-HADOOP-13345.004.patch, HADOOP-13914-HADOOP-13345.005.patch,
s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag stored in
S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB implementation,
and also sacrifices good code separation to minimize S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and preferred
solution.  I suggest we do this work after merging the HADOOP-13345 branch to trunk, but am
open to suggestions.
> I can also attach a patch of a integration test that exercises the missing case and demonstrates
a failure with DynamoDBMetadataStore.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message