hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling
Date Fri, 06 Jan 2017 21:29:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15805814#comment-15805814 ]

Steve Loughran commented on HADOOP-13914:
-----------------------------------------

Good writeup.

I had a talk with Mingliang & Rajesh about this.

We only want that flag as an optimisation of follow-on work in S3AFileSystem, so that when you
get a delete(path) you can do a getFileStatus and, if the status is a directory, see whether it
is empty (and so skip the need for recursive=true) without another round trip.
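
To make the optimisation concrete, here is a minimal sketch of that delete-time check. It is not the S3AFileSystem code: the `Status` class is a hypothetical stand-in for S3AFileStatus that carries only a directory flag and a cached empty-directory flag, and `mayDelete` is an illustrative helper name.

```java
import java.io.IOException;

public class DeleteSketch {

    /** Hypothetical stand-in for S3AFileStatus: only the fields the delete path reads. */
    public static final class Status {
        final boolean directory;
        final boolean emptyDirectory; // cached when the status was built

        public Status(boolean directory, boolean emptyDirectory) {
            this.directory = directory;
            this.emptyDirectory = emptyDirectory;
        }
    }

    /**
     * Decide whether a delete may proceed, using only the status already fetched.
     * No second round trip is made to discover whether the directory has children.
     */
    public static boolean mayDelete(Status status, boolean recursive) throws IOException {
        if (!status.directory) {
            return true; // plain file: the recursive flag does not matter
        }
        if (status.emptyDirectory) {
            return true; // empty directory: recursive=true is not required
        }
        if (!recursive) {
            throw new IOException("Directory is not empty and recursive is false");
        }
        return true; // non-empty directory, caller asked for a recursive delete
    }
}
```

The point of the cached flag is that the empty-directory branch is decided from the one getFileStatus result, which is the round trip being saved.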

With s3guard you don't need that caching of state. It can be done on demand, only in those
few cases where we actually need to know about it, which pushes for it being something that
the metadata store can work out on demand. We would need to document that the status field
is only valid without an MD store.
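
As a rough illustration of "work it out on demand", the sketch below asks a store for a directory's children only when the caller needs the answer. Everything here is hypothetical: `ChildLister` and `InMemoryStore` are invented for the example and are not the MetadataStore interface, and the check assumes the store holds a complete listing for the directory.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class OnDemandEmptyCheck {

    /** Hypothetical minimal view of a metadata store: list the children of a path. */
    public interface ChildLister {
        Collection<String> listChildren(String path);
    }

    /** Toy in-memory store, used only to exercise the check. */
    public static final class InMemoryStore implements ChildLister {
        private final Map<String, Set<String>> children = new HashMap<>();

        public void mkdir(String path) {
            children.computeIfAbsent(path, k -> new TreeSet<>());
        }

        public void put(String parent, String child) {
            children.computeIfAbsent(parent, k -> new TreeSet<>()).add(child);
        }

        @Override
        public Collection<String> listChildren(String path) {
            return children.getOrDefault(path, Collections.emptySet());
        }
    }

    /**
     * Computed at the moment it is needed, not cached in the file status.
     * Assumes the store's listing for the directory is complete.
     */
    public static boolean isEmptyDirectory(ChildLister store, String dir) {
        return store.listChildren(dir).isEmpty();
    }
}
```

With this shape, only the callers that care (such as delete) pay for the lookup, and nothing has to keep a stored flag in sync as children are added or removed.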




> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> --------------------------------------------------------
>
>                 Key: HADOOP-13914
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13914
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>            Assignee: Mingliang Liu
>         Attachments: s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
>
>
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag stored in
> S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB implementation,
> and also sacrifices good code separation to minimize S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and preferred
> solution.  I suggest we do this work after merging the HADOOP-13345 branch to trunk, but am
> open to suggestions.
> I can also attach a patch of an integration test that exercises the missing case and demonstrates
> a failure with DynamoDBMetadataStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

