hadoop-common-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-14172) S3Guard: import does not import empty directory
Date Tue, 14 Mar 2017 01:24:41 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-14172:
    Attachment: HADOOP-14172-HADOOP-13345.001.patch

Patch attached. I ran all S3 tests with and without -Ddynamo -Ds3guard, and confirmed that the
modified test passes with my changes but fails without them. I went with the approach of (internally)
allowing a custom Acceptor to be passed in, and implementing a new listFilesAndDirectories
function that uses it.
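
For readers who have not opened the patch, the shape of that approach can be sketched roughly as follows. This is a toy model and not the patch itself: Entry, the Acceptor interface, the two constants, and the method bodies are hypothetical stand-ins written for illustration, and nothing here calls the real S3AFileSystem.

```java
import java.util.ArrayList;
import java.util.List;

class AcceptorSketch {
    /** Hypothetical stand-in for one listing result; not Hadoop's FileStatus. */
    static final class Entry {
        final String path;
        final boolean isDirectory;

        Entry(String path, boolean isDirectory) {
            this.path = path;
            this.isDirectory = isDirectory;
        }
    }

    /** Hypothetical filter deciding which raw listing entries are returned. */
    interface Acceptor {
        boolean accept(Entry entry);
    }

    // Assumed behaviour for illustration: one filter keeps files only,
    // the other keeps everything, including directories with no children.
    static final Acceptor FILES_ONLY = e -> !e.isDirectory;
    static final Acceptor FILES_AND_DIRECTORIES = e -> true;

    /** Shared internal listing that takes the acceptor as a parameter. */
    static List<Entry> list(List<Entry> raw, Acceptor acceptor) {
        List<Entry> accepted = new ArrayList<>();
        for (Entry e : raw) {
            if (acceptor.accept(e)) {
                accepted.add(e);
            }
        }
        return accepted;
    }

    /** Existing-style entry point: directories are filtered out. */
    static List<Entry> listFiles(List<Entry> raw) {
        return list(raw, FILES_ONLY);
    }

    /** New-style entry point: same listing, different acceptor. */
    static List<Entry> listFilesAndDirectories(List<Entry> raw) {
        return list(raw, FILES_AND_DIRECTORIES);
    }

    public static void main(String[] args) {
        List<Entry> raw = List.of(
                new Entry("/data/a.txt", false),
                new Entry("/data/empty", true));
        System.out.println(listFiles(raw).size());               // prints 1
        System.out.println(listFilesAndDirectories(raw).size()); // prints 2
    }
}
```

The point of the sketch is that the empty directory only survives when the caller can choose the acceptor, so the import needs the second entry point rather than a second listing implementation.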

> S3Guard: import does not import empty directory
> -----------------------------------------------
>                 Key: HADOOP-14172
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14172
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14172-HADOOP-13345.001.patch
> It imports everything that comes up from listFiles, which includes only files (and their parent
directories as a side-effect), so empty directories are never imported. My first thought on fixing this would be to
extend S3AFileSystem with an optional parameter to use AcceptAllButSelfAndS3nDirs instead of AcceptFilesOnly.
But we could also manually traverse the tree to get all FileStatus objects directory by directory,
like we do for diff. That's far slower but doesn't add surface area to S3AFileSystem. But
there's also the impact on other S3 clients to worry about - I could go either way on that.
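
The alternative mentioned in the description, walking the tree directory by directory, might look roughly like the sketch below. It is a toy model under stated assumptions: Node, DirectoryLister, and walk are hypothetical names, and the in-memory map stands in for a per-directory listing call. It is not the code used for diff.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

class TreeWalkSketch {
    /** Hypothetical stand-in for one directory entry; not Hadoop's FileStatus. */
    static final class Node {
        final String path;
        final boolean isDirectory;

        Node(String path, boolean isDirectory) {
            this.path = path;
            this.isDirectory = isDirectory;
        }
    }

    /** Hypothetical one-level listing: returns the direct children of a directory. */
    interface DirectoryLister {
        List<Node> listChildren(String directory);
    }

    /** Collects every file and directory under root, one listing call per directory. */
    static List<Node> walk(String root, DirectoryLister lister) {
        List<Node> found = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String dir = pending.pop();
            for (Node child : lister.listChildren(dir)) {
                found.add(child);
                if (child.isDirectory) {
                    pending.push(child.path); // visit later, even if it turns out to be empty
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<Node>> tree = Map.of(
                "/data", List.of(new Node("/data/a.txt", false), new Node("/data/empty", true)),
                "/data/empty", List.of());
        int[] calls = {0};
        List<Node> all = walk("/data", dir -> {
            calls[0]++;
            return tree.getOrDefault(dir, List.of());
        });
        System.out.println(all.size() + " entries, " + calls[0] + " listing calls");
    }
}
```

The walk picks up the empty directory without touching the filesystem class, but it pays one listing call for every directory it visits, which is the cost the description is weighing.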

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
