hadoop-common-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13760) S3Guard: add delete tracking
Date Fri, 19 May 2017 13:04:04 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-13760:
-----------------------------------
    Attachment: HADOOP-13760-HADOOP-13345.008.patch

I don't see a way to refactor the tombstone logic out of s3GetFileStatus completely
without making it messy, but I did go with two isEmpty functions as you suggested, which
at least minimizes the diff.

While reviewing my own patch, I also noticed that although listFiles had logic to avoid
returning tombstones, it did NOT have logic to reconcile the S3 results against those
tombstones. So I modified the LeafNodesIterator (now called MetadataStoreListFilesIterator
to reflect its more specialized purpose) to provide access to both leaf nodes and tombstones,
and added another Iterator that takes the combined results and filters out recently deleted
entries. I also added a test that checks that rename works correctly when it operates on
recently deleted entries (it fails without my latest changes).
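
The reconciliation step described above can be illustrated roughly as follows. This is only
a minimal sketch, not the code from the attached patch: the class name
TombstoneFilteringIterator is hypothetical, and paths are modeled as plain Strings instead of
Hadoop Path or FileStatus objects. It wraps a listing iterator and skips any entry whose path
appears in a set of tombstones.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

/**
 * Hypothetical sketch (not the class from the patch): wraps a listing
 * iterator and drops every path that has a tombstone recorded for it.
 * Paths are plain Strings here to keep the example self-contained.
 */
final class TombstoneFilteringIterator implements Iterator<String> {
  private final Iterator<String> source;   // e.g. combined S3 + metadata store results
  private final Set<String> tombstones;    // paths recorded as recently deleted
  private String next;                     // look-ahead element, null if none buffered

  TombstoneFilteringIterator(Iterator<String> source, Set<String> tombstones) {
    this.source = source;
    this.tombstones = tombstones;
  }

  @Override
  public boolean hasNext() {
    // Advance until a non-tombstoned path is buffered or the source is exhausted.
    // Null elements from the source are skipped as well.
    while (next == null && source.hasNext()) {
      String candidate = source.next();
      if (candidate != null && !tombstones.contains(candidate)) {
        next = candidate;
      }
    }
    return next != null;
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    String result = next;
    next = null;
    return result;
  }
}
```

With a listing of /a/1, /a/2, /a/3 and a tombstone for /a/2, iterating over this wrapper
yields only /a/1 and /a/3.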

I have one TODO - I think we can expand what the MetadataStoreListFilesIterator will return
if we encounter directories that we think are empty and are authoritative. But right now this
has good test coverage, passes all tests with all 3 implementations, and is feeling pretty
good to me.

> S3Guard: add delete tracking
> ----------------------------
>
>                 Key: HADOOP-13760
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13760
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-13760-HADOOP-13345.001.patch, HADOOP-13760-HADOOP-13345.002.patch,
>                      HADOOP-13760-HADOOP-13345.003.patch, HADOOP-13760-HADOOP-13345.004.patch,
>                      HADOOP-13760-HADOOP-13345.005.patch, HADOOP-13760-HADOOP-13345.006.patch,
>                      HADOOP-13760-HADOOP-13345.007.patch, HADOOP-13760-HADOOP-13345.008.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add delete
> tracking.
> Current behavior on delete is to remove the metadata from the MetadataStore.  To make
> deletes consistent, we need to add an {{isDeleted}} flag to {{PathMetadata}} and check it when
> returning results from functions like {{getFileStatus()}} and {{listStatus()}}.  In HADOOP-13651,
> I added TODO comments in most of the places where these new conditions are needed.  The work does
> not look too bad.
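
The idea in the description above, keeping a flagged entry on delete instead of removing the
metadata, might look something like the following. This is a sketch under stated assumptions
and not the S3Guard implementation: DeleteTrackingSketch, Entry, Lookup, and all method names
are hypothetical, and only the isDeleted flag comes from the issue text. The point it shows
is that a lookup has three outcomes: the path is known and live, the path is known to be
deleted (a tombstone), or the store has no record of it.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of delete tracking in an in-memory metadata store.
 * Only the isDeleted flag is taken from the issue description; every
 * other name here is an assumption made for illustration.
 */
final class DeleteTrackingSketch {
  /** Outcome of a lookup: live entry, tombstone, or no record at all. */
  enum Lookup { FOUND, DELETED, UNKNOWN }

  /** Stand-in for PathMetadata, reduced to a path plus the flag. */
  static final class Entry {
    final String path;
    final boolean isDeleted;

    Entry(String path, boolean isDeleted) {
      this.path = path;
      this.isDeleted = isDeleted;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();

  /** Record a live path. */
  void put(String path) {
    entries.put(path, new Entry(path, false));
  }

  /** Record a tombstone instead of removing the entry. */
  void delete(String path) {
    entries.put(path, new Entry(path, true));
  }

  /** A getFileStatus()-style caller would check the flag before returning. */
  Lookup lookup(String path) {
    Entry e = entries.get(path);
    if (e == null) {
      return Lookup.UNKNOWN;   // no record: the caller would have to consult S3
    }
    return e.isDeleted ? Lookup.DELETED : Lookup.FOUND;
  }
}
```

Under this sketch, a caller that sees DELETED can report the path as missing even if S3 still
lists it, while UNKNOWN means the store cannot answer on its own.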



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
