hadoop-common-issues mailing list archives

From "Dmitri Chmelev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-16090) deleteUnnecessaryFakeDirectories() creates unnecessary delete markers in a versioned S3 bucket
Date Mon, 04 Feb 2019 19:50:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760157#comment-16760157 ]

Dmitri Chmelev commented on HADOOP-16090:
-----------------------------------------

A slight optimization of #1 is to avoid calling the full-blown getFileStatus() and instead
issue a HEAD request per path component, making sure that the trailing slash is present.
Effectively, this probes for the existence of the fake directory without using list-objects.
However, it still keeps the cleanup at O(depth): we effectively trade write amplification
for read amplification. That is the patch I was currently planning to propose, predicated
on the addition of a "versioned.store" flag.
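A minimal sketch of that probe against the AWS SDK v1 APIs that S3A uses (the helper name is hypothetical, and client/bucket setup is elided):

{code:java}
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.s3.AmazonS3;

// Probe each ancestor "fake directory" key with a HEAD request
// (getObjectMetadata maps to HEAD) and keep only the keys that exist,
// so the subsequent delete touches real objects and mints no spurious
// delete markers.
static List<String> existingFakeDirKeys(AmazonS3 s3, String bucket, String objectKey) {
  List<String> present = new ArrayList<>();
  int slash = objectKey.lastIndexOf('/');
  while (slash > 0) {
    String dirKey = objectKey.substring(0, slash + 1); // trailing slash = fake dir
    try {
      s3.getObjectMetadata(bucket, dirKey);            // HEAD, no list-objects
      present.add(dirKey);
    } catch (AmazonServiceException e) {
      if (e.getStatusCode() != 404) {
        throw e;                                       // only swallow "not found"
      }
    }
    slash = objectKey.lastIndexOf('/', slash - 1);
  }
  return present;                                      // O(depth) HEAD requests
}
{code}

Each HEAD is one read per path component, which is exactly the read amplification mentioned above.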

I am not sure #3 works if we limit the search to the immediate parent directory. The problem
is "mkdir -p" and copyFromLocalFile(): either can shadow an existing empty directory present
anywhere along the destination path. One idea I had was to bail out of the search as soon as
a candidate path component is confirmed to have two 'non-fake' children (see the sketch
below). However, this is not ideal for two reasons: 1) race conditions when multiple clients
create objects in the same directory can defeat the check; 2) the pathological case where
every path component along the path has exactly one child (likely not common, assuming an
expected branching factor > 1). As far as #1 goes, I am concerned in general that the
handling of fake directories today is racy and could lead to inconsistencies when multiple
writers are involved (fake directories incorrectly created or removed, breaking lookup).
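To make the bail-out concrete, a sketch of the two-'non-fake'-children probe (the helper name is hypothetical, and it is subject to the races noted above):

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ListObjectsRequest;
import com.amazonaws.services.s3.model.ObjectListing;
import com.amazonaws.services.s3.model.S3ObjectSummary;

// True if the directory prefix has at least two children besides the fake
// directory marker itself. A listing capped at three keys is enough to decide.
static boolean hasTwoNonFakeChildren(AmazonS3 s3, String bucket, String dirKey) {
  ListObjectsRequest req = new ListObjectsRequest()
      .withBucketName(bucket)
      .withPrefix(dirKey)     // dirKey ends with '/'
      .withDelimiter("/")     // immediate children only
      .withMaxKeys(3);        // fake-dir marker plus two children at most
  ObjectListing listing = s3.listObjects(req);
  int children = listing.getObjectSummaries().size()
      + listing.getCommonPrefixes().size();
  for (S3ObjectSummary s : listing.getObjectSummaries()) {
    if (s.getKey().equals(dirKey)) {
      children--;             // exclude the fake directory marker itself
    }
  }
  return children >= 2;
}
{code}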

Regarding innerMkdirs() not deleting fake dirs, I believe this was fixed in HADOOP-14255.
I intend to cherry-pick that change, as I reached the same conclusion while reading the code.

As for HADOOP-13421, it was already on my radar, and I was curious whether it could be
backported easily to 2.8.x. Thanks for the heads-up about the SDK version. I believe it does
not solve the underlying problem of delete marker accumulation, though the accumulation could
also be mitigated by adding lifecycle policy rules to perform the cleanup (sketched below).
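For reference, a sketch of such a rule via the v1 SDK (the bucket name, rule id, and retention period are placeholders; whether Rule exposes the expired-object-delete-marker setter depends on the SDK version, so treat this as an assumption to verify):

{code:java}
import java.util.Collections;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.BucketLifecycleConfiguration;

public class DeleteMarkerCleanup {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    // Expire noncurrent versions after 30 days, and let S3 remove delete
    // markers that no longer shield any noncurrent versions.
    BucketLifecycleConfiguration.Rule rule = new BucketLifecycleConfiguration.Rule()
        .withId("purge-stale-delete-markers")        // hypothetical rule id
        .withStatus(BucketLifecycleConfiguration.ENABLED)
        .withExpiredObjectDeleteMarker(true)
        .withNoncurrentVersionExpirationInDays(30);  // retention is a policy choice
    s3.setBucketLifecycleConfiguration("example-bucket",
        new BucketLifecycleConfiguration(Collections.singletonList(rule)));
  }
}
{code}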

> deleteUnnecessaryFakeDirectories() creates unnecessary delete markers in a versioned S3 bucket
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16090
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.1
>            Reporter: Dmitri Chmelev
>            Priority: Minor
>
> The fix that avoids calls to getFileStatus() for each path component in deleteUnnecessaryFakeDirectories()
> (HADOOP-13164) results in an accumulation of delete markers in versioned S3 buckets. That
> patch replaced the getFileStatus() checks with a single batch delete request covering all
> ancestor keys generated from a given path. Since the delete request does not check for the
> existence of fake directories, it creates a delete marker for every path component that did
> not exist (or was previously deleted). Note that issuing a DELETE request without specifying
> a version ID will always create a new delete marker, even if one already exists ([AWS S3
> Developer Guide|https://docs.aws.amazon.com/AmazonS3/latest/dev/RemDelMarker.html]).
> Since deleteUnnecessaryFakeDirectories() is called as a callback on successful writes and
> on renames, delete markers accumulate quickly, and their rate of accumulation is inversely
> proportional to the depth of the path. In other words, directories closer to the root
> accumulate more delete markers than the leaves.
> This behavior negatively impacts the performance of getFileStatus() when it has to issue a
> listObjects() request (especially v1), as the delete markers have to be examined while the
> request searches for the first current (non-deleted) version of an object following a given
> prefix.
> I did a quick comparison against 3.x and the issue is still present: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2947
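To make the quoted mechanism concrete, a sketch (helper names hypothetical) of the ancestor-key expansion plus the unconditional batch delete it feeds:

{code:java}
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.DeleteObjectsRequest;
import com.amazonaws.services.s3.model.DeleteObjectsRequest.KeyVersion;

// "a/b/c/file" -> ["a/b/c/", "a/b/", "a/"]: one key per ancestor fake dir.
static List<KeyVersion> ancestorDirKeys(String objectKey) {
  List<KeyVersion> keys = new ArrayList<>();
  int slash = objectKey.lastIndexOf('/');
  while (slash > 0) {
    keys.add(new KeyVersion(objectKey.substring(0, slash + 1)));
    slash = objectKey.lastIndexOf('/', slash - 1);
  }
  return keys;
}

// No version IDs are supplied, so on a versioned bucket every key gets a new
// delete marker on each call, even if nothing was there to delete.
static void deleteFakeDirs(AmazonS3 s3, String bucket, String objectKey) {
  s3.deleteObjects(new DeleteObjectsRequest(bucket)
      .withKeys(ancestorDirKeys(objectKey)));
}
{code}

Note how "a/" appears in the expansion of every write underneath it, which is why markers pile up fastest near the root.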



