hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13308) S3A delete and rename may fail to preserve parent directory.
Date Wed, 22 Jun 2016 16:01:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15344599#comment-15344599 ]

Chris Nauroth commented on HADOOP-13308:

bq. you could actually do the PUT without doing the check, couldn't you?

The potential side effect here would be an erroneous count on the "created directories" metric.
Alternatively, maybe we say that fake directory creation doesn't count towards that metric
at all, because it wasn't logically creating a new directory, even though the implementation
needed to mutate something in the bucket.  If only S3 offered an atomic put-if-not-exists.
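
To make the trade-off concrete, here is a rough sketch of the two options, using an in-memory
set as a stand-in for the bucket.  The class, method and counter names are all hypothetical and
this is not the S3A code; it only illustrates how an unconditional PUT would skew a
"created directories" style counter.

```java
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for a bucket; not the S3A implementation. */
public class FakeDirPutSketch {
    private final TreeSet<String> keys = new TreeSet<>();
    // Stand-in for the "created directories" metric (assumed name).
    private long createdDirectories = 0;

    /** Check first, then PUT: the counter stays accurate, but the two steps are not atomic. */
    public void putFakeDirWithCheck(String dirKey) {
        if (!keys.contains(dirKey)) {
            keys.add(dirKey);
            createdDirectories++;
        }
    }

    /** Unconditional PUT: no check, but the counter moves even if the marker already existed. */
    public void putFakeDirUnchecked(String dirKey) {
        keys.add(dirKey);
        createdDirectories++;
    }

    public long createdDirectories() {
        return createdDirectories;
    }

    public int markerCount() {
        return keys.size();
    }

    public static void main(String[] args) {
        FakeDirPutSketch checked = new FakeDirPutSketch();
        checked.putFakeDirWithCheck("data/logs/");
        checked.putFakeDirWithCheck("data/logs/");

        FakeDirPutSketch unchecked = new FakeDirPutSketch();
        unchecked.putFakeDirUnchecked("data/logs/");
        unchecked.putFakeDirUnchecked("data/logs/");

        System.out.println("checked:   " + checked.createdDirectories());
        System.out.println("unchecked: " + unchecked.createdDirectories());
    }
}
```

In both cases the bucket ends up holding a single marker; only the counter differs, which is why
excluding fake directory creation from the metric would sidestep the problem.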

bq. The biggest problem with having spurious parent dirs is that delete/ itself may do some
checks for empty directories and handle them.

Yes, this would need to be thought through carefully and checked for any repair requirements.

bq. Maybe we need to think about some s3 fsck routine which does cleanup.

I just filed HADOOP-13311 to discuss some ideas I have about possibly adding a new {{s3a}}
shell entry point.

> S3A delete and rename may fail to preserve parent directory.
> ------------------------------------------------------------
>                 Key: HADOOP-13308
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13308
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Priority: Minor
> When a file or directory is deleted or renamed in S3A, and the result of that operation
makes the parent empty, S3A must store a fake directory (a pure metadata object) at the parent
to indicate that the directory still exists.  The logic for restoring fake directories is
not resilient to a process death.  This may cause a directory to vanish unexpectedly after
a deletion or rename of its last child.
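
The failure window described above can be sketched with an in-memory model of the bucket.  This
is only an illustration under assumed names (FakeDirRestoreSketch, deleteFile and the
dieBeforeRestore flag are all hypothetical), not the actual S3A logic: the delete and the
restoration of the parent marker are two separate steps, and a process death between them
leaves no object under the parent prefix.

```java
import java.util.TreeSet;

/** Hypothetical in-memory model of a bucket; illustrates the two-step delete, not S3A itself. */
public class FakeDirRestoreSketch {
    private final TreeSet<String> keys = new TreeSet<>();

    public void put(String key) {
        keys.add(key);
    }

    /** A directory is visible if its marker exists or any key sits under its prefix. */
    public boolean directoryExists(String dirKey) {
        if (keys.contains(dirKey)) {
            return true;
        }
        // Keys sharing the prefix sort immediately after the prefix itself.
        String next = keys.higher(dirKey);
        return next != null && next.startsWith(dirKey);
    }

    /**
     * Deletes a file, then restores the parent's fake directory marker if the parent
     * became empty. dieBeforeRestore simulates a process death between the two steps.
     */
    public void deleteFile(String fileKey, boolean dieBeforeRestore) {
        keys.remove(fileKey);                       // step 1: delete the object
        if (dieBeforeRestore) {
            return;                                 // simulated crash: step 2 never runs
        }
        String parent = fileKey.substring(0, fileKey.lastIndexOf('/') + 1);
        if (!parent.isEmpty() && !directoryExists(parent)) {
            keys.add(parent);                       // step 2: restore the fake directory
        }
    }

    public static void main(String[] args) {
        FakeDirRestoreSketch ok = new FakeDirRestoreSketch();
        ok.put("data/logs/part-0000");
        ok.deleteFile("data/logs/part-0000", false);
        System.out.println("normal delete, parent exists: " + ok.directoryExists("data/logs/"));

        FakeDirRestoreSketch crashed = new FakeDirRestoreSketch();
        crashed.put("data/logs/part-0000");
        crashed.deleteFile("data/logs/part-0000", true);
        System.out.println("crash mid-delete, parent exists: " + crashed.directoryExists("data/logs/"));
    }
}
```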
