hadoop-common-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12358) FSShell should prompt before deleting directories bigger than a configured size
Date Thu, 27 Aug 2015 20:34:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717481#comment-14717481 ]

Arpit Agarwal commented on HADOOP-12358:

There are three concerns from the Jira:
# Compatibility. The checks are off by default, so existing behavior is unchanged.
# {{getContentSummary}} requires too many RPCs for filesystems other than DFS. This is a valid concern.
# Configuration complexity.
** We can get rid of the boolean setting, e.g. just disable the check if the thresholds are
zero or negative. If we also get rid of the size-based threshold we only need one new setting,
for the inode count threshold.
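A minimal sketch of that single-threshold convention, where one inode-count setting doubles as the on/off switch (the {{DeleteGuard}} helper is hypothetical, purely for illustration):

```java
/** Sketch of interpreting a single inode-count threshold as both the
 *  limit and the enable/disable switch (hypothetical helper, not part
 *  of any patch). */
public class DeleteGuard {
    /** A zero or negative threshold disables the check entirely. */
    public static boolean isCheckEnabled(long inodeThreshold) {
        return inodeThreshold > 0;
    }

    /** Returns true if deleting {@code inodeCount} inodes may proceed
     *  without confirmation under the given threshold. */
    public static boolean mayDelete(long inodeThreshold, long inodeCount) {
        return !isCheckEnabled(inodeThreshold) || inodeCount <= inodeThreshold;
    }
}
```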

Does it make sense to move this check to the NN? NN already does a recursive permissions check
for every delete call ({{FsPermissionChecker#checkSubAccess}}). A suggested approach:
# Add a {{FileSystem#delete}} overload that takes a threshold. 
# Extend the recursive permissions check to compute the number of descendant inodes. It is
a little ugly but avoids recursing twice. We can skip the file size check.
# If the computed inode count is below the threshold, the dir is deleted; else the call fails.
# If the call fails, the shell command shows your prompt. If the user chooses Y, it invokes
the regular delete call.
# If the underlying filesystem does not support checking the threshold then it just performs
a regular delete. This takes care of the first concern above.
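The NN-side part of the approach above could be sketched roughly as follows. The {{Inode}} tree and method names here are stand-ins for illustration, not actual NameNode code; the point is counting descendants during the one recursive walk that already happens, rather than recursing twice:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the NN-side idea: count descendant inodes during a single
 *  recursive walk (as checkSubAccess already walks the subtree) and
 *  reject the delete if the count exceeds the caller's threshold.
 *  The Inode class is a stand-in for the real NN structures. */
public class ThresholdDelete {
    static class Inode {
        final List<Inode> children = new ArrayList<>();
        Inode child() { Inode c = new Inode(); children.add(c); return c; }
    }

    /** Counts inodes in the subtree, bailing out once the limit is
     *  exceeded so an oversized tree is not walked in full. */
    static long countUpTo(Inode dir, long limit) {
        long count = 1; // the directory itself
        for (Inode child : dir.children) {
            count += countUpTo(child, limit - count);
            if (count > limit) return count; // early exit, no second pass
        }
        return count;
    }

    /** Returns true if the delete proceeded, false if it failed the check. */
    static boolean delete(Inode dir, long inodeThreshold) {
        if (inodeThreshold > 0 && countUpTo(dir, inodeThreshold) > inodeThreshold) {
            return false; // caller (the shell) may now prompt and retry
        }
        // ... perform the regular recursive delete here ...
        return true;
    }
}
```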

This still has the potential to break automation when the feature is enabled, so we can make
the default behavior simply to fail the delete call. An additional parameter can allow prompting
to override the check.
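The shell-side flow just described might look like the sketch below. The {{Fs}} interface and its {{deleteWithThreshold}} method are hypothetical stand-ins for the proposed {{FileSystem#delete}} overload:

```java
import java.util.function.Supplier;

/** Sketch of the shell-side flow: try the threshold-checked delete first;
 *  if it is rejected, either fail outright (the automation-safe default)
 *  or, when prompting is enabled, ask the user and retry with a plain
 *  delete. Fs is a hypothetical stand-in for FileSystem. */
public class ShellDeleteFlow {
    interface Fs {
        boolean deleteWithThreshold(String path, long threshold); // proposed overload
        void delete(String path);                                 // regular recursive delete
    }

    /** Returns true if the path was deleted. */
    static boolean run(Fs fs, String path, long threshold,
                       boolean promptEnabled, Supplier<Boolean> askUser) {
        if (fs.deleteWithThreshold(path, threshold)) {
            return true;                      // under the threshold: deleted
        }
        if (promptEnabled && askUser.get()) { // user answered Y at the prompt
            fs.delete(path);                  // fall back to the regular delete
            return true;
        }
        return false;                         // default: fail, nothing deleted
    }
}
```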

> FSShell should prompt before deleting directories bigger than a configured size
> -------------------------------------------------------------------------------
>                 Key: HADOOP-12358
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12358
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HADOOP-12358.00.patch, HADOOP-12358.01.patch, HADOOP-12358.02.patch,
> We have seen many cases with customers deleting data inadvertently with -skipTrash. The
> FSShell should prompt the user if the size of the data or the number of files being deleted is
> bigger than a threshold even though -skipTrash is being used.

