hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andras Bokor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8690) Shell may remove a file without going to trash even if skipTrash is not enabled
Date Fri, 28 Jul 2017 14:34:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105035#comment-16105035

Andras Bokor commented on HADOOP-8690:


You are right and you are not at the same time.
The first part of your description is correct. If the trash directory cannot be created the
file will be deleted without moving into the trash. Please find my PoC below.
The second part is not true. The code you linked here is a bit misleading. I think your understanding
is that it examines whether the file to delete (from anywhere on the filesystem) has already
been in the trash or not. But in fact, it checks that you try to delete a file from trash.
So the condition is false if you delete /user/knoguchi/abcdef recreate then delete again.
But it will be true and the method will return false if you try to delete /user/knoguchi/.Trash/Current/user/knoguchi/abcdef
file. That is a correct behavior.

My PoC: it has 4 asserts. I assert that the:
* trash is enabled
* rm command success
* file does not exist
* file is not in the trash which means potential dataloss

public void testFileNotDeletedWhenTrashCannotBeCreated()
      throws Exception {
    Configuration conf = new Configuration();
    conf.set(FS_TRASH_INTERVAL_KEY, "10");
    conf.setClass("fs.file.impl", TestLFS.class, FileSystem.class);

    assertTrue(new Trash(conf).isEnabled());

    FileSystem fs = FileSystem.getLocal(conf);

    Path myFolder = new Path(GenericTestUtils.getRandomizedTempPath());
    Path myFile = new Path(myFolder, "myFile");
    writeFile(fs, myFile, 10);

    FsShell shell = new FsShell(conf);

    int val = -1;
    FsPermission origPermission = fs.getFileStatus(TEST_DIR.getParent()).getPermission();
    try {
      fs.setPermission(TEST_DIR.getParent(), new FsPermission("555"));
      assertEquals(shell.run(new String[]{"-rm", myFile.toString()}), 0);
      checkNotInTrash(fs, shell.getCurrentTrashDir(), myFile.toString());
    } catch (Exception e) {
      System.err.println("Exception raised from Trash.run " +
    } finally {
      fs.setPermission(TEST_DIR.getParent(), origPermission);
      fs.delete(myFolder, true);

This code passes but it should fail.
I will upload a patch.

> Shell may remove a file without going to trash even if skipTrash is not enabled
> -------------------------------------------------------------------------------
>                 Key: HADOOP-8690
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8690
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha
>            Reporter: Eli Collins
>            Assignee: Andras Bokor
>            Priority: Minor
> Delete.java contains the following comment:
> {noformat}
> // TODO: if the user wants the trash to be used but there is any
> // problem (ie. creating the trash dir, moving the item to be deleted,
> // etc), then the path will just be deleted because moveToTrash returns
> // false and it falls thru to fs.delete.  this doesn't seem right
> {noformat}
> If Trash#moveToAppropriateTrash returns false FsShell will delete the path even if skipTrash
is not enabled. The comment isn't quite right as some of these failure scenarios result in
exceptions not a false return value, and in the case of an exception we don't unconditionally
delete the path. TrashPolicy#moveToTrash states that it only returns false if the item is
already in the trash or trash is disabled, and the expected behavior for these cases is to
just delete the path. However TrashPolicyDefault#moveToTrash also returns false if there's
a problem creating the trash directory, so for this case I think we should throw an exception
rather than return false (and delete the path bypassing trash).
> I also question the behavior of just deleting when the item is already in the trash as
it may have changed since previously put in the trash and not been checkpointed yet. Seems
like in this case we should move it to trash but with a file name suffix.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message