hadoop-hdfs-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6165) "hdfs dfs -rm -r" and "hdfs -rmdir" commands can't remove empty directory
Date Tue, 29 Apr 2014 04:35:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983980#comment-13983980 ]

Yongjun Zhang commented on HDFS-6165:


Thanks a lot for your earlier comments, and thanks Andrew a lot for the detailed review!

I just updated the patch to version 005 to address them all.

For rmdir, it's the solution I described above.
For the rm -r solution, what I actually did is:
void checkPermission(String path, INodeDirectory root, boolean doCheckOwner,
      FsAction ancestorAccess, FsAction parentAccess, FsAction access,
      FsAction subAccess, boolean ignoreEmptyDir, boolean resolveLink)
The two parameters "subAccess" and "ignoreEmptyDir" work together:
- if subAccess is not NULL, the access permission of subdirectories is checked;
- when subAccess is checked and ignoreEmptyDir is true, empty
  directories are skipped.
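The interplay of the two parameters can be sketched roughly as below. This is a simplified, self-contained stand-in for the real FSPermissionChecker#checkSubAccess: the Node class and its boolean accessible flag are illustrative substitutes for the actual INode tree and FsAction permission checks.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the subAccess/ignoreEmptyDir interaction described
// above. Node and its "accessible" flag are stand-ins for the real INode
// tree and FsAction checks in HDFS.
public class SubAccessSketch {
    static class Node {
        final boolean accessible;            // does the caller hold subAccess here?
        final List<Node> children = new ArrayList<>();
        Node(boolean accessible) { this.accessible = accessible; }
    }

    /** Returns true iff every subdirectory passes the subAccess check,
     *  skipping empty directories when ignoreEmptyDir is set. */
    static boolean checkSubAccess(Node dir, boolean ignoreEmptyDir) {
        // An empty directory is exempt from the check when ignoreEmptyDir is true.
        if (dir.children.isEmpty() && ignoreEmptyDir) {
            return true;
        }
        if (!dir.accessible) {
            return false;
        }
        for (Node child : dir.children) {
            if (!checkSubAccess(child, ignoreEmptyDir)) {
                return false;
            }
        }
        return true;
    }
}
```

With ignoreEmptyDir set, an empty directory owned by another user no longer blocks the recursive delete, which is the behavior change this issue is after.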

To address Andrew's comments:
I think the semantics for a recursive delete via DistributedFileSystem#delete are still not
quite right. The change you made will work for the shell since it does its own recursion,
but we need to do the same "remove if empty dir with read" when recursing via recursive =
true too. You might be able to do this by modifying FSPermissionChecker#checkSubAccess appropriately,
but a new flag or new code would be safer.
Thanks a lot for pointing this out; it was indeed a problem there. See the solution described
above, except that, as we agreed, we don't need to check permission for empty directories.

    isDirectory, can we add per-parameter javadoc rather than stacking on the @return? I think
renaming empty to isEmpty would also help.
    Nit, also need a space in the ternary empty? and dir.isEmptyDirectory(src)?.
These are now gone with the new solution.

    In Delete, I think it's a bit cleaner to do an instanceof PathIsNotEmptyDirectoryException.class
check instead.
This is handled in a better way now. I discovered a bug HADOOP-10543 (and posted a patch)
when looking at this. With HADOOP-10543 committed, I would be able to do exactly what Andrew
suggested. But I think what I have in this new revision should be fine too.
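For reference, the check Andrew suggested would look roughly like the sketch below. This is only illustrative: the PathIsNotEmptyDirectoryException here is a local stub standing in for the real org.apache.hadoop.fs exception class, so the snippet is self-contained.

```java
import java.io.IOException;

// Self-contained sketch of the instanceof-based check suggested above.
// PathIsNotEmptyDirectoryException is a local stub standing in for the
// real org.apache.hadoop.fs exception class.
public class InstanceofCheckSketch {
    static class PathIsNotEmptyDirectoryException extends IOException {
        PathIsNotEmptyDirectoryException(String msg) { super(msg); }
    }

    /** Returns true iff the failure was a non-empty-directory error. */
    static boolean isNotEmptyDirError(IOException e) {
        // Note: instanceof takes the type name directly, not a .class literal.
        return e instanceof PathIsNotEmptyDirectoryException;
    }
}
```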
    Some lines longer than 80 chars
Hopefully all addressed :-)

I gave this a quick pass, but overall it may be better to rewrite these to use the DFS API
instead of the shell. We need to test recursive delete, which the shell doesn't do, and we
don't really have any shell changes in the latest rev, which lessens the importance of having
new shell tests.
I think adding test infrastructure like what I added gives another option here; hopefully the
new revision looks better :-)

    execCmd needs to do some try/finally to close and restore the streams if there's an exception.
Also an extra commented line there.
FsShell actually takes care of catching the exception, so the stream will not get lost. The
extra commented line is removed.

    Could we rename this file to "TestFsShellPermission"? Permission is a more standard term.

    This file also should not be in hadoop-tools, but rather hadoop-common.
Because it uses MiniDFSCluster, it cannot be in hadoop-common, but I moved it to the HDFS test area.

    This does a lot of starting and stopping of a MiniCluster for running single-line tests.
Can we combine these into a single test? We also don't need any DNs for this cluster, since
we're just testing perms.
I refactored the code to take care of this. Since we create files, I still keep the DNs.

    We have FileSystemTestHelper#createFile for creating files, can save some code
    Use of @Before and @After blocks might also clarify what's going on.
    This also should be a JUnit4 test with @Test annotations, not JUnit3.
    USER_UGI should not be all caps, it's not static final
    It's a bit ugly how we pass UNEXPECTED_RESULT in for a lot of tests. Can we just pass
a boolean for expectSuccess or expectFailure, or maybe a String that we can call assertExceptionContains
All are taken care of, except that I forgot @Before and @After, but hopefully it looks much better.

    FileEntry looks basically like a FileStatus, can we just use that instead?
FileEntry only has the fields needed for this test, and it's easier to manage in the test area.
I'm worried that using FileStatus would not be easy to control, so I didn't do this. Hope it's
OK.
Thanks in advance for a further review.

> "hdfs dfs -rm -r" and "hdfs -rmdir" commands can't remove empty directory 
> --------------------------------------------------------------------------
>                 Key: HDFS-6165
>                 URL: https://issues.apache.org/jira/browse/HDFS-6165
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.3.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>            Priority: Minor
>         Attachments: HDFS-6165.001.patch, HDFS-6165.002.patch, HDFS-6165.003.patch, HDFS-6165.004.patch,
HDFS-6165.004.patch, HDFS-6165.005.patch
> Given a directory owned by user A with WRITE permission containing an empty directory
owned by user B, it is not possible to delete user B's empty directory with either "hdfs dfs
-rm -r" or "hdfs dfs -rmdir", because the current implementation requires full permission
on the empty directory and throws an exception otherwise.
> On the other hand, on Linux, the "rm -r" and "rmdir" commands can remove an empty directory
as long as the parent directory has WRITE permission (and the prefix components of the path have
EXECUTE permission). Among the tested OSes, some prompt the user for confirmation and some don't.
> Here's a reproduction:
> {code}
> [root@vm01 ~]# hdfs dfs -ls /user/
> Found 4 items
> drwxr-xr-x   - userabc users               0 2013-05-03 01:55 /user/userabc
> drwxr-xr-x   - hdfs    supergroup          0 2013-05-03 00:28 /user/hdfs
> drwxrwxrwx   - mapred  hadoop              0 2013-05-03 00:13 /user/history
> drwxr-xr-x   - hdfs    supergroup          0 2013-04-14 16:46 /user/hive
> [root@vm01 ~]# hdfs dfs -ls /user/userabc
> Found 8 items
> drwx------   - userabc users          0 2013-05-02 17:00 /user/userabc/.Trash
> drwxr-xr-x   - userabc users          0 2013-05-03 01:34 /user/userabc/.cm
> drwx------   - userabc users          0 2013-05-03 01:06 /user/userabc/.staging
> drwxr-xr-x   - userabc users          0 2013-04-14 18:31 /user/userabc/apps
> drwxr-xr-x   - userabc users          0 2013-04-30 18:05 /user/userabc/ds
> drwxr-xr-x   - hdfs    users          0 2013-05-03 01:54 /user/userabc/foo
> drwxr-xr-x   - userabc users          0 2013-04-30 16:18 /user/userabc/maven_source
> drwxr-xr-x   - hdfs    users          0 2013-05-03 01:40 /user/userabc/test-restore
> [root@vm01 ~]# hdfs dfs -ls /user/userabc/foo/
> [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -r -skipTrash /user/userabc/foo
> rm: Permission denied: user=userabc, access=ALL, inode="/user/userabc/foo":hdfs:users:drwxr-xr-x
> {code}
> The super user can delete the directory.
> {code}
> [root@vm01 ~]# sudo -u hdfs hdfs dfs -rm -r -skipTrash /user/userabc/foo
> Deleted /user/userabc/foo
> {code}
> The same is not true for files, however. They have the correct behavior.
> {code}
> [root@vm01 ~]# sudo -u hdfs hdfs dfs -touchz /user/userabc/foo-file
> [root@vm01 ~]# hdfs dfs -ls /user/userabc/
> Found 8 items
> drwx------   - userabc users          0 2013-05-02 17:00 /user/userabc/.Trash
> drwxr-xr-x   - userabc users          0 2013-05-03 01:34 /user/userabc/.cm
> drwx------   - userabc users          0 2013-05-03 01:06 /user/userabc/.staging
> drwxr-xr-x   - userabc users          0 2013-04-14 18:31 /user/userabc/apps
> drwxr-xr-x   - userabc users          0 2013-04-30 18:05 /user/userabc/ds
> -rw-r--r--   1 hdfs    users          0 2013-05-03 02:11 /user/userabc/foo-file
> drwxr-xr-x   - userabc users          0 2013-04-30 16:18 /user/userabc/maven_source
> drwxr-xr-x   - hdfs    users          0 2013-05-03 01:40 /user/userabc/test-restore
> [root@vm01 ~]# sudo -u userabc hdfs dfs -rm -skipTrash /user/userabc/foo-file
> Deleted /user/userabc/foo-file
> {code}
> Using "hdfs dfs -rmdir" command:
> {code}
> bash-4.1$ hadoop fs -lsr /
> lsr: DEPRECATED: Please use 'ls -R' instead.
> drwxr-xr-x   - hdfs supergroup          0 2014-03-25 16:29 /user
> drwxr-xr-x   - hdfs   supergroup          0 2014-03-25 16:28 /user/hdfs
> drwxr-xr-x   - usrabc users               0 2014-03-28 23:39 /user/usrabc
> drwxr-xr-x   - abc    abc                 0 2014-03-28 23:39 /user/usrabc/foo-empty1
> [root@vm01 usrabc]# su usrabc
> [usrabc@vm01 ~]$ hdfs dfs -rmdir /user/usrabc/foo-empty1
> rmdir: Permission denied: user=usrabc, access=ALL, inode="/user/usrabc/foo-empty1":abc:abc:drwxr-xr-x
> {code}

This message was sent by Atlassian JIRA
