hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yuanbo Liu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-10756) Expose getTrashRoot to HTTPFS and WebHDFS
Date Tue, 30 Aug 2016 09:15:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448520#comment-15448520
] 

Yuanbo Liu edited comment on HDFS-10756 at 8/30/16 9:14 AM:
------------------------------------------------------------

[~jojochuang] / [~xiaochen] Sorry for the late response, I'm quite busy last week.

1. From Xiaochen
{quote}
I don't have a strong feeling regarding splitting the jira.
{quote}
I thought this patch might be too large to review. I'm OK if we use a single jira to track
it.

{quote}
In both implementations'......
{quote}
Good catch! I've changed my code in V2 patch. One thing I should high line is that I didn't
write EZ test case in {{BaseTestHttpFSWith}}. I've reviewed the code in {{BaseTestHttpFSWith}},
it would be tricky if I add EZ test case here. So I added such kind of code in {{TestHttpFSServer}}.

I also change the code in {{DistributedFileSystem#getTrashRoot}}, here is the change
{code}
-    if ((path == null) || path.isRoot() || !dfs.isHDFSEncryptionEnabled()) {
+    if ((path == null) || path.isRoot()) {
       return super.getTrashRoot(path);
     }
-    String parentSrc = path.getParent().toUri().getPath();
+    String pathSrc = path.toUri().getPath();
{code}
Because the configuration of client may not be up-to-date, it's not accurate to use client's
{{isHDFSEncryptionEnabled}} to detect whether EZ is enabled.
Also using the parent path to get EZ for the path is not right in some cases. For example,
if the EZ path is "/zone", the parent path "/" is a normal path, {{getTrashRoot}} returns
wrong info about "/zone".

2. From Weichiu
{quota}
when logging an exception..
{quota}
Good suggestions! I think my v2 patch has addressed them.

{quote}
it is important to note that the application should not assume the trash directory returned
exists
{quote}
First I thought returning null in {{getTrashRoot}} may help to remind users that the trash
path may not exist. But Xiaochen reminded me that returning null is not the same behavior
in DFS's getTrashRoot. So I wrote some test cases personally and found that
1. If the input path is a normal path, the result of trash root is related to the user who
invoke {{getTrashRoot}}
2. If the input path is a EZ path, the result of trash root is related to the EZ path and
the user.
3. If exception happens int item 2), it falls back to the default implement aka item 1) and
returns "/user/$USER/.Trash"
4. The result of trash root is not necessarily existing in HDFS.
In case of the behaviors of DFS,  I didn't write code to inform users that the trash path
may not exist.

Thanks for your time and I hope to get your thoughts.



was (Author: yuanbo):
[~jojochuang] / [~xiaochen] Sorry for the late response, I'm quite busy last week.

1. From Xiaochen
{quote}
I don't have a strong feeling regarding splitting the jira.
{quote}
I thought this patch might be too large to review. I'm OK if we use a single jira to track
it.

{quote}
In both implementations'......
{quote}
Good catch! I've changed my code in V2 patch. One thing I should high line is that I didn't
write EZ test case in {{BaseTestHttpFSWith}}. I've reviewed the code in {{BaseTestHttpFSWith}},
it would be tricky if I add EZ test case here. So I added such kind of code in {{TestHttpFSServer}}.

I also change the code in {{DistributedFileSystem#getTrashRoot}}, here is the change
{code}
-    if ((path == null) || path.isRoot() || !dfs.isHDFSEncryptionEnabled()) {
+    if ((path == null) || path.isRoot()) {
       return super.getTrashRoot(path);
     }
-    String parentSrc = path.getParent().toUri().getPath();
+    String pathSrc = path.toUri().getPath();
{code}
Because the configuration of client may not be up-to-date, it's not accurate to use client's
{{isHDFSEncryptionEnabled}} to detect whether EZ is enabled.
Also using the parent path to get EZ for the path is not right in some cases. For example,
if the EZ path is "/zone", the parent path "/" is a normal path, {{getTrashRoot}} returns
wrong info about "/zone".

2. From Weichiu
{quota}
when logging an exception..
{quota}
Good suggestions! I think my v2 patch has addressed them.

{quota}
it is important to note that the application should not assume the trash directory returned
exists
{quota}
First I thought returning null in {{getTrashRoot}} may help to remind users that the trash
path may not exist. But Xiaochen reminded me that returning null is not the same behavior
in DFS's getTrashRoot. So I wrote some test cases personally and found that
1. If the input path is a normal path, the result of trash root is related to the user who
invoke {{getTrashRoot}}
2. If the input path is a EZ path, the result of trash root is related to the EZ path and
the user.
3. If exception happens int item 2), it falls back to the default implement aka item 1) and
returns "/user/$USER/.Trash"
4. The result of trash root is not necessarily existing in HDFS.
In case of the behaviors of DFS,  I didn't write code to inform users that the trash path
may not exist.

Thanks for your time and I hope to get your thoughts.


> Expose getTrashRoot to HTTPFS and WebHDFS
> -----------------------------------------
>
>                 Key: HDFS-10756
>                 URL: https://issues.apache.org/jira/browse/HDFS-10756
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: encryption, httpfs, webhdfs
>            Reporter: Xiao Chen
>            Assignee: Yuanbo Liu
>         Attachments: HDFS-10756.001.patch, HDFS-10756.002.patch
>
>
> Currently, hadoop FileSystem API has [getTrashRoot|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2708]
to determine trash directory at run time. Default trash dir is under {{/user/$USER}}
> For an encrypted file, since moving files between/in/out of EZs are not allowed, when
an EZ file is deleted via CLI, it calls in to [DFS implementation|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java#L2485]
to move the file to a trash directory under the same EZ.
> This works perfectly fine for CLI users or java users who call FileSystem API. But for
users via httpfs/webhdfs, currently there is no way to figure out what the trash root would
be. This jira is proposing we add such interface to httpfs and webhdfs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message