hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-10263) Reversed snapshot diff report contains incorrect entries
Date Wed, 06 Apr 2016 00:54:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227468#comment-15227468 ]

Jing Zhao edited comment on HDFS-10263 at 4/6/16 12:54 AM:
-----------------------------------------------------------

Thanks for reporting the issue, Yongjun! Actually I do not think the current output is incorrect.
In general, if a directory is renamed and its children are also changed, the snapshot diff
information alone cannot tell which happened first: the rename of the directory or the changes
to its children. Currently it looks like we always use the old name of the rename (based on
absolute time) to report the children changes. I understand this is not convenient for
HDFS-9820, so maybe we can use this jira as an improvement.
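
To illustrate the ambiguity described above, here is a minimal sketch in plain Java. It assumes a deliberately simplified model in which a snapshot is just a set of paths; it is not how HDFS stores snapshots, and the class and method names (`RenameOrderAmbiguity`, `renameDir`, `delete`) are hypothetical. Starting from the s1 layout in the issue description, it applies "rename foo to foo2, then delete f1" and "delete f1, then rename foo to foo2" and shows that both orderings end in the same state, so the end states alone cannot reveal the order.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Simplified model (an assumption for illustration): a snapshot is a set of paths.
class RenameOrderAmbiguity {
    // Rename a directory by rewriting the prefix of the directory and everything below it.
    static Set<String> renameDir(Set<String> paths, String src, String dst) {
        Set<String> out = new TreeSet<>();
        for (String p : paths) {
            if (p.equals(src)) {
                out.add(dst);
            } else if (p.startsWith(src + "/")) {
                out.add(dst + p.substring(src.length()));
            } else {
                out.add(p);
            }
        }
        return out;
    }

    // Recursively delete a path and everything below it.
    static Set<String> delete(Set<String> paths, String target) {
        Set<String> out = new TreeSet<>(paths);
        out.removeIf(p -> p.equals(target) || p.startsWith(target + "/"));
        return out;
    }

    // The layout captured by snapshot s1 in the issue description.
    static Set<String> s1() {
        return new TreeSet<>(Arrays.asList(
            "/target/bar", "/target/bar/f1", "/target/foo", "/target/foo/f1"));
    }

    // Ordering A: rename the directory first, then change the child.
    static Set<String> renameThenDelete() {
        return delete(renameDir(s1(), "/target/foo", "/target/foo2"), "/target/foo2/f1");
    }

    // Ordering B: change the child first, then rename the directory.
    static Set<String> deleteThenRename() {
        return renameDir(delete(s1(), "/target/foo/f1"), "/target/foo", "/target/foo2");
    }

    public static void main(String[] args) {
        System.out.println(renameThenDelete());
        System.out.println(deleteThenRename());
        // Both orderings reach the same state.
        System.out.println(renameThenDelete().equals(deleteThenRename())); // prints "true"
    }
}
```

In this model the child deletion could be reported under either `foo` or `foo2`; which name to report under is a convention, which is the point of the comment above.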


was (Author: jingzhao):
Thanks for reporting the issue, Yongjun! Actually I'm not sure if the current output is incorrect
or not. In general, if there is a rename happening on a directory and there are also changes
on its children, only based on the snapshot diff information there is no way to distinguish
which happened first: the rename on the directory or the changes on the children. Currently
looks like we always use the old name of the rename (based on absolute time) to report the
children changes. I understand for HDFS-9820 this is not convenient. So maybe we can use this
jira as an improvement.

> Reversed snapshot diff report contains incorrect entries
> --------------------------------------------------------
>
>                 Key: HDFS-10263
>                 URL: https://issues.apache.org/jira/browse/HDFS-10263
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>
> Steps to reproduce:
> 1. Take a snapshot s1 at:
> {code}
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/bar
> -rw-r--r--   1 yzhang supergroup       1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/foo
> -rw-r--r--   1 yzhang supergroup       1024 2016-04-05 14:48 /target/foo/f1
> {code}
> 2. Make the following change:
> {code}
>   private int changeData7(Path dir) throws Exception {
>     final Path foo = new Path(dir, "foo");
>     final Path foo2 = new Path(dir, "foo2");
>     final Path foo_f1 = new Path(foo, "f1");
>     final Path foo2_f2 = new Path(foo2, "f2");
>     final Path foo2_f1 = new Path(foo2, "f1");
>     final Path foo_d1 = new Path(foo, "d1");
>     final Path foo_d1_f3 = new Path(foo_d1, "f3");
>     int numDeletedAndModified = 0;
>     dfs.rename(foo, foo2);
>     dfs.delete(foo2_f1, true);
>     
>     DFSTestUtil.createFile(dfs, foo_f1, BLOCK_SIZE, DATA_NUM, 0L);
>     DFSTestUtil.appendFile(dfs, foo_f1, (int) BLOCK_SIZE);
>     dfs.rename(foo_f1, foo2_f2);
>     numDeletedAndModified += 1; // "M ./foo"
>     DFSTestUtil.createFile(dfs, foo_d1_f3, BLOCK_SIZE, DATA_NUM, 0L);
>     return numDeletedAndModified;
>   }
> {code}
> that results in
> {code}
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/bar
> -rw-r--r--   1 yzhang supergroup       1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/foo
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/foo/d1
> -rw-r--r--   1 yzhang supergroup       1024 2016-04-05 14:48 /target/foo/d1/f3
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/foo2
> -rw-r--r--   1 yzhang supergroup       2048 2016-04-05 14:48 /target/foo2/f2
> {code}
> 3. Take snapshot s2 here.
> 4. Do the following to revert the change made in step 2:
> {code}
>   private int revertChangeData7(Path dir) throws Exception {
>     final Path foo = new Path(dir, "foo");
>     final Path foo2 = new Path(dir, "foo2");
>     final Path foo_f1 = new Path(foo, "f1");
>     final Path foo2_f2 = new Path(foo2, "f2");
>     final Path foo2_f1 = new Path(foo2, "f1");
>     final Path foo_d1 = new Path(foo, "d1");
>     final Path foo_d1_f3 = new Path(foo_d1, "f3");
>     int numDeletedAndModified = 0;
>     
>     dfs.delete(foo_d1, true);
>     dfs.rename(foo2_f2, foo_f1);
>     
>     dfs.delete(foo, true);
>     
>     DFSTestUtil.createFile(dfs, foo2_f1, BLOCK_SIZE, DATA_NUM, 0L);
>     DFSTestUtil.appendFile(dfs, foo2_f1, (int) BLOCK_SIZE);
>     dfs.rename(foo2,  foo);
>     
>     return numDeletedAndModified;
>   }
> {code}
> which gives the following result:
> {code}
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/bar
> -rw-r--r--   1 yzhang supergroup       1024 2016-04-05 14:48 /target/bar/f1
> drwxr-xr-x   - yzhang supergroup          0 2016-04-05 14:48 /target/foo
> -rw-r--r--   1 yzhang supergroup       2048 2016-04-05 14:48 /target/foo/f1
> {code}
> 5. Take snapshot s3 here.
> Below are the snapshot diff reports:
> {code}
> s1-s2: Difference between snapshot s1 and snapshot s2 under directory /target:
> M	.
> +	./foo
> R	./foo -> ./foo2
> M	./foo
> +	./foo/f2
> -	./foo/f1
> s2-s1: Difference between snapshot s2 and snapshot s1 under directory /target:
> M	.
> -	./foo
> R	./foo2 -> ./foo
> M	./foo
> -	./foo/f2
> +	./foo/f1
> s2-s3: Difference between snapshot s2 and snapshot s3 under directory /target:
> M	.
> -	./foo
> R	./foo2 -> ./foo
> M	./foo2
> +	./foo2/f1
> -	./foo2/f2
> s3-s2: Difference between snapshot s3 and snapshot s2 under directory /target:
> M	.
> +	./foo
> R	./foo -> ./foo2
> M	./foo2
> -	./foo2/f1
> +	./foo2/f2
> {code}
> The s2-s1 diff is supposed to be the same as the s2-s3 diff, because the change from s2 to
> s3 is an exact reversion of the change from s1 to s2. We can see that s1 and s3 have the same
> file structure.
> However, the reports shown above are not the same. I expect the following part
> {code}
> M	./foo
> -	./foo/f2
> +	./foo/f1
> {code}
> in the s2-s1 diff should be
> {code}
> M	./foo2
> +	./foo2/f1
> -	./foo2/f2
> {code}
> (same as in s2-s3)
> instead.
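
The comparison described in the quoted report can be checked mechanically. The following is a minimal sketch, assuming each report is treated as an unordered set of entry strings; the entries are copied from the s2-s1 and s2-s3 reports above (with the tab after the entry type replaced by a single space), and the class and helper names are hypothetical. It does not call any HDFS API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Compares two snapshot diff reports as unordered sets of entries.
class DiffReportCompare {
    // Entries of the s2-s1 report quoted above.
    static final Set<String> S2_S1 = entries(
        "M .", "- ./foo", "R ./foo2 -> ./foo", "M ./foo", "- ./foo/f2", "+ ./foo/f1");

    // Entries of the s2-s3 report quoted above.
    static final Set<String> S2_S3 = entries(
        "M .", "- ./foo", "R ./foo2 -> ./foo", "M ./foo2", "+ ./foo2/f1", "- ./foo2/f2");

    static Set<String> entries(String... lines) {
        return new LinkedHashSet<>(Arrays.asList(lines));
    }

    // Entries present in a but missing from b, in a's original order.
    static Set<String> onlyIn(Set<String> a, Set<String> b) {
        Set<String> out = new LinkedHashSet<>(a);
        out.removeAll(b);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(onlyIn(S2_S1, S2_S3)); // prints "[M ./foo, - ./foo/f2, + ./foo/f1]"
        System.out.println(onlyIn(S2_S3, S2_S1)); // prints "[M ./foo2, + ./foo2/f1, - ./foo2/f2]"
    }
}
```

The two reports agree on the root modification, the deletion of `./foo`, and the rename; they differ only in which directory name the child changes are reported under.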



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
