hadoop-hdfs-issues mailing list archives

From "SammiChen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11072) Add ability to unset and change directory EC policy
Date Wed, 07 Dec 2016 12:12:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15728618#comment-15728618
] 

SammiChen commented on HDFS-11072:
----------------------------------

Andrew, thanks very much for taking the time to review the patch!

bq. Can we just say "replication" rather than "continuous replicate"? e.g. "getReplicationPolicy"
instead of "getContinuousReplicatePolicy"

I chose "continuous replicate" because I expected a combination of "replication" plus "erasure
coding" in the planned phase 2 of erasure coding. So I used "continuous replicate" to distinguish
it from a future "erasure coding replicate". Does that make sense?

bq. Note that setting a "replication" EC policy is still different from unsetting. Unsetting
means the policy will be inherited from an ancestor. Setting a "replication" policy means
the "replication" policy will be used. Imagine a situation where "/a" has RS 6,3
set and "/a/b" has XOR 2,1 set. On "/a/b", unsetting vs. setting "replication" will have different
effects. So we also need an unset API, similar to the unset storage policy API.

I agree with you, and the implementation matches your description. I will add a new unset
API.
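
The difference between unsetting a policy and setting a "replication" policy can be sketched
as below. This is only a minimal illustration and not the HDFS implementation: the class
EcPolicyResolution, its methods, the policy name strings, and the assumption that the root
falls back to "replication" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not HDFS code: models per-directory policy lookup with inheritance.
public class EcPolicyResolution {
    // Only explicitly set policies are stored; a missing entry means "unset".
    private final Map<String, String> explicit = new HashMap<>();

    public void setPolicy(String dir, String policy) {
        explicit.put(dir, policy);
    }

    public void unsetPolicy(String dir) {
        explicit.remove(dir);
    }

    // Walk from the directory up to the root and return the first explicit policy found.
    public String effectivePolicy(String dir) {
        String cur = dir;
        while (true) {
            String policy = explicit.get(cur);
            if (policy != null) {
                return policy;
            }
            if (cur.equals("/")) {
                return "replication"; // assumed default when nothing is set anywhere
            }
            int idx = cur.lastIndexOf('/');
            cur = (idx <= 0) ? "/" : cur.substring(0, idx);
        }
    }

    public static void main(String[] args) {
        EcPolicyResolution ns = new EcPolicyResolution();
        ns.setPolicy("/a", "RS-6-3");
        ns.setPolicy("/a/b", "XOR-2-1");

        // Unsetting "/a/b" makes it inherit the policy of "/a".
        ns.unsetPolicy("/a/b");
        System.out.println(ns.effectivePolicy("/a/b"));

        // Setting "replication" on "/a/b" overrides the ancestor instead.
        ns.setPolicy("/a/b", "replication");
        System.out.println(ns.effectivePolicy("/a/b"));
    }
}
```

In this sketch the first lookup resolves to the policy set on "/a", while the second resolves
to "replication", which is why a separate unset operation is needed.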

bq. Do the parameters "1-2-64K" have any meaning? If not, we should explain that they are
meaningless, or hide the parameters so we don't need to talk about them.

"1-2-64K" is auto-generated from the schema when the replicate policy is defined. The values
are meaningless. At first, I used "null" as the schema to define the policy, but then I found
there is a checker that requires the schema to be non-null. Then I used the schema (0-0-0),
which breaks other checkers. I think we would like to keep these checkers to catch mistakes
in real EC policies, so in the end I chose "1-2-64k", which means 1 data block and 2 parity
blocks, roughly matching the default 3-replication case. As Rakesh has suggested, adding a new
unset API and a new unset policy sub command in "erasurecode" makes the replicate policy
internal, so users will not see the policy unless they read the source code.
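
To illustrate why a placeholder such as "1-2-64k" passes where "null" or (0-0-0) does not,
here is a rough sketch. It is not the actual HDFS schema class: PlaceholderSchema, its
fields, the specific checks, and the name format are assumptions made for illustration only.

```java
// Hypothetical sketch, not HDFS code: a schema with the kind of sanity checks
// that reject a (0-0-0) placeholder but accept (1-2-64k).
public class PlaceholderSchema {
    private final int dataUnits;
    private final int parityUnits;
    private final int cellSizeBytes;

    public PlaceholderSchema(int dataUnits, int parityUnits, int cellSizeBytes) {
        // Assumed checks: every value must be positive, as for a real EC policy.
        if (dataUnits <= 0 || parityUnits <= 0 || cellSizeBytes <= 0) {
            throw new IllegalArgumentException("schema values must be positive");
        }
        this.dataUnits = dataUnits;
        this.parityUnits = parityUnits;
        this.cellSizeBytes = cellSizeBytes;
    }

    // Name derived from the schema values (assumed format: data-parity-cellsize in KiB).
    public String name() {
        return dataUnits + "-" + parityUnits + "-" + (cellSizeBytes / 1024) + "k";
    }

    public static void main(String[] args) {
        // 1 data block plus 2 parity blocks, mirroring 3 copies under default replication.
        PlaceholderSchema schema = new PlaceholderSchema(1, 2, 64 * 1024);
        System.out.println(schema.name());

        try {
            new PlaceholderSchema(0, 0, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Keeping the checks strict and choosing values that satisfy them means real EC policies are
still validated, at the cost of a placeholder whose numbers carry no meaning.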

I will take care of all other comments in the new patch. 



 


> Add ability to unset and change directory EC policy
> ---------------------------------------------------
>
>                 Key: HDFS-11072
>                 URL: https://issues.apache.org/jira/browse/HDFS-11072
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Andrew Wang
>            Assignee: SammiChen
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-11072-v1.patch, HDFS-11072-v2.patch, HDFS-11072-v3.patch, HDFS-11072-v4.patch
>
>
> Since the directory-level EC policy simply applies to files at create time, it makes
> sense to make it more similar to storage policies and allow changing and unsetting the policy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

