hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode
Date Tue, 14 Feb 2017 22:47:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864318#comment-15864318 ]

Andrew Wang edited comment on HDFS-7859 at 2/14/17 10:47 PM:
-------------------------------------------------------------

I thought about this JIRA some more, and had two questions I wanted to bring up for discussion:

h3. Do we need a system default EC policy?

AFAICT, the system default policy dates from when we only supported a single policy for HDFS.
Now, we've pretty clearly defined the API for EC policies, and for most uses, the EC policy
is automatically inherited from a dir-level policy. The {{setErasureCodingPolicy}} API already
requires an EC policy to be specified, so I think the default EC policy is basically vestigial
and can be removed.
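
To illustrate the inheritance argument, here is a minimal sketch of dir-level policy resolution. It is a toy model, not HDFS code: the {{EcPolicyResolver}} class, its method names, and any policy names passed to it are hypothetical. The point it shows is that when setting a policy always names one explicitly, a path with no policy on any ancestor can simply resolve to "no EC policy" without needing a system default.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of directory-level EC policy inheritance; hypothetical, not HDFS code. */
final class EcPolicyResolver {
  // Explicitly set policies, keyed by absolute directory path.
  private final Map<String, String> dirPolicies = new HashMap<>();

  /** Mirrors the idea that setting a policy always names one explicitly. */
  void setPolicy(String dir, String policyName) {
    if (policyName == null || policyName.isEmpty()) {
      throw new IllegalArgumentException("an EC policy name is required");
    }
    dirPolicies.put(dir, policyName);
  }

  /** Walks from the path up to the root; the nearest ancestor with a policy wins. */
  Optional<String> resolve(String path) {
    String current = path;
    while (true) {
      String policy = dirPolicies.get(current);
      if (policy != null) {
        return Optional.of(policy);
      }
      if (current.equals("/")) {
        // Reached the root without finding a policy: no default is substituted.
        return Optional.empty();
      }
      int slash = current.lastIndexOf('/');
      current = (slash <= 0) ? "/" : current.substring(0, slash);
    }
  }
}
```

In this sketch, an empty result stands for a path with no EC policy on it or any ancestor; nothing in the lookup falls back to a default.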

h3. Can we use configuration instead of persistence for the set of enabled policies?

I'm wondering if there is actually any benefit to persisting the set of allowed policies.
In the past, we've enabled and disabled features via configuration keys, and this is basically
the same idea. There's no danger of data corruption from two NNs having different sets of
enabled policies, so it's safe in that sense. IMO we could add a key like {{dfs.namenode.erasure.coding.policies.enabled}}
and list the enabled policies there, chosen from the set of hardcoded policies.
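
As a rough sketch of how such a key might be consumed, the snippet below parses a configured value and checks it against the hardcoded list. Several things here are assumptions rather than anything decided in this JIRA: the comma-separated value format, the {{EnabledEcPolicies}} class, and the policy names in {{HARDCODED}}, which are placeholders and not the real HDFS policy names. It takes the raw string directly so that it runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: resolves the enabled EC policies from a config value. */
final class EnabledEcPolicies {
  // Key name as proposed in the comment above.
  static final String ENABLED_KEY = "dfs.namenode.erasure.coding.policies.enabled";

  // Placeholder names standing in for the hardcoded policies (assumed, not real).
  static final List<String> HARDCODED = Arrays.asList("RS-6-3", "RS-3-2", "XOR-2-1");

  /** Parses an assumed comma-separated value; rejects names not in the hardcoded list. */
  static Set<String> parse(String configValue) {
    Set<String> enabled = new LinkedHashSet<>();
    if (configValue == null) {
      return enabled; // key unset: nothing enabled
    }
    for (String raw : configValue.split(",")) {
      String name = raw.trim();
      if (name.isEmpty()) {
        continue; // tolerate stray commas and whitespace
      }
      if (!HARDCODED.contains(name)) {
        throw new IllegalArgumentException(
            "Unknown EC policy '" + name + "' in " + ENABLED_KEY);
      }
      enabled.add(name);
    }
    return enabled;
  }
}
```

Failing fast on an unknown name is one possible design choice; it would surface a typo in the key at NameNode startup rather than when a policy is first set.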

If the above sounds good, I can file a new JIRA for refactoring out the system default policies,
and do the configuration key over on HDFS-11314.



> Erasure Coding: Persist erasure coding policies in NameNode
> -----------------------------------------------------------
>
>                 Key: HDFS-7859
>                 URL: https://issues.apache.org/jira/browse/HDFS-7859
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Andrew Wang
>            Priority: Blocker
>              Labels: BB2015-05-TBR, hdfs-ec-3.0-must-do
>         Attachments: HDFS-7859.001.patch, HDFS-7859.002.patch, HDFS-7859.004.patch, HDFS-7859.005.patch, HDFS-7859.006.patch, HDFS-7859.007.patch, HDFS-7859.008.patch, HDFS-7859.009.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas in NameNode centrally and reliably, so that EC zones can reference them by name efficiently.



