hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode
Date Wed, 01 Feb 2017 01:28:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847858#comment-15847858 ]

Andrew Wang commented on HDFS-7859:

I re-read the history of this JIRA, and it seems we've debated a couple of times
whether it's useful to persist this information. The current patch persists only
user-added policies.

I think this is not very useful as is, and potentially dangerous. We let the user specify
any ECPolicy they want, without much field validation. This means the user could specify an
ID/name that we already use, or an ID/name we might want to hardcode later. Even with validation,
this makes upgrade difficult.
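
To make the collision concern concrete, here is a minimal sketch of the kind of field validation
that is missing. Everything in it is hypothetical: the class name, the placeholder built-in names,
and the reserved-ID cutoff are assumptions for illustration, not the patch's code or real HDFS values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical validator for user-added EC policies; not HDFS code. */
class EcPolicyValidator {
    // Placeholder names standing in for the hardcoded policies (assumption).
    static final Set<String> BUILT_IN_NAMES =
        new HashSet<>(Arrays.asList("BUILTIN-A", "BUILTIN-B"));

    // Assumption: IDs below this value are reserved for current and future built-ins.
    static final int FIRST_USER_ID = 64;

    /** Returns an error message, or empty if the ID and name are acceptable. */
    static Optional<String> validate(int id, String name) {
        if (id < FIRST_USER_ID) {
            return Optional.of("ID " + id + " is reserved for built-in policies");
        }
        if (name == null || name.trim().isEmpty()) {
            return Optional.of("policy name must not be empty");
        }
        if (BUILT_IN_NAMES.contains(name)) {
            return Optional.of("name " + name + " is already used by a built-in policy");
        }
        return Optional.empty();
    }
}
```

A reserved ID range can protect IDs we hardcode later, but a check like this cannot know which
names a future release will want, which is why upgrade stays difficult even with validation.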

Given that we haven't finished the pluggable EC policy work, we also don't know what fields
might be required to fully specify an EC policy. This patch does let the user configure different
parameters for Reed Solomon, but we already provide what we think is a good set of hardcoded
policies to choose from.

IMO where some persistence would be useful is for HDFS-11314. We'd like to restrict the set
of EC policies that can be used on a cluster, since fault tolerance depends on the number of
nodes and racks. This would narrow the set of hardcoded policies, though, rather than add
new ones.
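
As a rough sketch of what restricting by topology could look like: the class, the policy
list, and the eligibility rule below are all assumptions for illustration, not what HDFS-11314
specifies. The rule assumed here is that a policy needs at least (data + parity) nodes, and that
losing any one rack must not remove more blocks of a group than there are parity units.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter over a fixed policy list; not HDFS code. */
class EcPolicyFilter {
    static final class Policy {
        final String name;
        final int dataUnits;
        final int parityUnits;

        Policy(String name, int dataUnits, int parityUnits) {
            this.name = name;
            this.dataUnits = dataUnits;
            this.parityUnits = parityUnits;
        }
    }

    /** Assumed rule: enough nodes for one block each, and one rack loss is survivable. */
    static boolean fits(Policy p, int numDataNodes, int numRacks) {
        int width = p.dataUnits + p.parityUnits;
        if (numRacks < 1 || numDataNodes < width) {
            return false;
        }
        // With blocks spread evenly, the fullest rack holds ceil(width / numRacks) of them.
        int maxPerRack = (width + numRacks - 1) / numRacks;
        return maxPerRack <= p.parityUnits;
    }

    /** Returns the names of the policies that fit, in their original order. */
    static List<String> allowed(List<Policy> hardcoded, int numDataNodes, int numRacks) {
        List<String> names = new ArrayList<>();
        for (Policy p : hardcoded) {
            if (fits(p, numDataNodes, numRacks)) {
                names.add(p.name);
            }
        }
        return names;
    }
}
```

Under this rule a 6+3 Reed Solomon layout would qualify on 9 nodes across 3 racks but not on
9 nodes across 2 racks. What would need persisting is only the resulting list of enabled names.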

[~drankye], [~zhz], thoughts on this?

> Erasure Coding: Persist erasure coding policies in NameNode
> -----------------------------------------------------------
>                 Key: HDFS-7859
>                 URL: https://issues.apache.org/jira/browse/HDFS-7859
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Xinwei Qin 
>            Priority: Blocker
>              Labels: BB2015-05-TBR, hdfs-ec-3.0-must-do
>         Attachments: HDFS-7859.001.patch, HDFS-7859.002.patch, HDFS-7859.004.patch, HDFS-7859.005.patch,
> HDFS-7859.006.patch, HDFS-7859.007.patch, HDFS-7859.008.patch, HDFS-7859.009.patch, HDFS-7859-HDFS-7285.002.patch,
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas
> in NameNode centrally and reliably, so that EC zones can reference them by name efficiently.

