hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "SammiChen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
Date Thu, 19 Oct 2017 09:40:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210790#comment-16210790

SammiChen commented on HDFS-12682:

Hi [~xiaochen],  thanks for reporting this issue. Inspired by your discovery, I found the
same issue exists in system EC persist into and load from fsImage (HFDS-12686).  The current
convertErasureCodingPolicy function is perfect in most cases. For special cases, like get
all erasure coding policy and persist policy into fsImage, I think we need a new edition for
full convert. 

The problem I see from HDFS-12258's implementation though, is the mutable ECPS is saved on
the immutable ECP, breaking assumptions such as shared single instance policy. At the same
time the policy is still not persisted independently. I think ECPS is highly dependent on
the missing piece from HDFS-7337: policies are not persisted to NN metadata. The state of
whether a policy is enabled could be persisted together with the policy, without impacting
Persist ec policies is implemented in HDFS-7337. 

I think this bug (HDFS-12682) and HDFS-12258 would make more sense if we could first persist
policies to NN metadata. Would also be helpful to separate out something like ErasureCodingPolicyAndState
for the policy-specific APIs, so the state isn't deserialized onto HDFSFileStatus.
For HDFS-12258, [~zhouwei],  [~drankye] and I, we discussed and do have two different approaches
when we first think about how to implement it. One is the current implemented approach, which
add one extra "state" field in the existing ECP definition. Another is define a new class,
something like {{ErasureCodingPolicyWithState}} to hold the EPC and new policy state field.
 They are almost equally good.  The only concern is if we introduce the new {{ErasureCodingPolicyWithState}},
it may introduce complexity to API interfaces, and to end users. There are multiple EC related
APIs.  If we return  {{ErasureCodingPolicyWithState}} for {{getAllErasureCodingPolicies}}
, should we return {{ErasureCodingPolicyWithState}} or {{ErasureCodingPolicy}} for {{getErasureCodingPolicy}}?
something like that. Also is it worth to introduce a new class definition in Hadoop which
only has 1 extra new field?   After all the considerations, the current approach is chosen
to leverage the existing ECP. 

Please let me know if you have other concerns.  Thanks!

> ECAdmin -listPolicies will always show policy state as DISABLED
> ---------------------------------------------------------------
>                 Key: HDFS-12682
>                 URL: https://issues.apache.org/jira/browse/HDFS-12682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>              Labels: hdfs-ec-3.0-must-do
> On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as DISABLED.
> {noformat}
> [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10,
numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]],
CellSize=1048576, Id=2, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]],
CellSize=1048576, Id=1, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6,
numParityUnits=3]], CellSize=1048576, Id=3, State=DISABLED]
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2,
numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED]
> [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec
> XOR-2-1-1024k
> {noformat}
> This is because when [deserializing protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942],
the static instance of [SystemErasureCodingPolicies class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101]
is first checked, and always returns the cached policy objects, which are created by default
with state=DISABLED.
> All the existing unit tests pass, because that static instance that the client (e.g.
ECAdmin) reads in unit test is updated by NN. :)

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

View raw message