hadoop-hdfs-issues mailing list archives

From "Xiao Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-12682) ECAdmin -listPolicies will always show SystemErasureCodingPolicies state as DISABLED
Date Mon, 30 Oct 2017 21:46:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12682:
    Attachment: HDFS-12682.04.patch

Attaching patch 4 which is a hybrid solution between the approaches in patch 2 and patch 3.

To summarize the concerns, they're mostly:
# ECP should be immutable
# Compatibility
# Code easy to maintain / read

In patch 4, protobuf is untouched, and the Java class {{ErasureCodingPolicy}} is modified
to represent the immutable version of a policy. A new Java class {{ErasureCodingPolicyInfo}}
is added, containing the ECP and its state.

When serializing and deserializing the protobuf, we convert it to either an ECP or an ECPI,
depending on what the caller needs. The ECPs are still cached by {{SystemErasureCodingPolicies}} regardless.
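
To make the split concrete, here is a minimal sketch of the idea rather than the code in the patch. Every class name below ({{PolicySketch}}, {{PolicyInfoSketch}}, {{WirePolicySketch}}, {{ConvertSketch}}) is a hypothetical stand-in, and the wire message is a plain Java class instead of the protobuf-generated one. It only illustrates an immutable policy, a wrapper that carries the state, and a conversion that reuses cached policies while taking the state from the wire.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; not the classes from the patch.
enum StateSketch { DISABLED, ENABLED }

/** Immutable policy: final fields, no setters. */
final class PolicySketch {
  private final byte id;
  private final String name;

  PolicySketch(byte id, String name) {
    this.id = id;
    this.name = name;
  }

  byte getId() { return id; }
  String getName() { return name; }
}

/** Pairs an immutable policy with its state. */
final class PolicyInfoSketch {
  private final PolicySketch policy;
  private StateSketch state;

  PolicyInfoSketch(PolicySketch policy, StateSketch state) {
    this.policy = policy;
    this.state = state;
  }

  PolicySketch getPolicy() { return policy; }
  StateSketch getState() { return state; }
  void setState(StateSketch newState) { this.state = newState; }
}

/** Stand-in for the wire message (assumption: the real one is protobuf-generated). */
final class WirePolicySketch {
  final byte id;
  final String name;
  final StateSketch state;

  WirePolicySketch(byte id, String name, StateSketch state) {
    this.id = id;
    this.name = name;
    this.state = state;
  }
}

final class ConvertSketch {
  // Cache of immutable system policies; safe to share because it holds no state.
  private static final Map<Byte, PolicySketch> SYSTEM_POLICIES = new HashMap<>();
  static {
    SYSTEM_POLICIES.put((byte) 1, new PolicySketch((byte) 1, "RS-6-3-1024k"));
  }

  /** Callers that only need the policy get the cached immutable instance. */
  static PolicySketch toPolicy(WirePolicySketch wire) {
    PolicySketch cached = SYSTEM_POLICIES.get(wire.id);
    return cached != null ? cached : new PolicySketch(wire.id, wire.name);
  }

  /** Callers that need the state get a fresh wrapper built from the wire state. */
  static PolicyInfoSketch toPolicyInfo(WirePolicySketch wire) {
    return new PolicyInfoSketch(toPolicy(wire), wire.state);
  }
}

final class EcPolicySketchDemo {
  public static void main(String[] args) {
    WirePolicySketch wire =
        new WirePolicySketch((byte) 1, "RS-6-3-1024k", StateSketch.ENABLED);
    PolicyInfoSketch info = ConvertSketch.toPolicyInfo(wire);
    System.out.println(info.getPolicy().getName() + " " + info.getState());
  }
}
```

Because the cached object no longer carries a state, a cache hit cannot mask the state sent by the server.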

There is a slight incompatibility in the {{LimitedPrivate}} {{DistributedFileSystem}} and the {{Public}}
{{Evolving}} {{HdfsAdmin}}, but this feels like the best solution to me.

Also addressed Eddy's comment about making the ECP classes private, and Rakesh's comment about
removing the setters of id/name from ECP.

Regarding Kai's comment that {{we should unify all the EC policies}}, IIUC this can be a separate
improvement jira that basically changes {{SystemErasureCodingPolicies}} to be {{ErasureCodingPoliciesCache}}
or the like.

Thanks all for the reviews!

> ECAdmin -listPolicies will always show SystemErasureCodingPolicies state as DISABLED
> ------------------------------------------------------------------------------------
>                 Key: HDFS-12682
>                 URL: https://issues.apache.org/jira/browse/HDFS-12682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Blocker
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-12682.01.patch, HDFS-12682.02.patch, HDFS-12682.03.patch, HDFS-12682.04.patch
> On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as DISABLED.
> {noformat}
> [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=3, State=DISABLED]
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED]
> [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec
> XOR-2-1-1024k
> {noformat}
> This is because when [deserializing protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942], the static instance of [SystemErasureCodingPolicies class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101] is checked first, and it always returns the cached policy objects, which are created by default with state=DISABLED.
> All the existing unit tests pass, because the static instance that the client (e.g. ECAdmin) reads in unit tests is updated by the NN. :)
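
The failure mode in the description can be reproduced in miniature. The sketch below is not the Hadoop code: {{CachedPolicySketch}} and {{CacheFirstDecoderSketch}} are hypothetical stand-ins, the state is a plain string, and the cache is an instance field so that each decoder models one JVM (in the issue it is a static instance). It shows why a cache-first lookup prints DISABLED on a real cluster, yet passes when the client shares a JVM with the NN.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; simplified from the behaviour described in the issue.
final class CachedPolicySketch {
  final byte id;
  final String name;
  String state = "DISABLED"; // cached objects are created with state=DISABLED

  CachedPolicySketch(byte id, String name) {
    this.id = id;
    this.name = name;
  }
}

/** One decoder instance models the policy cache of one JVM. */
final class CacheFirstDecoderSketch {
  final Map<Byte, CachedPolicySketch> cache = new HashMap<>();

  CacheFirstDecoderSketch() {
    cache.put((byte) 4, new CachedPolicySketch((byte) 4, "XOR-2-1-1024k"));
  }

  CachedPolicySketch decode(byte id, String name, String wireState) {
    CachedPolicySketch cached = cache.get(id);
    if (cached != null) {
      return cached; // cache hit: wireState is never consulted
    }
    CachedPolicySketch fresh = new CachedPolicySketch(id, name);
    fresh.state = wireState;
    return fresh;
  }
}

final class ListPoliciesBugDemo {
  public static void main(String[] args) {
    // Real cluster: the client JVM has its own cache, untouched by the NN.
    CacheFirstDecoderSketch clientJvm = new CacheFirstDecoderSketch();
    System.out.println(
        clientJvm.decode((byte) 4, "XOR-2-1-1024k", "ENABLED").state); // DISABLED

    // Unit test: client and NN share one JVM, so the NN has already
    // flipped the state on the shared cached object.
    CacheFirstDecoderSketch sharedJvm = new CacheFirstDecoderSketch();
    sharedJvm.cache.get((byte) 4).state = "ENABLED"; // what the NN does
    System.out.println(
        sharedJvm.decode((byte) 4, "XOR-2-1-1024k", "ENABLED").state); // ENABLED
  }
}
```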
