hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12918) EC Policy defaults incorrectly to enabled in protobufs
Date Tue, 12 Dec 2017 22:22:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288378#comment-16288378
] 

Manoj Govindassamy commented on HDFS-12918:
-------------------------------------------

[~zamsden],
  There is an addendum patch HDFS-12682 after HDFS-12258 to make the policy immutable by pulling
the EC state to {{ErasureCodingPolicyInfo}}. As you pointed out {{hdfs.proto}} default value
looks wrong to me as well. But, in the PBHelperClient code there is an explicit handling for
this, both while saving the ECPolicy and while retrieving.  So, ECPI saved and retrieved from
FSImages should be right. 

{{PBHelperClient}}
{noformat}
  /**
   * Convert the protobuf to a {@link ErasureCodingPolicyInfo}. This should only
   * be needed when the caller is interested in the state of the policy.
   */
  public static ErasureCodingPolicyInfo convertErasureCodingPolicyInfo(
      ErasureCodingPolicyProto proto) {
    ErasureCodingPolicy policy = convertErasureCodingPolicy(proto);
    ErasureCodingPolicyInfo info = new ErasureCodingPolicyInfo(policy);
    Preconditions.checkArgument(proto.hasState(),
        "Missing state field in ErasureCodingPolicy proto");
    info.setState(convertECState(proto.getState()));  <====
    return info;
  }

  /**
   * Convert a {@link ErasureCodingPolicyInfo} to protobuf.
   * The protobuf will have the policy, and state. State is relevant when:
   * 1. Persisting a policy to fsimage
   * 2. Returning the policy to the RPC call
   * {@link DistributedFileSystem#getAllErasureCodingPolicies()}
   */
  public static ErasureCodingPolicyProto convertErasureCodingPolicy(
      ErasureCodingPolicyInfo info) {
    final ErasureCodingPolicyProto.Builder builder =
        createECPolicyProtoBuilder(info.getPolicy());
    builder.setState(convertECState(info.getState()));  <===
    return builder.build();
  }

{noformat}

Listing Policies:
{noformat}
$ hdfs ec -listPolicies
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]],
CellSize=1048576, Id=5], State=DISABLED
ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]],
CellSize=1048576, Id=2], State=ENABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]],
CellSize=1048576, Id=1], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6,
numParityUnits=3]], CellSize=1048576, Id=3], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]],
CellSize=1048576, Id=4], State=DISABLED
{noformat}

But, there is another version of {{convertErasureCodingPolicy}} which takes in only {{ErasureCodingPolicy}}
where the state is missing and the default  state from {{ErasureCodingPolicyProto}} will be
used.

{noformat}
  /**
   * Convert a {@link ErasureCodingPolicy} to protobuf.
   * This means no state of the policy will be set on the protobuf.
   */
  public static ErasureCodingPolicyProto convertErasureCodingPolicy(
      ErasureCodingPolicy policy) {
    return createECPolicyProtoBuilder(policy).build();
  }
{noformat}

Probably you are seeing the default value of the EC state from the callers (like ListStatus,
BlockRecovery, BlockGroupChecksum etc.,) of the above convert util. Can you please confirm
where you are seeing the inconsistent EC state? 

> EC Policy defaults incorrectly to enabled in protobufs
> ------------------------------------------------------
>
>                 Key: HDFS-12918
>                 URL: https://issues.apache.org/jira/browse/HDFS-12918
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Zach Amsden
>            Assignee: Manoj Govindassamy
>            Priority: Critical
>
> According to documentation and code comments, the default setting for erasure coding
policy is disabled:
> /** Policy is disabled. It's policy default state. */
>  DISABLED(1),
> However, HDFS-12258 appears to have incorrectly set the policy state in the protobuf
to enabled:
> {code:java}
>  message ErasureCodingPolicyProto {
>     ooptional string name = 1;
>     optional ECSchemaProto schema = 2;
>     optional uint32 cellSize = 3;
>     required uint32 id = 4; // Actually a byte - only 8 bits used
>  + optional ErasureCodingPolicyState state = 5 [default = ENABLED];
>   }
> {code}
> This means the parameter can't actually be optional, it must always be included, and
existing serialized data without this optional field will be incorrectly interpreted as having
erasure coding enabled.
> This unnecessarily breaks compatibility and will require existing HDFS installations
that store metadata in protobufs to require reformatting.
> It looks like a simple mistake that was overlooked in code review.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message