hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lei (Eddy) Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode
Date Thu, 07 Sep 2017 18:00:09 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157346#comment-16157346
] 

Lei (Eddy) Xu commented on HDFS-7859:
-------------------------------------

Hi, [~Sammi]

Thanks for the newly updated patch. It is much more concise. 

Some comments:

{code}
message ErasureCodingPolicyManagerSection {
// number of erasure coding policies, include built-in and user defined
required uint32 numPolicies = 1;
// repeated ErasureCodingPolicyProto policies;
}
{code}

Why dont we just use:

{code}
message ErasureCodingPolicyManagerSection {
   repeated ErasureCodingPolicyProto policies = 1;
}
{code}

The total number of policies should be a very limited number (i.e., 64?). Putting them into
one single protobuf message should be sufficient.


{code}
// dd new erasure coding policy
ECSchema newSchema = new ECSchema("rs", 5, 3);
{code}

The comment above is not clear to me. Could you clarify a little bit?

{code}
MiniDFSCluster cluster = null;
try {
...
} finally {
if (cluster != null) {
  cluster.shutdown();
}
}
{code}

{{MiniDFSCluster}} can be used in {{try-resource}} now. 


In test {{testChangeErasureCodingPolicyState()}}

{code}
 // 3. remove an erasure coding policy
+    try {
+      fs.removeErasureCodingPolicy(ecPolicy.getName());
+      targetPolicy.setState(ErasureCodingPolicyState.REMOVED);
+      // Save namespace and restart NameNode
+      fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
+      fs.saveNamespace();
+      fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
+
+      cluster.restartNameNodes();
+      cluster.waitActive();
+      ecPolicy = ErasureCodingPolicyManager.getInstance().getByID(
+          targetPolicy.getId());
+      assertEquals("The erasure coding policy is not found",
+          targetPolicy, ecPolicy);
+      assertEquals("The erasure coding policy should be of removed state",
+          ErasureCodingPolicyState.REMOVED, ecPolicy.getState());
+    } catch (RemoteException e) {
+      assertTrue("Built-in policy cannot be removed",
+          ecPolicy.isSystemPolicy());
+      assertExceptionContains("System erasure coding policy", e);
+    }
{code}

The {{assertEquals()}} at the end of this section are not happening if the {{RemoteException}}
is thrown? or vise versa.  Do these tests work as expected?

{code}
+    /*
+     * TODO check if there is any file or directory using this policy.
+     * If no, remove this erasure coding policy from system permanently.
+     */
{code}

Checking file / directory that is using this particular policy is a potentially {{O\(n\)}}
operation, where {{n = # of inodes}}.  I feel that it is OK to leave it in fsimage as garbage
for now. In the future, we can let the fsimage loading process to handling this garbage, as
it is {{O\(n\)}} anyway.

Regarding to the policy ID design, are there general rules for customize EC policy design?
My question is, what is the ID value range can be chosen for a customized policy. Currently
the system EC policies use values up to 5.  If a customer / vender provides a new EC policy
with ID=6,  when the next version of Hadoop adding a new EC policy, how do we handle the conflicts
(i.e, ID=6 has been used), in fsimage and INode.

Question to [~drankye]:
bq. I thought "dfs.namenode.ec.policies.enabled" should have been removed when adding the
API to enable/disable policy. 

Could this happen before BETA 1? it seems to be a breaking change. If not , do we have a plan
to preserve both this key and the capability of adding/removing policies?





> Erasure Coding: Persist erasure coding policies in NameNode
> -----------------------------------------------------------
>
>                 Key: HDFS-7859
>                 URL: https://issues.apache.org/jira/browse/HDFS-7859
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: SammiChen
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-7859.001.patch, HDFS-7859.002.patch, HDFS-7859.004.patch, HDFS-7859.005.patch,
HDFS-7859.006.patch, HDFS-7859.007.patch, HDFS-7859.008.patch, HDFS-7859.009.patch, HDFS-7859.010.patch,
HDFS-7859.011.patch, HDFS-7859.012.patch, HDFS-7859.013.patch, HDFS-7859.014.patch, HDFS-7859.015.patch,
HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we persist EC schemas
in NameNode centrally and reliably, so that EC zones can reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message