hadoop-hdfs-issues mailing list archives

From "Yiqun Lin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11482) Add storage type demand to into DFSNetworkTopology#chooseRandom
Date Wed, 08 Mar 2017 12:33:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901181#comment-15901181 ]

Yiqun Lin edited comment on HDFS-11482 at 3/8/17 12:33 PM:
-----------------------------------------------------------

Thanks [~vagarychen] for working on this!
I took some time to review your patch. Some minor comments from me:

1. I am a little confused by the following logic:
{code}
+    int availableCount = root.getSubtreeStorageCount(type);
+    if (excludeRoot != null && root.isAncestor(excludeRoot)) {
+      if (excludeRoot instanceof DFSTopologyNodeImpl) {
+        availableCount -= ((DFSTopologyNodeImpl)excludeRoot)
+            .getSubtreeStorageCount(type);
+      } else {
+        availableCount -= ((DatanodeDescriptor)excludeRoot)
+            .hasStorageType(type) ? 1 : 0;
+      }
{code}
If {{excludeRoot}} is a DatanodeDescriptor and it has more than one storage of the given type, is it right to just subtract {{1}}?
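If the count is meant to be per storage rather than per datanode, a rough sketch of what I mean (the helper name {{countStorages}} is only illustrative, not part of the patch):
{code}
  /** Count how many storages of the given type this datanode has. */
  private static int countStorages(DatanodeDescriptor dn, StorageType type) {
    int count = 0;
    for (DatanodeStorageInfo storage : dn.getStorageInfos()) {
      if (storage.getStorageType() == type) {
        count++;
      }
    }
    return count;
  }
{code}
Then the subtraction above could become {{availableCount -= countStorages((DatanodeDescriptor) excludeRoot, type);}}. If {{getSubtreeStorageCount}} actually counts datanodes rather than storages, please ignore this and maybe just clarify it in a comment.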
2. Maybe we should use the safer check {{availableCount <= 0}} to judge whether there are any available storages, even though in the normal case the count should never go below 0.
{code}
+      }
+    }
+    if (availableCount == 0) {
+      return null;
+    }
{code}
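In other words, just a small defensive change:
{code}
    // defensive check: treat any non-positive count as "nothing available"
    if (availableCount <= 0) {
      return null;
    }
{code}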
3. There seems to be a typo: {{the satisfies the requirement}} should be {{that satisfies the requirement}}.
{code}
+    // to this point, it is guaranteed that there is at least one node
+    // the satisfies the requirement, keep trying until we found one.  <===== one typo
+    Node chosen;
+    do {
+      chosen = chooseRandomWithStorageTypeAndExcludeRoot(root, excludeRoot,
+          type);
+      if (excludedNodes == null || !excludedNodes.contains(chosen)) {
+        break;
+      } else {
+        LOG.info("Node {} is excluded, continuing.", chosen);   <========= should also use LOG.debug
+      }
+    } while (true);
+    LOG.debug("chooseRandom returning {}", chosen);   
+    return chosen;
{code}
4. In the code quoted above, {{LOG.info("Node {}...}} should also be at {{LOG.debug}} level.
5. It would be better to reuse a single {{Random}} instance in the class instead of creating a new one on each call; see the sketch after the quoted code below.
{code}
+  private Node chooseRandomWithStorageTypeAndExcludeRoot(
+      DFSTopologyNodeImpl root, Node excludeRoot, StorageType type) {
+    Random r = new Random();    <==== reuse instance in class
+    Node chosenNode;
+    if (root.isRack()) {                    <===== will be null ?
+      // children are datanode descriptor
+      ArrayList<Node> candidates = new ArrayList<>();
+      for (Node node : root.getChildren()) {
+        if (node.equals(excludeRoot)) {
{code}
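For example (just a sketch, either option should work): make the {{Random}} a field of the class, or use {{ThreadLocalRandom}} so no instance needs to be managed at all:
{code}
  // option 1: a single Random shared by the class
  private final Random random = new Random();

  // option 2: no field needed, pick the index via ThreadLocalRandom
  chosenNode = candidates.get(
      ThreadLocalRandom.current().nextInt(candidates.size()));
{code}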
6. Can {{root}} be null in the patch code quoted above? Why not add a check to ensure {{root}} isn't null?
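A simple guard at the top of the method would make the assumption explicit (assuming Guava's {{Preconditions}} is available here; a plain null check works just as well):
{code}
    // fail fast instead of hitting a NullPointerException later in root.isRack()
    Preconditions.checkNotNull(root, "root node cannot be null");
{code}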
7. The method {{chooseRandomWithStorageTypeAndExcludeRoot}} seems too long and not easy to read. I suggest separating some of the logic into other methods, like the following (see also the sketch of an extracted helper after the quoted code):
{code}
private Node chooseRandomWithStorageTypeAndExcludeRoot(
+      DFSTopologyNodeImpl root, Node excludeRoot, StorageType type) {
+    Random r = new Random();
+    Node chosenNode;
+    if (root.isRack()) {
+      // children are datanode descriptor
+      ArrayList<Node> candidates = new ArrayList<>();
+      for (Node node : root.getChildren()) {
+        if (node.equals(excludeRoot)) {
+          continue;
+        }
+        DatanodeDescriptor dnDescriptor = (DatanodeDescriptor)node;
+        if (dnDescriptor.hasStorageType(type)) {
+          candidates.add(node);
+        }
+      }
+      if (candidates.size() == 0) {
+        return null;
+      }
+      // to this point, all nodes in candidates are valid choices, and they are
+      // all datanodes, pick a random one.
+      chosenNode = candidates.get(r.nextInt(candidates.size()));
+    } else {
+      // the children are inner nodes
+      ...    <=== here we can use a new method for the case that children are inner nodes
+    }
+    return chosenNode;
+  }
{code}
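Just to illustrate the shape of the refactoring, the inner-node branch could be extracted into a helper like the one below. The name and body are only a sketch (it ignores the weighting by subtree storage counts and an {{excludeRoot}} deeper than a direct child, which the real patch has to handle):
{code}
  // hypothetical helper for the "children are inner nodes" case
  private Node chooseRandomFromInnerNode(
      DFSTopologyNodeImpl root, Node excludeRoot, StorageType type) {
    ArrayList<DFSTopologyNodeImpl> candidates = new ArrayList<>();
    for (Node node : root.getChildren()) {
      if (node.equals(excludeRoot)) {
        continue;
      }
      DFSTopologyNodeImpl child = (DFSTopologyNodeImpl) node;
      // only descend into subtrees that still have the requested storage type
      if (child.getSubtreeStorageCount(type) > 0) {
        candidates.add(child);
      }
    }
    if (candidates.isEmpty()) {
      return null;
    }
    DFSTopologyNodeImpl chosen =
        candidates.get(ThreadLocalRandom.current().nextInt(candidates.size()));
    // recurse; once we reach a rack a datanode will be picked
    return chooseRandomWithStorageTypeAndExcludeRoot(chosen, excludeRoot, type);
  }
{code}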
I haven't looked into the tests. I will take a look at the next patch if I have time.



> Add storage type demand to into DFSNetworkTopology#chooseRandom
> ---------------------------------------------------------------
>
>                 Key: HDFS-11482
>                 URL: https://issues.apache.org/jira/browse/HDFS-11482
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-11482.001.patch, HDFS-11482.002.patch, HDFS-11482.003.patch, HDFS-11482.004.patch
>
>
> HDFS-11450 adds storage type info into the network topology. With this info, we may change
> chooseRandom to take a storage type requirement as a parameter, checking only the subtrees
> that have the required storage type available. This way we avoid blindly picking nodes that
> are not applicable.



