hadoop-hdfs-dev mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance
Date Thu, 24 Mar 2016 17:05:25 GMT
Ming Ma created HDFS-10206:
------------------------------

             Summary: getBlockLocations might not sort datanodes properly by distance
                 Key: HDFS-10206
                 URL: https://issues.apache.org/jira/browse/HDFS-10206
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Ming Ma


If the DFSClient machine is not a datanode but shares a rack with some of the datanodes
holding the requested HDFS block, {{DatanodeManager#sortLocatedBlocks}} might not put the
local-rack datanodes at the beginning of the sorted list. This is because the function does
not call {{networktopology.add(client);}} to set the client node's parent node, which
{{networktopology.sortByDistance}} requires in order to compute the distance between two
nodes in the same topology tree.
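
To illustrate the first problem, here is a minimal, self-contained toy model. It is not the Hadoop code: the classes {{ParentLinkSketch}}, {{Node}} and {{Topology}} are hypothetical, and the sketch assumes that rack membership is decided by comparing parent references. Under that assumption, a client that was never added to the tree has no parent and so never matches the rack of any datanode.

```java
import java.util.HashMap;
import java.util.Map;

public class ParentLinkSketch {

    /** Hypothetical node: the parent stays null until the node is added to a topology. */
    static class Node {
        final String name;
        final String rackPath;
        Node parent;

        Node(String name, String rackPath) {
            this.name = name;
            this.rackPath = rackPath;
        }
    }

    /** Hypothetical two-level topology: rack nodes with hosts directly below them. */
    static class Topology {
        private final Map<String, Node> racks = new HashMap<>();

        /** Links the node to the (shared) rack node for its rack path. */
        void add(Node n) {
            n.parent = racks.computeIfAbsent(n.rackPath, p -> new Node(p, null));
        }

        /** Unlinks the node again, e.g. after a temporary add for sorting. */
        void remove(Node n) {
            n.parent = null;
        }

        /** Same rack only if both nodes point at the same parent object. */
        boolean isOnSameRack(Node a, Node b) {
            return a.parent != null && a.parent == b.parent;
        }
    }

    public static void main(String[] args) {
        Topology topology = new Topology();
        Node datanode = new Node("dn1", "/rack1");
        topology.add(datanode);

        // The client knows its rack path but was never added to the tree.
        Node client = new Node("client", "/rack1");
        System.out.println("before add: " + topology.isOnSameRack(client, datanode));

        topology.add(client);
        System.out.println("after add:  " + topology.isOnSameRack(client, datanode));
        topology.remove(client);
    }
}
```

In this toy model the rack check only succeeds between the {{add}} and {{remove}} calls. Whether the real fix should add the client temporarily or compare rack paths directly is left open here.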

Another issue with {{networktopology.sortByDistance}} is that it only distinguishes the local
rack from remote racks; it does not support a general distance calculation that tells how
remote a rack is.

{noformat}
NetworkTopology.java
  protected int getWeight(Node reader, Node node) {
    // 0 is local, 1 is same rack, 2 is off rack
    // Start off by initializing to off rack
    int weight = 2;
    if (reader != null) {
      if (reader.equals(node)) {
        weight = 0;
      } else if (isOnSameRack(reader, node)) {
        weight = 1;
      }
    }
    return weight;
  }
{noformat}
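
As a rough sketch of what a more general weight could look like, the following self-contained example computes the distance as the number of hops from each node up to their closest common ancestor, summed. It is only an illustration: the class {{DistanceWeightSketch}}, its methods, and the {{/datacenter/rack/host}} path layout are assumptions made for this example and are not taken from {{NetworkTopology}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DistanceWeightSketch {

    /** Splits "/dc1/rack1/h1" into [dc1, rack1, h1], ignoring empty segments. */
    static List<String> components(String path) {
        List<String> parts = new ArrayList<>();
        for (String p : path.split("/")) {
            if (!p.isEmpty()) {
                parts.add(p);
            }
        }
        return parts;
    }

    /** Hops from each node up to the closest common ancestor, summed. */
    static int distance(String pathA, String pathB) {
        List<String> a = components(pathA);
        List<String> b = components(pathB);
        int common = 0;
        while (common < a.size() && common < b.size()
                && a.get(common).equals(b.get(common))) {
            common++;
        }
        return (a.size() - common) + (b.size() - common);
    }

    /** Returns a copy of nodes ordered by distance from the reader (stable sort). */
    static List<String> sortByDistance(String reader, List<String> nodes) {
        List<String> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingInt((String n) -> distance(reader, n)));
        return sorted;
    }

    public static void main(String[] args) {
        String reader = "/dc1/rack1/client";
        List<String> replicas = Arrays.asList(
                "/dc2/rack9/h4", "/dc1/rack2/h3", "/dc1/rack1/h2");
        // Nearest replica first: same rack, then same data center, then remote.
        System.out.println(sortByDistance(reader, replicas));
    }
}
```

With this weighting a node on the same rack is at distance 2, a node on another rack in the same data center at 4, and a node in another data center at 6, so remote racks are no longer all treated alike.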

HDFS-10203 has suggested moving the sorting from the namenode to DFSClient to address another
issue. Regardless of where the sorting is done, we still need to fix the issues outlined here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
