hadoop-hdfs-dev mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
Date Tue, 28 Oct 2014 16:29:34 GMT
Kihwal Lee created HDFS-7300:
--------------------------------

             Summary: The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
                 Key: HDFS-7300
                 URL: https://issues.apache.org/jira/browse/HDFS-7300
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Kihwal Lee
            Priority: Critical


{{getMaxNodesPerRack()}} can produce an undesirable result in some cases:
- Three replicas on two racks. The max is 3, so all three replicas can go to one rack.
- Two replicas on two or more racks. The max is 2, so both replicas can end up on the same rack.
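
To make the two cases above concrete, here is a minimal sketch of the arithmetic. The report does not quote the formula, so the expression below is an assumption chosen because it reproduces the reported values; the class and method names are hypothetical and are not the actual HDFS code.

```java
public class MaxNodesPerRackSketch {
    /**
     * Assumed per-rack limit: (replicas - 1) / racks + 2, using integer division.
     * This is an illustration, not a copy of BlockPlacementPolicyDefault.
     */
    public static int maxNodesPerRack(int totalReplicas, int numRacks) {
        return (totalReplicas - 1) / numRacks + 2;
    }

    public static void main(String[] args) {
        // Three replicas, two racks: the limit equals the replica count,
        // so nothing stops all three from landing on one rack.
        System.out.println(maxNodesPerRack(3, 2)); // 3

        // Two replicas, two or more racks: the limit is again the replica count.
        System.out.println(maxNodesPerRack(2, 2)); // 2
        System.out.println(maxNodesPerRack(2, 5)); // 2
    }
}
```

In each case the limit is no smaller than the number of replicas, so it places no real constraint on how many replicas share a rack.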

{{BlockManager#isNeededReplication()}} fixes this after the block/file is closed, because
{{blockHasEnoughRacks()}} will return false. This is not only extra work; it can also break the
favored nodes feature.

When there are two racks and two favored nodes are specified in the same rack, the NN may allocate
the third replica on a node in the same rack, because {{maxNodesPerRack}} is 3. When the file is
closed, the NN moves one replica to the other rack. Two of the three replicas are on favored nodes,
so if the replica to move is picked at random, there is a 66% (2 in 3) chance that it is taken
from a favored node. If {{maxNodesPerRack}} were 2, this would not happen.
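
The favored-nodes scenario can be sketched as follows. This is a hypothetical illustration of why a limit of 2 would help, not the patch for this issue; {{cappedMax}} and {{rackHasRoom}} are invented names, and the capping rule is an assumption.

```java
public class RackCapSketch {
    /** True if a rack that already holds onRack replicas may take one more. */
    public static boolean rackHasRoom(int onRack, int maxNodesPerRack) {
        return onRack < maxNodesPerRack;
    }

    /**
     * Hypothetical cap: with more than one rack and more than one replica,
     * never let the per-rack limit reach the total replica count.
     */
    public static int cappedMax(int uncappedMax, int totalReplicas, int numRacks) {
        if (numRacks > 1 && totalReplicas > 1 && uncappedMax >= totalReplicas) {
            return totalReplicas - 1;
        }
        return uncappedMax;
    }

    public static void main(String[] args) {
        int totalReplicas = 3;
        int numRacks = 2;
        int favoredOnRackA = 2; // both favored nodes are on the same rack

        // With a limit of 3, the third replica may join the favored nodes' rack.
        System.out.println(rackHasRoom(favoredOnRackA, 3)); // true

        // With the limit capped to 2, the third replica must go to the other
        // rack, so nothing needs to be moved when the file is closed.
        int capped = cappedMax(3, totalReplicas, numRacks);
        System.out.println(capped);                              // 2
        System.out.println(rackHasRoom(favoredOnRackA, capped)); // false
    }
}
```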



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)