hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-10968) BlockManager#isInNewRack should consider decommissioning nodes
Date Sat, 08 Oct 2016 04:09:20 GMT

     [ https://issues.apache.org/jira/browse/HDFS-10968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze updated HDFS-10968:
---------------------------------------
    Hadoop Flags: Reviewed
     Description: 
For an EC block, it is possible to have enough internal blocks but not enough racks. In that
case, the current reconstruction code calls {{BlockManager#isInNewRack}} to check whether the
target node would increase the total number of racks, by comparing the target node's rack
with the racks of the source nodes:
{code}
    for (DatanodeDescriptor src : srcs) {
      if (src.getNetworkLocation().equals(target.getNetworkLocation())) {
        return false;
      }
    }
{code}
However, {{srcs}} may include a decommissioning node, in which case we should allow the
target node to be in the same rack as that node.

For example, suppose we have 11 nodes, h1 ~ h11, located in racks r1, r1, r2, r2, r3, r3,
r4, r4, r5, r5, r6, respectively. Suppose an EC block has 9 live internal blocks on
(h1~h8 + h11) and one internal block on h9, which is being decommissioned. The current code
will not choose h10 for reconstruction because isInNewRack considers h10 to be on the same
rack as h9.
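
The 11-node example above can be reproduced with a small standalone sketch. Everything here is illustrative: {{Node}} is a hypothetical stand-in for {{DatanodeDescriptor}}, the trailing {{return true}} is assumed from the method's name (the snippet above only shows the loop), and the second method shows one possible way to ignore decommissioning sources, not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

class RackCheckSketch {
  /** Hypothetical stand-in for DatanodeDescriptor; not the real HDFS class. */
  static final class Node {
    final String name;
    final String rack;
    final boolean decommissioning;

    Node(String name, String rack, boolean decommissioning) {
      this.name = name;
      this.rack = rack;
      this.decommissioning = decommissioning;
    }
  }

  /** Mirrors the loop quoted in the description: any source on the same rack rejects the target. */
  static boolean isInNewRackCurrent(List<Node> srcs, Node target) {
    for (Node src : srcs) {
      if (src.rack.equals(target.rack)) {
        return false;
      }
    }
    return true; // assumed: no source shares the target's rack
  }

  /** Assumed variant: a decommissioning source does not block a target on its rack. */
  static boolean isInNewRackSkippingDecommissioning(List<Node> srcs, Node target) {
    for (Node src : srcs) {
      if (!src.decommissioning && src.rack.equals(target.rack)) {
        return false;
      }
    }
    return true;
  }

  /** Sources from the example: h1~h8 and h11 are live, h9 is being decommissioned. */
  static List<Node> exampleSources() {
    String[] racks = {"/r1", "/r1", "/r2", "/r2", "/r3", "/r3",
                      "/r4", "/r4", "/r5", "/r5", "/r6"};
    List<Node> srcs = new ArrayList<>();
    for (int i = 1; i <= 11; i++) {
      if (i == 10) {
        continue; // h10 holds no internal block; it is the candidate target
      }
      srcs.add(new Node("h" + i, racks[i - 1], i == 9));
    }
    return srcs;
  }

  public static void main(String[] args) {
    List<Node> srcs = exampleSources();
    Node h10 = new Node("h10", "/r5", false);
    System.out.println("current check:       " + isInNewRackCurrent(srcs, h10));
    System.out.println("skip decommissioning: " + isInNewRackSkippingDecommissioning(srcs, h10));
  }
}
```

With these inputs the first check rejects h10 because h9 shares rack r5, while the second accepts it because h9 is skipped and no live source is on r5.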

  was:
For an EC block, it is possible to have enough internal blocks but not enough racks. In that
case, the current reconstruction code calls {{BlockManager#isNewRack}} to check whether the
target node would increase the total number of racks, by comparing the target node's rack
with the racks of the source nodes:
{code}
    for (DatanodeDescriptor src : srcs) {
      if (src.getNetworkLocation().equals(target.getNetworkLocation())) {
        return false;
      }
    }
{code}
However, {{srcs}} may include a decommissioning node, in which case we should allow the
target node to be in the same rack as that node.

For example, suppose we have 11 nodes, h1 ~ h11, located in racks r1, r1, r2, r2, r3, r3,
r4, r4, r5, r5, r6, respectively. Suppose an EC block has 9 live internal blocks on
(h1~h8 + h11) and one internal block on h9, which is being decommissioned. The current code
will not choose h10 for reconstruction because isNewRack considers h10 to be on the same
rack as h9.

         Summary: BlockManager#isInNewRack should consider decommissioning nodes  (was: BlockManager#isNewRack
should consider decommissioning nodes)

Good catch!

+1 the patch looks good.

> BlockManager#isInNewRack should consider decommissioning nodes
> --------------------------------------------------------------
>
>                 Key: HDFS-10968
>                 URL: https://issues.apache.org/jira/browse/HDFS-10968
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding, namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-10968.000.patch
>
>
> For an EC block, it is possible to have enough internal blocks but not enough racks. In
> that case, the current reconstruction code calls {{BlockManager#isInNewRack}} to check
> whether the target node would increase the total number of racks, by comparing the target
> node's rack with the racks of the source nodes:
> {code}
>     for (DatanodeDescriptor src : srcs) {
>       if (src.getNetworkLocation().equals(target.getNetworkLocation())) {
>         return false;
>       }
>     }
> {code}
> However, {{srcs}} may include a decommissioning node, in which case we should allow the
> target node to be in the same rack as that node.
> For example, suppose we have 11 nodes, h1 ~ h11, located in racks r1, r1, r2, r2, r3, r3,
> r4, r4, r5, r5, r6, respectively. Suppose an EC block has 9 live internal blocks on
> (h1~h8 + h11) and one internal block on h9, which is being decommissioned. The current
> code will not choose h10 for reconstruction because isInNewRack considers h10 to be on
> the same rack as h9.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

