hadoop-hdfs-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack
Date Tue, 12 Aug 2014 01:28:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093613#comment-14093613
] 

Jason Lowe commented on HDFS-6840:
----------------------------------

I think the previous behavior was not fully deterministic because of this code, which the
HDFS-6268 patch removed:

{code}
    // put a random node at position 0 if it is not a local/local-rack node
    if (tempIndex == 0 && localRackNode == -1 && nodes.length != 0) {
      swap(nodes, 0, r.nextInt(nodes.length));
    }
{code}

The list used to be mostly deterministic, but the first node in the list (i.e., the only
one most clients will actually use) was random.
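The removed snippet can be sketched as a standalone method. This is a minimal illustration of the random-first-node technique, using hypothetical names rather than Hadoop's actual NetworkTopology API:

```java
import java.util.Random;

public class RandomFirstNode {
    // Swap a randomly chosen element into position 0 so that different
    // clients do not all read from the same first node. localRackIndex == -1
    // mirrors the localRackNode == -1 check in the removed snippet: only
    // randomize when no local or rack-local replica was found.
    static void randomizeFirst(String[] nodes, int localRackIndex, Random r) {
        if (localRackIndex == -1 && nodes.length != 0) {
            int pick = r.nextInt(nodes.length);
            String tmp = nodes[0];
            nodes[0] = nodes[pick];
            nodes[pick] = tmp;
        }
    }
}
```

The rest of the list stays in its sorted order; only position 0 is perturbed, which is enough to spread off-rack reads across replicas.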

I have not done a bisect to prove beyond doubt that it was HDFS-6268, but we've run builds
based on 2.4.1+ and on 2.5, and this behavior is brand-new with 2.5.  There weren't many
changes in the topology-sorting area between 2.4.1 and 2.5.0 besides this one, and the code
and JIRA for HDFS-6268 state that it intentionally does not randomize the datanode list
between clients.  Besides the bisect approach, I can probably try replacing the network
topology class with the one from before HDFS-6268 and see whether the behavior reverts to
what it used to be.

> Clients are always sent to the same datanode when read is off rack
> ------------------------------------------------------------------
>
>                 Key: HDFS-6840
>                 URL: https://issues.apache.org/jira/browse/HDFS-6840
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a given block
and locality level (e.g.: local, rack-local, off-rack), so off-rack clients all see the same datanode
for the same block.  This leads to very poor behavior in distributed cache localization and
other scenarios where many clients all want the same block data at approximately the same
time.  The one datanode is crushed by the load while the other replicas only handle local
and rack-local requests.
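The load skew described above is easy to see in a toy simulation (hypothetical names, not Hadoop code): with a deterministic order every off-rack client reads from the first replica in the list, while a random first node spreads the same reads across all replicas.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class LoadSkewDemo {
    // Count how many reads each replica serves when `clients` off-rack
    // clients each read the first node of the (identical) location list.
    static Map<String, Integer> simulate(int clients, boolean randomizeFirst, long seed) {
        String[] replicas = {"dn1", "dn2", "dn3"};  // same sorted list for every client
        Random r = new Random(seed);
        Map<String, Integer> reads = new HashMap<>();
        for (int i = 0; i < clients; i++) {
            String chosen = randomizeFirst
                    ? replicas[r.nextInt(replicas.length)]  // pre-HDFS-6268 style
                    : replicas[0];                          // deterministic order
            reads.merge(chosen, 1, Integer::sum);
        }
        return reads;
    }

    public static void main(String[] args) {
        // Deterministic order: dn1 serves every read, dn2/dn3 serve none.
        System.out.println(simulate(900, false, 0));
        // Randomized first node: reads spread roughly evenly across replicas.
        System.out.println(simulate(900, true, 0));
    }
}
```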



--
This message was sent by Atlassian JIRA
(v6.2#6252)
