hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6840) Clients are always sent to the same datanode when read is off rack
Date Wed, 13 Aug 2014 17:32:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095763#comment-14095763 ]

Daryn Sharp commented on HDFS-6840:
-----------------------------------

We believe, but haven't proven, that this deterministic behavior is causing even more problems.
Block replication and invalidation appear to be impacted: a change to the replication
factor sometimes takes up to an hour to start, and there is a slow but steady increase in blocks
pending deletion on clusters running 2.5.  We believe the NN is repeatedly picking the same
faulty DN as the target of its copy-block and invalidate-block commands.

> Clients are always sent to the same datanode when read is off rack
> ------------------------------------------------------------------
>
>                 Key: HDFS-6840
>                 URL: https://issues.apache.org/jira/browse/HDFS-6840
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Priority: Critical
>
> After HDFS-6268 the sorting order of block locations is deterministic for a given block
> and locality level (e.g., local, rack, off-rack), so off-rack clients all see the same datanode
> for the same block.  This leads to very poor behavior in distributed cache localization and
> other scenarios where many clients all want the same block data at approximately the same
> time.  That one datanode is crushed by the load while the other replicas handle only local
> and rack-local requests.
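
The hot-spot effect described in the issue summary can be illustrated with a small toy model. The sketch below is not Hadoop's NetworkTopology code and is not the fix proposed for this issue; every name in it (locality, sort_deterministic, sort_shuffled, first_choice_counts, the (host, rack) tuples) is hypothetical. It only assumes that replicas are ordered by locality level and that readers take the first entry. Under that assumption, a stable sort by locality alone leaves ties in their input order, so every off-rack reader picks the same replica, whereas shuffling before the sort randomizes the order within each level.

```python
import random
from collections import Counter

# Hypothetical locality levels for this toy model; lower means closer.
LOCAL, RACK_LOCAL, OFF_RACK = 0, 1, 2


def locality(reader, replica):
    """Locality of a replica relative to a reader; nodes are (host, rack) tuples."""
    if reader[0] == replica[0]:
        return LOCAL
    if reader[1] == replica[1]:
        return RACK_LOCAL
    return OFF_RACK


def sort_deterministic(reader, replicas):
    # Stable sort on locality only: replicas at the same level keep their
    # input order, so all off-rack readers see the same first replica.
    return sorted(replicas, key=lambda r: locality(reader, r))


def sort_shuffled(reader, replicas, rng):
    # Shuffle a copy first, then stable-sort by locality: closer replicas
    # still come first, but the order within each level is random.
    shuffled = list(replicas)
    rng.shuffle(shuffled)
    return sorted(shuffled, key=lambda r: locality(reader, r))


def first_choice_counts(sorter, readers, replicas):
    """Count how often each host is the first choice across all readers."""
    return Counter(sorter(reader, replicas)[0][0] for reader in readers)


if __name__ == "__main__":
    replicas = [("dn1", "rackA"), ("dn2", "rackB"), ("dn3", "rackB")]
    readers = [("client%d" % i, "rackZ") for i in range(1000)]  # all off rack
    rng = random.Random()

    print("deterministic:", first_choice_counts(sort_deterministic, readers, replicas))
    print("shuffled:     ", first_choice_counts(
        lambda rd, rp: sort_shuffled(rd, rp, rng), readers, replicas))
```

In the deterministic case all 1000 off-rack readers choose dn1; in the shuffled case the first choices are spread across the three replicas, while local and rack-local readers still get their closest replica first.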



--
This message was sent by Atlassian JIRA
(v6.2#6252)
