hadoop-hdfs-issues mailing list archives

From "Liang Xie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-289) HDFS should blacklist datanodes that are not performing well
Date Wed, 30 Jul 2014 08:11:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079048#comment-14079048 ]

Liang Xie commented on HDFS-289:
--------------------------------

We hit an HBase write performance degradation several days ago; the root cause turned out
to be a slow network to/from one particular datanode, caused by a switch buffer problem. I am
now interested in implementing a simple heuristic-based DN exclusion feature inside
DFSOutputStream. I will post more here later :)
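
To make the idea concrete, here is a rough sketch of what a client-side slow-DN heuristic
could look like. It is not the patch for this issue and not existing Hadoop code: the class
SlowDatanodeTracker, its methods, and all thresholds are hypothetical names chosen for
illustration. It assumes the writer can observe a per-datanode ack latency, keeps an
exponentially weighted moving average per node, and flags a node once its average is several
times the median across the nodes seen so far. How the resulting set would be handed to
DFSOutputStream is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Hadoop: tracks per-datanode ack latency. */
class SlowDatanodeTracker {
    private final double alpha;      // EWMA smoothing factor in (0, 1]
    private final double slowFactor; // multiple of the median that counts as "slow"
    private final int minSamples;    // samples required before a node can be excluded
    private final Map<String, Double> ewmaMillis = new HashMap<>();
    private final Map<String, Integer> samples = new HashMap<>();

    SlowDatanodeTracker(double alpha, double slowFactor, int minSamples) {
        this.alpha = alpha;
        this.slowFactor = slowFactor;
        this.minSamples = minSamples;
    }

    /** Record one observed ack latency (assumed to be measurable by the writer). */
    void recordAck(String datanodeId, double latencyMillis) {
        // The first sample initialises the average; later samples are blended in.
        ewmaMillis.merge(datanodeId, latencyMillis,
                (old, sample) -> alpha * sample + (1 - alpha) * old);
        samples.merge(datanodeId, 1, Integer::sum);
    }

    /** Nodes whose average latency exceeds slowFactor times the lower median. */
    Set<String> excludedNodes() {
        Set<String> excluded = new HashSet<>();
        if (ewmaMillis.size() < 2) {
            return excluded; // nothing to compare against
        }
        List<Double> sorted = new ArrayList<>(ewmaMillis.values());
        Collections.sort(sorted);
        double median = sorted.get((sorted.size() - 1) / 2); // lower median
        for (Map.Entry<String, Double> e : ewmaMillis.entrySet()) {
            boolean enoughData = samples.get(e.getKey()) >= minSamples;
            if (enoughData && e.getValue() > slowFactor * median) {
                excluded.add(e.getKey());
            }
        }
        return excluded;
    }
}
```

Comparing against the median rather than a fixed threshold means a cluster that is uniformly
slow excludes nobody, while a single outlier (like the DN behind the bad switch) stands out.
The minSamples guard keeps one unlucky ack from excluding a healthy node.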

> HDFS should blacklist datanodes that are not performing well
> ------------------------------------------------------------
>
>                 Key: HDFS-289
>                 URL: https://issues.apache.org/jira/browse/HDFS-289
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>
> On a large cluster, a few datanodes could be under-performing. There were cases when
the network connectivity of a few of these bad datanodes was degraded, resulting in very
long times (on the order of two hours) to transfer blocks to and from these datanodes.
> A similar issue arises when a single disk on a datanode fails or changes to read-only
mode: in this case the entire datanode shuts down.
> HDFS should detect and handle network and disk performance degradation more gracefully.
One option would be to blacklist these datanodes, de-prioritise their use and alert the administrator.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
