hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2956) HDFS should blacklist datanodes that are not performing well
Date Thu, 06 Mar 2008 22:04:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575914#action_12575914
] 

Runping Qi commented on HADOOP-2956:
------------------------------------

+1

The hard part is detection: how do you identify a slow node?
DFS may be able to detect slow disks by collecting disk read/write performance data,
or by using tools like iostat.
It may use similar tools to detect slow network links.
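As a rough illustration of the detection side, here is a minimal Java sketch of
per-volume write-latency tracking on a datanode. The class name, the EWMA
smoothing, and the 3x threshold are assumptions for illustration only, not
existing HDFS code:

    // Hypothetical sketch: per-volume block-write latency tracking on a datanode.
    // VolumeLatencyMonitor, ALPHA, and SLOW_FACTOR are illustrative, not HDFS APIs.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class VolumeLatencyMonitor {
        private static final double ALPHA = 0.1;        // EWMA smoothing factor
        private static final double SLOW_FACTOR = 3.0;  // "slow" = 3x the mean of all volumes

        // Smoothed block-write latency (ms) per volume (disk mount point)
        private final Map<String, Double> ewmaMs = new ConcurrentHashMap<>();

        /** Record the observed latency of one block write on the given volume. */
        public void recordWrite(String volume, long latencyMs) {
            ewmaMs.merge(volume, (double) latencyMs,
                (old, cur) -> old * (1 - ALPHA) + cur * ALPHA);
        }

        /** A volume is suspect if its smoothed latency is far above the mean of all volumes. */
        public boolean isSlow(String volume) {
            Double mine = ewmaMs.get(volume);
            if (mine == null || ewmaMs.size() < 2) {
                return false;  // not enough data to compare against peers
            }
            double mean = ewmaMs.values().stream()
                .mapToDouble(Double::doubleValue).average().orElse(0.0);
            return mine > mean * SLOW_FACTOR;
        }
    }

Comparing each volume against its peers on the same node, rather than against a
fixed threshold, avoids false positives when the whole node is uniformly loaded.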


> HDFS should blacklist datanodes that are not performing well
> ------------------------------------------------------------
>
>                 Key: HADOOP-2956
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2956
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>
> On a large cluster, a few datanodes could be under-performing. There were cases when
> the network connectivity of a few of these bad datanodes was degraded, resulting in very
> long times (on the order of two hours) to transfer blocks to and from these datanodes.
> A similar issue arises when a single disk on a datanode fails or changes to read-only
> mode: in this case the entire datanode shuts down.
> HDFS should detect and handle network and disk performance degradation more gracefully.
> One option would be to blacklist these datanodes, de-prioritise their use and alert the administrator.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

