hadoop-hdfs-issues mailing list archives

From "Bharath Mundlapudi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1848) Datanodes should shutdown when a critical volume fails
Date Thu, 28 Apr 2011 12:15:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026271#comment-13026271 ]

Bharath Mundlapudi commented on HDFS-1848:

I think Koji's point is: should we have something like a health checker in the Datanode, similar
to the one in MapReduce? If so, the Datanode would periodically launch this health check to determine
its health with respect to disks, NICs, etc. This was the comment I made earlier. It would help admins.
It is not sufficient just to have diagnostic software on every machine; we also need a mechanism to
communicate this information back to the Datanode, right? This is required to fail fast and then
fail-stop safely. With this, the Datanode can keep looking after the disks it cares about, as it does
today, and the external entity will report other diagnostic information back to the Datanode. Agree?
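
To make the proposal concrete, here is a minimal sketch of such a periodic health check. It is
an illustration only, not Datanode code: the function names are hypothetical, and the convention
that the external script reports a problem by printing a line starting with "ERROR" is an
assumption modeled on the MapReduce node health script. Treating a timeout or a script that
cannot be run as unhealthy is also a choice made for this sketch.

```python
import subprocess
import time


def parse_health_output(stdout):
    """Return (healthy, report) from the health script's standard output.

    Assumed convention: any line starting with "ERROR" means unhealthy.
    """
    for line in stdout.splitlines():
        if line.startswith("ERROR"):
            return False, line
    return True, ""


def run_health_script(command, timeout_s=30.0):
    """Run the external diagnostic script once and interpret its output."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # Sketch-level choice: a hung script counts as an unhealthy node.
        return False, "health script timed out"
    except OSError as exc:
        return False, "health script could not be run: %s" % exc
    return parse_health_output(result.stdout)


def monitor(check, on_unhealthy, interval_s=60.0, max_checks=None):
    """Call check() periodically; report the first unhealthy result and stop.

    check returns (healthy, report). on_unhealthy(report) is where a daemon
    would fail fast and then shut down safely. Returns the number of checks run.
    """
    done = 0
    while max_checks is None or done < max_checks:
        healthy, report = check()
        done += 1
        if not healthy:
            on_unhealthy(report)
            break
        time.sleep(interval_s)
    return done
```

A caller would pass something like `lambda: run_health_script(["/path/to/check.sh"])` as
`check` (the path is a placeholder), so the external diagnostics stay outside the daemon while
their verdict still reaches it.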


> Datanodes should shutdown when a critical volume fails
> ------------------------------------------------------
>                 Key: HDFS-1848
>                 URL: https://issues.apache.org/jira/browse/HDFS-1848
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Eli Collins
>             Fix For: 0.23.0
> A DN should shut down when a critical volume (e.g. the volume that hosts the OS, logs, pid,
> tmp dir etc.) fails. The admin should be able to specify which volumes are critical, e.g. they
> might specify the volume that lives on the boot disk. A failure in one of these volumes would
> not be subject to the threshold (HDFS-1161) or result in host decommissioning (HDFS-1847),
> as the decommissioning process would likely fail.
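
The policy in the description can be sketched as a small decision function. This is an
illustration under stated assumptions, not the HDFS implementation: `classify_failure`,
`critical_volumes`, and `failures_tolerated` are hypothetical names, with `failures_tolerated`
standing in for the threshold from HDFS-1161. What happens once the threshold is exceeded is
left to the caller.

```python
def classify_failure(failed_volumes, critical_volumes, failures_tolerated):
    """Decide how a node should react to its current set of failed volumes.

    Returns "shutdown" if any critical volume has failed, regardless of the
    threshold; "threshold_exceeded" if more non-critical volumes have failed
    than are tolerated; and "continue" otherwise.
    """
    failed = set(failed_volumes)
    if failed & set(critical_volumes):
        # A critical volume bypasses the threshold check entirely.
        return "shutdown"
    if len(failed) > failures_tolerated:
        return "threshold_exceeded"
    return "continue"
```

For example, with `critical_volumes=["/"]` and `failures_tolerated=2`, a failure of `/`
alone yields "shutdown", while a single failed data volume yields "continue".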

