hadoop-hdfs-issues mailing list archives

From "Matthias Friedrich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1125) Removing a datanode (failed or decommissioned) should not require a namenode restart
Date Mon, 07 Feb 2011 07:08:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991287#comment-12991287 ]

Matthias Friedrich commented on HDFS-1125:

We also got complaints from our admins about this because it makes it really hard to set up
professional monitoring. My company operates close to 100,000 machines (only a handful of
Hadoop nodes, though), so it's a big concern that our infrastructure behaves well.

Also, node decommissioning is one of the things QA departments typically test during product
evaluation, so this could hamper Hadoop adoption in some organizations.

> Removing a datanode (failed or decommissioned) should not require a namenode restart
> ------------------------------------------------------------------------------------
>                 Key: HDFS-1125
>                 URL: https://issues.apache.org/jira/browse/HDFS-1125
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.20.2
>            Reporter: Alex Loddengaard
>            Priority: Critical
> I've heard of several Hadoop users using dfsadmin -report to monitor the number of dead
> nodes, and alert if that number is not 0.  This mechanism tends to work pretty well, except
> when a node is decommissioned or fails, because then the namenode requires a restart for said
> node to be entirely removed from HDFS.  More details here:
> http://markmail.org/search/?q=decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode#query:decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode+page:1+mid:7gwqwdkobgfuszb4+state:results
> Removal from the exclude file and a refresh should get rid of the dead node.
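
For illustration, the kind of monitoring check described in the issue could look roughly like the
sketch below. It is not taken from the issue or from Hadoop itself. It assumes that the report
contains a summary line of the form "Datanodes available: 3 (4 total, 1 dead)", which should be
verified against the output of the Hadoop version in use. The names SAMPLE_REPORT,
count_dead_nodes, should_alert and fetch_report are hypothetical.

```python
import re
import subprocess

# Assumed summary line format; verify against your Hadoop version's real output.
SUMMARY_RE = re.compile(
    r"Datanodes available:\s*(\d+)\s*\((\d+) total,\s*(\d+) dead\)"
)

# Hypothetical sample text used only to exercise the parser.
SAMPLE_REPORT = """\
-------------------------------------------------
Datanodes available: 3 (4 total, 1 dead)
"""


def count_dead_nodes(report_text):
    """Return the dead-node count from the report, or None if no summary line is found."""
    match = SUMMARY_RE.search(report_text)
    if match is None:
        return None
    return int(match.group(3))


def should_alert(report_text):
    """Alert when any dead node is reported or when the report cannot be parsed."""
    dead = count_dead_nodes(report_text)
    return dead is None or dead > 0


def fetch_report():
    """Run the report command and return its standard output (needs a Hadoop client)."""
    result = subprocess.run(
        ["hadoop", "dfsadmin", "-report"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With a check like this, a node that was decommissioned on purpose keeps counting as dead until
the namenode is restarted, so the alert never clears. That is the monitoring problem this issue
describes.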


