hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1125) Removing a datanode (failed or decommissioned) should not require a namenode restart
Date Thu, 23 Jun 2011 23:34:48 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054175#comment-13054175 ]

Allen Wittenauer commented on HDFS-1125:

The problem still seems to be present in 0.20.203, so I'm guessing no, the problem hasn't
been fixed by HDFS-1773.  

How I tested:

a) create a grid with 203, filling in dfs.hosts
b) populate it with data
c) put host in dfs.exclude
d) -refreshNodes, verify host is in decom'ing nodes
e) let decom process finish
f) host now shows up in dead
g) remove host from dfs.hosts and dfs.exclude
h) -refreshNodes
i) node is still listed as dead by nn
j) kill DataNode process
k) node is still listed as dead by nn
l) 10 mins later, still listed...
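
For anyone who wants to script the check in steps h) through l), here is a minimal Python sketch. It assumes that `hadoop dfsadmin -report` prints a summary line of the form "Datanodes available: 3 (4 total, 1 dead)"; that format is recalled from 0.20-era output and should be verified against your version. The helper names (dead_node_count, run_dfsadmin, still_dead_after_refresh) are hypothetical and not part of Hadoop.

```python
import re
import subprocess

# Assumed summary line format, e.g. "Datanodes available: 3 (4 total, 1 dead)".
# Adjust the pattern if your Hadoop version prints something different.
SUMMARY_RE = re.compile(
    r"Datanodes available:\s*(\d+)\s*\((\d+) total,\s*(\d+) dead\)"
)


def dead_node_count(report_text):
    """Return the dead-node count from report text, or None if no summary line is found."""
    match = SUMMARY_RE.search(report_text)
    return int(match.group(3)) if match else None


def run_dfsadmin(*args):
    """Run `hadoop dfsadmin <args>` and return its standard output as text."""
    result = subprocess.run(
        ["hadoop", "dfsadmin", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def still_dead_after_refresh(run=run_dfsadmin):
    """Issue -refreshNodes, then report whether the namenode still lists dead nodes.

    `run` is injectable so the logic can be exercised without a cluster.
    """
    run("-refreshNodes")
    count = dead_node_count(run("-report"))
    if count is None:
        raise ValueError("no 'Datanodes available' summary line in report output")
    return count > 0
```

Calling still_dead_after_refresh() on a machine with the hadoop client configured would correspond to steps h) and i); repeating the call after a delay would correspond to step l).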

> Removing a datanode (failed or decommissioned) should not require a namenode restart
> ------------------------------------------------------------------------------------
>                 Key: HDFS-1125
>                 URL: https://issues.apache.org/jira/browse/HDFS-1125
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.20.2
>            Reporter: Alex Loddengaard
>            Priority: Blocker
> I've heard of several Hadoop users using dfsadmin -report to monitor the number of dead
> nodes, and alert if that number is not 0.  This mechanism tends to work pretty well, except
> when a node is decommissioned or fails, because then the namenode requires a restart for said
> node to be entirely removed from HDFS.  More details here:
> http://markmail.org/search/?q=decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode#query:decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode+page:1+mid:7gwqwdkobgfuszb4+state:results
> Removal from the exclude file and a refresh should get rid of the dead node.
