hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3065) HA: Newly active NameNode does not recognize decommissioning DataNode
Date Thu, 08 Mar 2012 19:49:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225438#comment-13225438 ]

Todd Lipcon commented on HDFS-3065:
-----------------------------------

I think the solution here is that "refreshNodes" should be special-cased to send the RPC to
all NNs in the cluster, instead of just the active one. Alternatively, we can improve the
dfsadmin docs to indicate that you have to explicitly refresh on all NNs.
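
As a rough illustration of the second option (explicitly refreshing on every NN), the sketch below loops over a list of NameNode addresses and points _hdfs dfsadmin -refreshNodes_ at each one using the generic -fs option. The function name refresh_all_namenodes and the HDFS_BIN override are hypothetical names used only for this example, and the RPC port 8020 in the usage line is an assumption about the cluster, not something stated in this issue.

```shell
#!/usr/bin/env bash
# Sketch only: run refreshNodes against every NameNode, not just the active one.
# HDFS_BIN is a hypothetical override for illustration; it defaults to the hdfs CLI.
refresh_all_namenodes() {
  local hdfs_bin="${HDFS_BIN:-hdfs}"
  local failed=0
  local nn
  for nn in "$@"; do
    # -fs is the generic option that directs the command at one specific NameNode.
    if ! "$hdfs_bin" dfsadmin -fs "hdfs://${nn}" -refreshNodes; then
      echo "refreshNodes failed on ${nn}" >&2
      failed=1
    fi
  done
  # Non-zero if any NameNode could not be refreshed.
  return "$failed"
}
```

On the cluster described in this issue, the call might look like `refresh_all_namenodes styx01:8020 styx02:8020` (ports assumed), run after the exclude files on both hosts have been updated.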
                
> HA: Newly active NameNode does not recognize decommissioning DataNode
> ---------------------------------------------------------------------
>
>                 Key: HDFS-3065
>                 URL: https://issues.apache.org/jira/browse/HDFS-3065
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Stephen Chu
>
> I'm working on a cluster where, originally, styx01 hosts the active NameNode and styx02
> hosts the standby NameNode.
> In both styx01's and styx02's exclude files, I added the DataNode on styx03. I then ran
> _hdfs dfsadmin -refreshNodes_ and verified on the styx01 NN web UI that the DN on styx03 was
> decommissioning. After waiting a few minutes, I checked the standby NN web UI (while the DN
> was decommissioning) and didn't see the DN marked as decommissioning.
> I executed a manual failover, making the styx02 NN active and the styx01 NN standby. I checked
> the newly active NN web UI, and the DN was still not marked as decommissioning, even after
> a few minutes. However, the newly standby NN's web UI still showed the DN as decommissioning.
> I added another DN to the exclude file and executed _hdfs dfsadmin -refreshNodes_, but
> the styx02 NN web UI still did not show the decommissioning nodes.
> I failed back over to make the styx01 NN active and the styx02 NN standby. I checked the styx01
> NN web UI and saw that it correctly marked 2 DNs as decommissioning.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
