hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3744) Decommissioned nodes are included in cluster after switch which is not expected
Date Thu, 02 Aug 2012 16:38:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427431#comment-13427431 ]

Aaron T. Myers commented on HDFS-3744:
--------------------------------------

bq. Persist the NODE_DECOMMISSIONED by active so SNN will get node DECOMMISSIONED.

I'm hesitant to go with this suggestion. How would differences be rectified between what's
persisted in the edit log and what's present in the excluded hosts file?

bq. I would like to with your first opinion with SAFEMODE check in replication(or move replication
to Active service).

I'm afraid I don't follow this. What do safemode or replication have to do with this issue?

I think my preference here is to require the admin to keep the excluded hosts files in sync
across the machines (most deployments already have some way of keeping config files in sync),
and then just make dfsadmin iterate over all configured NNs when the refreshNodes command
is run. There may be other commands that should have similar support. For example, if an administrator
runs `hdfs dfsadmin -safemode enter' and provides a logical HA URI, should only the active
NN be put in SM? Or should both?
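
To make the "iterate over all configured NNs" idea concrete, here is a rough sketch rather than a patch against DFSAdmin. It assumes the usual HA keys (dfs.nameservices, dfs.ha.namenodes.<nameservice>, dfs.namenode.rpc-address.<nameservice>.<nn-id>) have already been loaded into a plain Map, and it only builds the per-NN command lines instead of opening RPC proxies. The class name, method names, and example hostnames are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RefreshAllNameNodes {

    // Builds one "dfsadmin -refreshNodes" command line per configured NameNode,
    // so that the Active and the Standby both re-read their hosts/exclude files.
    static List<String> refreshCommands(Map<String, String> conf) {
        List<String> commands = new ArrayList<>();
        for (String ns : split(conf.get("dfs.nameservices"))) {
            for (String nn : split(conf.get("dfs.ha.namenodes." + ns))) {
                String addr = conf.get("dfs.namenode.rpc-address." + ns + "." + nn);
                if (addr == null || addr.trim().isEmpty()) {
                    continue; // no RPC address configured for this NameNode ID
                }
                commands.add("hdfs dfsadmin -fs hdfs://" + addr.trim() + " -refreshNodes");
            }
        }
        return commands;
    }

    // Splits a comma-separated config value, dropping blanks; null yields an empty list.
    static List<String> split(String value) {
        List<String> out = new ArrayList<>();
        if (value == null) {
            return out;
        }
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical two-NameNode HA configuration, for illustration only.
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("dfs.nameservices", "mycluster");
        conf.put("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.put("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020");
        conf.put("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020");
        for (String command : refreshCommands(conf)) {
            System.out.println(command);
        }
    }
}
```

This still relies on the admin keeping the excluded hosts file identical on every NN machine; the loop only makes sure each NN is told to re-read it.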
                
> Decommissioned nodes are included in cluster after switch which is not expected
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-3744
>                 URL: https://issues.apache.org/jira/browse/HDFS-3744
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.0.0-alpha, 2.1.0-alpha, 2.0.1-alpha
>            Reporter: Brahma Reddy Battula
>
> Scenario:
> =========
> Start the ANN and SNN with three DNs.
> Exclude DN1 from the cluster using the decommission feature
> (./hdfs dfsadmin -fs hdfs://ANNIP:8020 -refreshNodes).
> After the decommission succeeds, do a switch so that the SNN becomes Active.
> Now the excluded node (DN1) is included in the cluster again. Files can be written to it,
> since it is no longer excluded.
> Checked the SNN (which was Active before the switch) UI: decommissioned=1; the ANN UI
> shows decommissioned=0.
> One more Observation:
> ====================
> All dfsadmin commands create a proxy only to nn1, irrespective of whether it is Active or
> Standby. I think we need to look at this again as well.
> I don't understand why HA support is not provided for dfsadmin commands.
> Please correct me if I am wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
