hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6203) check other namenode's state before HAadmin transitionToActive
Date Wed, 09 Apr 2014 18:10:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964490#comment-13964490 ]

Kihwal Lee commented on HDFS-6203:
----------------------------------

bq. For checking other NN's state, if we add the check into the transitionToActive method,
we cannot still guarantee that the other NN will not transition to active after the checking.
Thus I think the checking here will not be very useful.

We can make {{transitionToActive}} do the check by default when automatic fail-over is not
used. If the other NN does not respond or is in the active state, the command will fail with
a warning. At that point the user can reissue it with a force option if s/he wants to. I think
this is a good preventive measure against an easy-to-make but fatal mistake.
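
To make the proposed behavior concrete, here is a minimal sketch of the decision logic. It is not the actual HAAdmin code: the class name, the enum, and the method signature are all hypothetical, and skipping the check when automatic fail-over is enabled is my assumption about the scope of the proposal.

```java
import java.util.Optional;

// Hypothetical sketch of the proposed pre-check; none of these names
// come from the Hadoop code base.
final class TransitionGuard {

    // State of the other NameNode as observed by the admin command.
    enum PeerState { ACTIVE, STANDBY, UNREACHABLE }

    // Returns a warning if the transition should be refused,
    // or an empty Optional if it may proceed.
    static Optional<String> check(PeerState peer, boolean autoFailover, boolean force) {
        if (force) {
            // The user explicitly asked to skip the check (previous behavior).
            return Optional.empty();
        }
        if (autoFailover) {
            // Assumption: the check only applies to manual fail-over setups.
            return Optional.empty();
        }
        switch (peer) {
            case ACTIVE:
                return Optional.of("other NN is already active; reissue with force to override");
            case UNREACHABLE:
                return Optional.of("other NN did not respond; reissue with force to override");
            default:
                return Optional.empty();
        }
    }
}
```

With this shape, {{check(PeerState.ACTIVE, false, false)}} yields a warning and the command fails, while the same call with force set to true lets the transition go ahead. The check still cannot stop the other NN from becoming active afterwards, so it is a guard against operator error, not a replacement for fencing.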

> check other namenode's state before HAadmin transitionToActive
> --------------------------------------------------------------
>
>                 Key: HDFS-6203
>                 URL: https://issues.apache.org/jira/browse/HDFS-6203
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 2.3.0
>            Reporter: patrick white
>
> Current behavior is that the HAadmin -transitionToActive command will complete the transition
> to Active even if the other namenode is already in Active state. This is not an allowed condition
> and should be handled by fencing; however, setting both namenodes active can happen accidentally
> with relative ease, especially in a production environment when performing manual maintenance
> operations.
> If this situation does occur it is very serious and will likely cause data loss or, in the best
> case, require a difficult recovery to avoid data loss.
> This is requesting an enhancement to haadmin's -transitionToActive command, to have HAadmin
> check the Active state of the other namenode before completing the transition. If the other
> namenode is Active, then fail the request because the other NN is already active.
> Not sure if there is a scenario where both namenodes being Active is valid or desired,
> but to maintain functional compatibility a 'force' parameter could be added to override this
> check and allow previous behavior.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
