hadoop-hdfs-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2301) Start/stop appropriate namenode internal services during transition to active and standby
Date Thu, 06 Oct 2011 07:54:31 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121781#comment-13121781 ]

jiraposter@reviews.apache.org commented on HDFS-2301:
-----------------------------------------------------



bq.  On 2011-10-03 18:58:13, Todd Lipcon wrote:
bq.  > just a few nits, mostly looks good. A few questions I have that aren't directly
related to this patch:
bq.  > - is SafeMode now a replicated thing, or does each NN separately enter safemode?
I think the latter, right?
bq.  > - when transitioning between states, what happens if the "enterState" fails for
the new state? The state variable will then indicate it's in that state, when in fact it's
in no state at all. How do we recover from that? We need some kind of rollback? (eg if you're
in standby and try to transition to active, but find that you can't take a lock in ZK)

bq.  is SafeMode now a replicated thing, or does each NN separately enter safemode? I think
the latter, right?
Safemode is a state of the namespace (FSNamesystem), unlike active and standby, which are states
of the namenode. Each NN enters safemode separately.

bq.  when transitioning between states, what happens if the "enterState" fails for the new
state? The state variable will then indicate it's in that state, when in fact it's in no state
at all. How do we recover from that? We need some kind of rollback? (eg if you're in standby
and try to transition to active, but find that you can't take a lock in ZK)
This is tricky. Say enterState fails to start services because of some problem in the namenode
process. Then rolling back to the previous state and starting the services relevant to that
state will most likely fail too. As for the particular example you bring up, I think the
failover controller is the one that deals with ZK, not the namenode.

I can think of two solutions: the namenode shuts down when this happens (as is done during
startup), or it moves to a failed state.
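
To make the second option concrete, here is a minimal sketch of a transition that only records
the new state after enterState succeeds and otherwise moves to a failed state. Everything in it
(TransitionSketch, State, Node, FAILED, simple) is a hypothetical name made up for illustration
and is not the HAState/HAContext code in the patch.

```java
public class TransitionSketch {
    /** Hypothetical stand-in for an HA state; not the HAState class from the patch. */
    interface State {
        String name();
        void enterState() throws Exception; // start the services this state needs
        void exitState() throws Exception;  // stop the services this state started
    }

    /** Assumed terminal state that starts no services. */
    static final State FAILED = new State() {
        public String name() { return "failed"; }
        public void enterState() {}
        public void exitState() {}
    };

    /** Hypothetical holder of the current state. */
    static class Node {
        private State current;

        Node(State initial) { this.current = initial; }

        State current() { return current; }

        /** Returns true on success; on any failure the node ends up in FAILED. */
        boolean transitionTo(State target) {
            try {
                current.exitState();
                target.enterState();
                current = target; // updated only after enterState succeeded
                return true;
            } catch (Exception e) {
                // No rollback is attempted: restarting the old services would
                // most likely fail for the same reason.
                current = FAILED;
                return false;
            }
        }
    }

    /** Builds a trivial state that optionally fails when entered. */
    static State simple(final String name, final boolean failOnEnter) {
        return new State() {
            public String name() { return name; }
            public void enterState() throws Exception {
                if (failOnEnter) {
                    throw new Exception("could not start services for " + name);
                }
            }
            public void exitState() {}
        };
    }

    public static void main(String[] args) {
        Node node = new Node(simple("standby", false));
        node.transitionTo(simple("active", true));
        System.out.println(node.current().name()); // prints "failed"
    }
}
```

The first option (shutdown) would replace the assignment to FAILED in the catch block with
terminating the process, which is closer to what happens when startup fails.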


bq.  On 2011-10-03 18:58:13, Todd Lipcon wrote:
bq.  > branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java,
line 464
bq.  > <https://reviews.apache.org/r/2150/diff/1/?file=47529#file47529line464>
bq.  >
bq.  >     any reason that you switched the order of startHttpServer to the end of this
function? I don't think it's a big deal, but there's some possibility the service plugins
may want to do something with the http server, which wouldn't be started yet.

No particular reason. I am not sure who uses ServicePlugins, but the description says it is RPC
related. I will move it back up.


- Suresh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2150/#review2277
-----------------------------------------------------------


On 2011-10-03 18:36:41, Todd Lipcon wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2150/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-03 18:36:41)
bq.  
bq.  
bq.  Review request for hadoop-hdfs and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Uploading Suresh's patch to reviewboard (https://issues.apache.org/jira/secure/attachment/12496953/HDFS-2301.txt
from 29/Sep/11 00:56)
bq.  
bq.  
bq.  This addresses bug HDFS-2301.
bq.      https://issues.apache.org/jira/browse/HDFS-2301
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
1177130 
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
1177130 
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
1177130 
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ActiveState.java
1177128 
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/HAContext.java
PRE-CREATION 
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/HAState.java
1177128 
bq.    branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyState.java
1177128 
bq.  
bq.  Diff: https://reviews.apache.org/r/2150/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Todd
bq.  
bq.


                
> Start/stop appropriate namenode internal services during transition to active and standby
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-2301
>                 URL: https://issues.apache.org/jira/browse/HDFS-2301
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: HDFS-2301.txt, HDFS-2301.txt, HDFS-2301.txt, HDFS-2301.txt
>
>
> These changes are related to HDFS-1974, which introduced active and standby states. This
jira will address starting and stopping appropriate NN services when entering and exiting
active and standby states.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
