hadoop-hdfs-issues mailing list archives

From "Vinay (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3908) In HA mode, when a ledger in BK that was generated after the last checkpoint is missing, the NN cannot restore it.
Date Wed, 12 Sep 2012 06:21:09 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453764#comment-13453764 ]

Vinay commented on HDFS-3908:
-----------------------------

Hi Xiao,

Here is my understanding of the problem:
1. In the non-HA case, if edit logs are missing from one of the name directories, the
NameNode will still start by loading them from the other available directories.
2. In the HA case, if edit logs are missing from the shared storage but the same edits are
available in the local directories, why can't the NN load them from there and start? Right?


As of now, this is not supported, because the NN cannot be sure that all the edit logs are
available in the local directories. That is why the NameNode HA design requires the shared
storage to be HIGHLY AVAILABLE (I mean, without any data loss).

So if even one edit log segment is missing from the shared storage, the NN will not select
any further edit logs from it for reading.
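
To make that rule concrete, here is a minimal Java sketch of the "stop at the first gap" selection.
This is not the actual JournalSet/FSEditLog code; the class and method names are illustrative only.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: select edit log segments from a single shared journal,
// stopping at the first gap in the txid sequence.
class EditSegment {
    final long firstTxId;
    final long lastTxId;   // inclusive txid range covered by one segment
    EditSegment(long firstTxId, long lastTxId) {
        this.firstTxId = firstTxId;
        this.lastTxId = lastTxId;
    }
}

class SharedJournalSelector {
    // Segments must be sorted by firstTxId. Selection keeps going only while the
    // segments are contiguous; once a segment is missing, everything after it is dropped.
    static List<EditSegment> selectContiguousFrom(long fromTxId, List<EditSegment> sorted) {
        List<EditSegment> selected = new ArrayList<>();
        long expected = fromTxId;
        for (EditSegment s : sorted) {
            if (s.lastTxId < expected) {
                continue;                  // already covered by the fsimage, skip it
            }
            if (s.firstTxId > expected) {
                break;                     // gap: no further segments are selected
            }
            selected.add(s);
            expected = s.lastTxId + 1;     // the next segment must start exactly here
        }
        return selected;
    }
}
{code}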

*Why does non-HA support this scenario?*
   Local edits can be kept in multiple directories, so corruption of one directory is not a
problem: the edits will simply be selected and read from an alternative directory.
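
For contrast, here is a tiny hypothetical example (not HDFS internals) of why redundant local
directories tolerate a hole in one of them: the union of the segments found across the
directories is still contiguous.

{code:java}
import java.util.TreeMap;

// Hypothetical illustration: two configured local edits directories, one of them
// missing a segment; the replica in the other directory fills the hole.
class LocalDirsRedundancy {
    public static void main(String[] args) {
        TreeMap<Long, Long> dir1 = new TreeMap<>();   // firstTxId -> lastTxId
        dir1.put(1L, 10L);
        dir1.put(21L, 30L);                           // segment 11-20 is missing/corrupt here

        TreeMap<Long, Long> dir2 = new TreeMap<>();
        dir2.put(1L, 10L);
        dir2.put(11L, 20L);
        dir2.put(21L, 30L);

        TreeMap<Long, Long> union = new TreeMap<>(dir1);
        union.putAll(dir2);                           // dir2's copy fills the gap in dir1
        System.out.println("Usable segments: " + union);   // {1=10, 11=20, 21=30}
    }
}
{code}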

*Why doesn't HA support it?*
  The shared storage is the common storage for both the ACTIVE and STANDBY NameNodes, and the
local edits present on the ACTIVE will differ from the STANDBY's local edits. So the shared
storage should always contain all the required edit logs. While in STANDBY state, the NN loads
only from the shared storage, because its local edits are incomplete. During the switch-over
phase, the NN will try to load from both the local and the shared storage.
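
Roughly, the sources consulted in each state look like this (again a hypothetical sketch, not the
real EditLogTailer/FSImage code):

{code:java}
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of which journals the NN consults in each state, as described above.
class JournalSources {
    enum NNState { STANDBY, BECOMING_ACTIVE }

    static List<String> journalsToRead(NNState state) {
        if (state == NNState.STANDBY) {
            // The standby tails only the shared storage; its local dirs are incomplete.
            return Arrays.asList("shared storage (BookKeeper)");
        }
        // During the switch-over the NN tries both the local and the shared edits.
        return Arrays.asList("local edits dirs", "shared storage (BookKeeper)");
    }

    public static void main(String[] args) {
        System.out.println("STANDBY reads:         " + journalsToRead(NNState.STANDBY));
        System.out.println("BECOMING_ACTIVE reads: " + journalsToRead(NNState.BECOMING_ACTIVE));
    }
}
{code}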

In your case, since the shared storage (BookKeeper) is missing the first edit log the NN is
looking for, no further edit logs from BookKeeper are selected for reading, and the local
edits also have a gap, so the NN got shut down.
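
To tie this to the numbers in the report below, here is a small worked example (hypothetical code;
only the txids are taken from the report) showing how the combined stream ends up with exactly the
gap in the exception, "expected txid 5949, but got txid 5950":

{code:java}
import java.util.Map;
import java.util.TreeMap;

// Hypothetical walk-through of the combined edit stream in this report.
class Hdfs3908Gap {
    public static void main(String[] args) {
        // From BookKeeper: because 5947-5948 is missing there, selection stops at 5946,
        // so the 5949-5949 ledger is never offered for reading.
        TreeMap<Long, Long> shared = new TreeMap<>();
        shared.put(5945L, 5946L);

        // From the local edits dir: 5947-5948 and 5950-5951 exist, but 5949 does not.
        TreeMap<Long, Long> local = new TreeMap<>();
        local.put(5947L, 5948L);
        local.put(5950L, 5950L);
        local.put(5951L, 5951L);

        TreeMap<Long, Long> combined = new TreeMap<>(shared);
        combined.putAll(local);

        long expected = 5947L;                        // the fsimage already covers up to 5946
        for (Map.Entry<Long, Long> e : combined.tailMap(expected).entrySet()) {
            if (e.getKey() > expected) {
                System.out.println("There appears to be a gap in the edit log. "
                        + "We expected txid " + expected + ", but got txid " + e.getKey());
                return;
            }
            expected = e.getValue() + 1;
        }
        System.out.println("No gap; edits loaded up to txid " + (expected - 1));
    }
}
{code}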
                
> In HA mode, when a ledger in BK that was generated after the last checkpoint is missing, the NN cannot restore it.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3908
>                 URL: https://issues.apache.org/jira/browse/HDFS-3908
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.1-alpha
>            Reporter: Han Xiao
>
> In the non-HA case, when the number of edits directories is larger than 1, a missing edit log file in one directory will not cause a problem, because of the replica in the other directory.
> However, in HA mode (using BK as shared storage), if a ledger is missing, it will not be restored during the NN start-up phase even if the related edit log file exists in the local dir.
> The gap remains while the NN is still in the standby state. However, when the NN enters the active state, it reads the edit log file (related to the missing ledger) from the local dir. But, unfortunately, the ledger after the missing one in BK cannot be read at that phase (because of the gap).
> Therefore, in the following situation, the edit logs cannot be restored even though every edit log file exists either in BK or in the local dir:
> 1. fsimage file: fsimage_0000000000000005946.md5
> 2. ledgers in ZK:
> 	[zk: localhost:2181(CONNECTED) 0] ls /hdfsEdit/ledgers/edits_00000000000000594
> 	edits_000000000000005941_000000000000005942
> 	edits_000000000000005943_000000000000005944
> 	edits_000000000000005945_000000000000005946
> 	edits_000000000000005949_000000000000005949
> (edits_000000000000005947_000000000000005948 is missing)
> 3. edit logs in the local edit log dir:
> 	-rw-r--r-- 1 root root      30 Sep  8 03:24 edits_0000000000000005947-0000000000000005948
> 	-rw-r--r-- 1 root root 1048576 Sep  8 03:35 edits_0000000000000005950-0000000000000005950
> 	-rw-r--r-- 1 root root 1048576 Sep  8 04:42 edits_0000000000000005951-0000000000000005951
> 	(edits_0000000000000005949-0000000000000005949 is missing)
> 4. and the seen_txid:
> 	vm2:/tmp/hadoop-root/dfs/name/current # cat seen_txid
> 	5949
> Here, we want to restore the edit logs from txid 5946 (fsimage) up to txid 5949 (seen_txid). Segment 5947-5948 is missing in BK, and 5949-5949 is missing in the local dir.
> When the NN is started, the following exception is thrown:
> 2012-09-08 06:26:10,031 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN shutdown. Shutting down immediately.
> java.io.IOException: There appears to be a gap in the edit log.  We expected txid 5949, but got txid 5950.
>         at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:163)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:692)
>         at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:223)
>         at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.catchupDuringFailover(EditLogTailer.java:182)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:599)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1325)
>         at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>         at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
>         at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1233)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:990)
>         at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
>         at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:3633)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:924)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
> 2012-09-08 06:26:10,036 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at vm2/160.161.0.155
> ************************************************************/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
