hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From cho ju il <tjst...@kgrid.co.kr>
Subject namenod shutdown. epoch number mismatch
Date Fri, 29 Aug 2014 06:26:50 GMT
 
hadoop version 2.4.1
Namenode shutdown to become epoch number mismatch. 
 
Why suddenly epoch numbers mismatch ?  
Why suddenly namenode shutdown ? 
 
 
*** namenode log 
 
2014-08-26 12:17:48,625 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Rescanning after 30000 milliseconds
2014-08-26 12:17:48,646 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Scanned 0 directive(s) and 0 block(s) in 21 millisecond(s).
2014-08-26 12:18:03,599 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit
Log from 192.10.1.209
2014-08-26 12:18:03,599 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit
logs
2014-08-26 12:18:03,599 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log
segment 22795096
2014-08-26 12:18:03,633 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions:
81 Total time for transactions(ms): 10 Number of transactions batched in Syncs: 0 Number of
syncs: 14 SyncTimes(ms): 159 197
2014-08-26 12:18:03,675 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing
edits file /data/dfs/name/current/edits_inprogress_0000000000022795096 -&gt; /data/dfs/name/current/edits_0000000000022795096-0000000000022795176
2014-08-26 12:18:03,675 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log
segment at 22795177
2014-08-26 12:18:05,419 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.10.1.211:40010
is added to blk_1076515119_2774480 size 46226115
2014-08-26 12:18:05,420 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: BLOCK*
processOverReplicatedBlock: Postponing processing of over-replicated blk_1076515119_2774480
since storage + [DISK]DS-405679644-192.10.1.201-40010-1401163823536:NORMALdatanode 192.10.1.201:40010
does not yet have up-to-date block information.
2014-08-26 12:18:18,625 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Rescanning after 30000 milliseconds
2014-08-26 12:18:18,642 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Scanned 0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:18:48,625 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Rescanning after 30000 milliseconds
2014-08-26 12:18:48,642 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Scanned 0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:19:18,625 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Rescanning after 30000 milliseconds
2014-08-26 12:19:18,642 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Scanned 0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:19:48,625 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Rescanning after 30000 milliseconds
2014-08-26 12:19:48,642 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Scanned 0 directive(s) and 0 block(s) in 17 millisecond(s).
2014-08-26 12:20:04,506 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit
Log from 192.10.1.209
2014-08-26 12:20:04,506 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit
logs
2014-08-26 12:20:04,506 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log
segment 22795177
2014-08-26 12:20:04,507 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions:
53 Total time for transactions(ms): 6 Number of transactions batched in Syncs: 0 Number of
syncs: 1 SyncTimes(ms): 20 67
2014-08-26 12:20:04,544 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions:
53 Total time for transactions(ms): 6 Number of transactions batched in Syncs: 0 Number of
syncs: 2 SyncTimes(ms): 48 76
2014-08-26 12:20:04,587 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing
edits file /data/dfs/name/current/edits_inprogress_0000000000022795177 -&gt; /data/dfs/name/current/edits_0000000000022795177-0000000000022795229
2014-08-26 12:20:04,587 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log
segment at 22795230
2014-08-26 12:20:18,625 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor:
Rescanning after 30000 milliseconds
2014-08-26 12:20:21,771 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping
services started for active state
2014-08-26 12:20:59,008 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log
segment 22795230
2014-08-26 12:20:59,009 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNodeEditLogRoller
was interrupted, exiting
2014-08-26 12:20:59,014 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager:
Remote journal 192.10.1.209:8485 failed to write txns 22795231-22795235. Will try to write
to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 115 is less than the
last promised epoch 116
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:414)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:442)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:342)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 
        at org.apache.hadoop.ipc.Client.call(Client.java:1410)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at $Proxy12.journal(Unknown Source)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolTranslatorPB.journal(QJournalProtocolTranslatorPB.java:167)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:357)
        at org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$7.call(IPCLoggerChannel.java:350)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
2014-08-26 12:20:59,017 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush
failed for required journal (JournalAndStream(mgr=QJM to [10.1.1.107:8485, 192.10.1.208:8485,
192.10.1.209:8485], stream=QuorumOutputStream starting at txid 22795230))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve
quorum size 2/3. 3 exceptions thrown:
192.10.1.208:8485: IPC's epoch 115 is less than the last promised epoch 116
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:414)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:442)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:342)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:148)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 
 
 
 
2014-08-26 12:20:59,018 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager:
Aborting QuorumOutputStream starting at txid 22795230
2014-08-26 12:20:59,020 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-08-26 12:20:59,023 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at namenode194/192.10.1.208
************************************************************/
2014-08-26 12:25:43,591 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
 

Mime
View raw message