Return-Path:
X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id DE181106B5
    for ; Wed, 2 Oct 2013 23:52:43 +0000 (UTC)
Received: (qmail 40714 invoked by uid 500); 2 Oct 2013 23:52:42 -0000
Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org
Received: (qmail 40669 invoked by uid 500); 2 Oct 2013 23:52:42 -0000
Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: hdfs-issues@hadoop.apache.org
Delivered-To: mailing list hdfs-issues@hadoop.apache.org
Received: (qmail 40619 invoked by uid 99); 2 Oct 2013 23:52:42 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 Oct 2013 23:52:42 +0000
Date: Wed, 2 Oct 2013 23:52:42 +0000 (UTC)
From: "Arpit Gupta (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (HDFS-5291) Standby namenode after transition to active goes into safemode
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/HDFS-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Gupta updated HDFS-5291:
------------------------------
    Attachment: nn.log

> Standby namenode after transition to active goes into safemode
> --------------------------------------------------------------
>
>                 Key: HDFS-5291
>                 URL: https://issues.apache.org/jira/browse/HDFS-5291
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.1.1-beta
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>            Priority: Critical
>         Attachments: nn.log
>
>
> Some log snippets
> standby state to active transition
> {code}
> 2013-10-02 00:13:49,482 INFO ipc.Server (Server.java:run(2068)) - IPC Server handler 69 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from IP:33911 Call#1483 Retry#1: error: org.apache.hadoop.ipc.StandbyException: Operation category WRITE is not supported in state standby
> 2013-10-02 00:13:49,689 INFO ipc.Server (Server.java:saslProcess(1342)) - Auth successful for nn/hostname@EXAMPLE.COM (auth:SIMPLE)
> 2013-10-02 00:13:49,696 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful for nn/hostname@EXAMPLE.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.ha.HAServiceProtocol
> 2013-10-02 00:13:49,700 INFO namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(1013)) - Stopping services started for standby state
> 2013-10-02 00:13:49,701 WARN ha.EditLogTailer (EditLogTailer.java:doWork(336)) - Edit log tailer interrupted
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:356)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1463)
>     at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:454)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTail
> 2013-10-02 00:13:49,704 INFO namenode.FSNamesystem (FSNamesystem.java:startActiveServices(885)) - Starting services required for active state
> 2013-10-02 00:13:49,719 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(419)) - Starting recovery process for unclosed journal segments...
> 2013-10-02 00:13:49,755 INFO ipc.Server (Server.java:saslProcess(1342)) - Auth successful for hbase/hostname@EXAMPLE.COM (auth:SIMPLE)
> 2013-10-02 00:13:49,761 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful for hbase/hostname@EXAMPLE.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
> 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(421)) - Successfully started new epoch 85
> 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(249)) - Beginning recovery of unclosed segment starting at txid 887112
> 2013-10-02 00:13:49,874 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(258)) - Recovery prepare phase complete. Responses:
> IP:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530
> 172.18.145.97:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530
> 2013-10-02 00:13:49,875 INFO client.QuorumJournalManager (QuorumJournalManager.java:recover
> {code}
> And then we get into safemode
> {code}
> Construction[IP:1019|RBW]]} size 0
> 2013-10-02 00:13:50,277 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: IP:1019 is added to blk_IP157{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[IP:1019|RBW], ReplicaUnderConstruction[172.18.145.96:1019|RBW], ReplicaUnderConstruction[IP:1019|RBW]]} size 0
> 2013-10-02 00:13:50,279 INFO hdfs.StateChange (FSNamesystem.java:reportStatus(4703)) - STATE* Safe mode ON.
> The reported blocks 1071 needs additional 5 blocks to reach the threshold 1.0000 of total blocks 1075.
> Safe mode will be turned off automatically
> 2013-10-02 00:13:50,279 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: IP:1019 is added to blk_IP158{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.18.145.99:1019|RBW], ReplicaUnderConstruction[172.18.145.97:1019|RBW], ReplicaUnderConstruction[IP:1019|RBW]]} size 0
> 2013-10-02 00:13:50,280 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 172.18.145.99:1019 is added to blk_IP158{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.18.145.99:1019|RBW], ReplicaUnderConstruction[172.18.145.97:1019|RBW], ReplicaUnderConstruction[IP:1019|RBW]]} size 0
> 2013-10-02 00:13:50,281 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 172.18.145.97:1019 is added to blk_IP158{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.18.145.99:1019|RBW], ReplicaUnderConstruction[172.18.145.97:1019|RBW], ReplicaUnderConstruction[IP:1019|RBW]]} size 0
> {code}

--
This message was sent by Atlassian JIRA (v6.1#6144)