Delivered-To: mailing list hdfs-issues@hadoop.apache.org
Date: Fri, 16 Dec 2016 03:22:58 +0000 (UTC)
From: "Wei-Chiu Chuang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-11254) Standby NameNode may crash during failover if loading edits takes too long
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

     [ https://issues.apache.org/jira/browse/HDFS-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11254:
-----------------------------------
    Summary: Standby NameNode may crash during failover if loading edits takes too long  (was: Failover may fail if loading edits takes too long)

> Standby NameNode may crash during failover if loading edits takes too long
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11254
>                 URL: https://issues.apache.org/jira/browse/HDFS-11254
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>            Reporter: Wei-Chiu Chuang
>            Priority: Critical
>              Labels: high-availability
>             Fix For: 2.9.0, 3.0.0-beta1
>
>
> We found the Standby NameNode crashed when it tried to transition from standby to active. This issue is similar in nature to HDFS-11225.
> The root cause is that all IPC threads were blocked, so the ZKFC connection to the NN timed out.
> In particular, when it crashed, we saw a few threads blocked on this thread:
> {noformat}
> Thread 188 (IPC Server handler 25 on 8022):
>   State: RUNNABLE
>   Blocked count: 278
>   Waited count: 17419
>   Stack:
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:886)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuotaRecursively(FSImage.java:887)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.updateCountForQuota(FSImage.java:875)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:860)
>     org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:827)
>     org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:232)
>     org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:188)
>     org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$1.run(EditLogTailer.java:182)
>     java.security.AccessController.doPrivileged(Native Method)
>     javax.security.auth.Subject.doAs(Subject.java:415)
>     org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>     org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:477)
>     org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:458)
> {noformat}
> This thread is part of {{FSImage#loadEdits}} when the NameNode failed over.
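> For a rough sense of why this call is expensive, the sketch below (hypothetical types, not the actual HDFS code) shows the general shape of a recursive quota recount: a post-order walk that visits every inode once, so the time it takes, and the time the namesystem stays busy, grows linearly with the size of the directory tree.
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> // Illustrative stand-in for an inode tree; the real FSImage.updateCountForQuotaRecursively
> // walks the actual INode structures, but the recursion pattern is the same.
> final class Node {
>   long fileBytes;                        // bytes consumed if this node is a file, 0 for a directory
>   final List<Node> children = new ArrayList<>();
> }
>
> final class QuotaRecount {
>   // Post-order recursion: one stack frame per directory level, matching the
>   // repeated updateCountForQuotaRecursively frames in the jstack output above.
>   static long recount(Node node) {
>     long used = node.fileBytes;
>     for (Node child : node.children) {
>       used += recount(child);            // visit every descendant exactly once
>     }
>     return used;                         // total space to charge against the quota
>   }
> }
> {code}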
> We also found that the following edit log request was rejected after the JournalNodes had advanced their epoch, which implies a failed transitionToActive request.
> {noformat}
> 10.10.17.1:8485: IPC's epoch 11 is less than the last promised epoch 12
>         at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429)
>         at org.apache.hadoop.hdfs.qjournal.server.Journal.startLogSegment(Journal.java:513)
>         at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.startLogSegment(JournalNodeRpcServer.java:162)
>         at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.startLogSegment(QJournalProtocolServerSideTranslatorPB.java:198)
>         at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25425)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>         at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
>         at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
>         at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
>         at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.startLogSegment(QuorumJournalManager.java:408)
>         at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalAndStream.startLogSegment(JournalSet.java:107)
>         at org.apache.hadoop.hdfs.server.namenode.JournalSet$3.apply(JournalSet.java:222)
>         at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
>         at org.apache.hadoop.hdfs.server.namenode.JournalSet.startLogSegment(JournalSet.java:219)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.startLogSegment(FSEditLog.java:1206)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:316)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1265)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1767)
>         at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>         at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
>         at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1640)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1375)
>         at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
>         at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}
> We found that the threads waiting on the NameNodeRpcServer lock, which was held by the transitionToActive thread, had called {{NameNodeRpcServer#getServiceStatus}}. Is it possible to make this method unsynchronized? Furthermore, is it necessary for the other NameNodeRpcServer HA methods (monitorHealth, transitionToActive, transitionToStandby, getServiceStatus) to synchronize on this object? Is it possible to make {{FSImage.updateCountForQuotaRecursively}} faster?
> Although this issue resulted in a failed failover, I am setting the priority to critical instead of blocker, because a possible workaround is to extend the ZKFC socket timeout.
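> As a generic illustration of the getServiceStatus question (hypothetical class, not the actual NameNodeRpcServer code), a read-only status call can be served from a volatile field instead of a synchronized method, so it never queues behind a long-running transition:
> {code:java}
> // Hypothetical sketch of the locking pattern being suggested: the status query reads a
> // volatile snapshot without taking a lock, while state transitions stay synchronized.
> final class HaRpcServerSketch {
>   enum HAState { ACTIVE, STANDBY }
>
>   private volatile HAState state = HAState.STANDBY;
>
>   // Lock-free read: a health checker such as ZKFC gets an answer even while a
>   // transition is in progress, instead of blocking on the server lock.
>   HAState getServiceStatus() {
>     return state;
>   }
>
>   // Transitions still serialize against each other, but no longer block status reads.
>   synchronized void transitionToActive() throws InterruptedException {
>     Thread.sleep(10_000);                // stand-in for loading a large backlog of edits
>     state = HAState.ACTIVE;
>   }
> }
> {code}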
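> Regarding the workaround, the configuration keys below are, to my understanding, the ZKFC-side timeouts involved (please verify the key names and defaults against your Hadoop version); in practice they would be raised in core-site.xml on the ZKFC hosts, but a programmatic sketch shows the idea:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class ZkfcTimeoutSketch {
>   public static void main(String[] args) {
>     Configuration conf = new Configuration();
>     // Assumed keys: the RPC timeout used by the ZKFC health monitor, and the timeout the
>     // failover controller allows for the transitionToActive call on the new active NN.
>     conf.setInt("ha.health-monitor.rpc-timeout.ms", 180_000);
>     conf.setInt("ha.failover-controller.new-active.rpc-timeout.ms", 180_000);
>     System.out.println("health monitor RPC timeout = "
>         + conf.getInt("ha.health-monitor.rpc-timeout.ms", 45_000) + " ms");
>   }
> }
> {code}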
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org