Date: Fri, 17 Jun 2016 20:14:05 +0000 (UTC)
From: "Matteo Bertozzi (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-16056) Procedure v2 - fix master crash for FileNotFound

[ https://issues.apache.org/jira/browse/HBASE-16056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-16056:
------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Procedure v2 - fix master crash for FileNotFound
> ------------------------------------------------
>
> Key: HBASE-16056
> URL: https://issues.apache.org/jira/browse/HBASE-16056
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.5
> Reporter: Matteo Bertozzi
> Assignee: Matteo Bertozzi
> Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6
>
> Attachments: HBASE-16056-v0.patch, HBASE-16056-v1.patch, HBASE-16056-v2.patch
>
>
> [~syuanjiang] and [~tedyu] reported that a backup master was unable to start because of a FileNotFoundException during proc-v2 lease recovery. (Another restart should have solved the problem.)
> {noformat}
> FileNotFoundException: File does not exist: /hbase/MasterProcWALs/state-000001.log
> namenode.INodeFile.valueOf(INodeFile.java:61)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2877)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:753)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:671)
> {noformat}
> This may happen when one master is still active (e.g. it was stalled in GC) and tries to remove files while the backup master tries to become active. The operation is retryable, so the code should be able to handle that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
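
The retry idea in the description can be illustrated with a small sketch. This is not the attached patch: the class LeaseRecoverySketch, the interfaces LogLister and LeaseRecoverer, and the maxAttempts parameter are hypothetical names made up for the example, and the file system is simulated in memory. It only assumes that a FileNotFoundException during lease recovery means the log was removed concurrently by the other master, so the log list is read again and the recovery is retried instead of aborting startup.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeaseRecoverySketch {
    /** Hypothetical stand-in for listing the procedure WAL directory. */
    interface LogLister {
        List<String> listLogs() throws IOException;
    }

    /** Hypothetical stand-in for the file system call that recovers a lease. */
    interface LeaseRecoverer {
        void recoverLease(String path) throws IOException;
    }

    /**
     * Recovers the lease of every listed log. If a log disappears between
     * listing and recovery (FileNotFoundException), the directory is listed
     * again and the whole pass is retried, up to maxAttempts passes.
     * Returns the logs whose leases were recovered in the successful pass.
     */
    static List<String> recoverAllLeases(LogLister lister, LeaseRecoverer recoverer,
                                         int maxAttempts) throws IOException {
        FileNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<String> logs = lister.listLogs();
            try {
                for (String log : logs) {
                    recoverer.recoverLease(log);
                }
                return logs;
            } catch (FileNotFoundException e) {
                // Assumption: the file was deleted by the other master; retry.
                last = e;
            }
        }
        throw new IOException("lease recovery failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        // In-memory simulation: the first log vanishes during the first pass.
        List<String> files = new ArrayList<>(Arrays.asList("state-000001.log", "state-000002.log"));
        List<String> recovered = recoverAllLeases(
            () -> new ArrayList<>(files),
            path -> {
                if (path.equals("state-000001.log") && files.remove(path)) {
                    throw new FileNotFoundException("File does not exist: " + path);
                }
            },
            3);
        System.out.println("recovered: " + recovered);
    }
}
```

Bounding the number of passes keeps a persistent failure from looping forever, while a single concurrent delete no longer prevents the backup master from starting.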