Date: Fri, 3 Mar 2017 14:44:46 +0000 (UTC)
From: "Duo Zhang (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

     [ https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-17712:
------------------------------
    Attachment: HBASE-17712-ut.patch

A UT to confirm that a compaction on a dead RS can never succeed.

> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -----------------------------------------------------------------
>
>                 Key: HBASE-17712
>                 URL: https://issues.apache.org/jira/browse/HBASE-17712
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 2.0.0, 1.4.0
>            Reporter: Duo Zhang
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-17712-ut.patch
>
>
> This logic was introduced in HBASE-13651 and became much more complicated after HBASE-16304 because of a deadlock issue. It is really tricky because sequence ids are involved, and the method we call was originally written to serve secondary replicas, which do not handle writes.
> In fact, in the 1.x releases, the problem described in HBASE-13651 is gone. We now write a compaction marker to the WAL before deleting the compacted files.
> We can only consider an RS dead after all of its WAL files are closed, so if the region has already been reassigned, the compaction will fail because we cannot write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a critical bug, which means we may lose data. I do not think it is a good idea to just swallow the exception and refresh the store files. Or, even if we want to do this, we can just refresh the store files without dropping the memstore contents. This will also simplify the logic a lot.
> Suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
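
The ordering argument in the issue description (write the compaction marker to the WAL first, delete the compacted files only afterwards) can be illustrated with a toy model. This is a minimal sketch, not HBase code: `ToyWal`, `ToyStore`, `WalClosedError` and the file names are all hypothetical names made up for illustration, and "closing the WAL" stands in for the RS being declared dead.

```python
class WalClosedError(Exception):
    """Raised when a write is attempted on a closed toy WAL (hypothetical)."""


class ToyWal:
    """Toy write-ahead log: appends succeed only while the log is open."""

    def __init__(self):
        self.closed = False
        self.entries = []

    def append(self, entry):
        if self.closed:
            raise WalClosedError("WAL is closed")
        self.entries.append(entry)


class ToyStore:
    """Toy store holding a set of file names, backed by a ToyWal."""

    def __init__(self, wal, files):
        self.wal = wal
        self.files = set(files)

    def compact(self, inputs, output):
        # Step 1: write the compaction marker. If the WAL is closed this
        # raises, so we never reach the deletion below.
        self.wal.append(("COMPACTION", tuple(sorted(inputs)), output))
        # Step 2: only after the marker is written, replace the inputs
        # with the compacted output.
        self.files -= set(inputs)
        self.files.add(output)


def compaction_survives_closed_wal():
    """Return True if a compaction on a 'dead' store leaves its files intact."""
    wal = ToyWal()
    store = ToyStore(wal, ["f1", "f2"])
    wal.closed = True  # model: the RS is considered dead, its WAL is closed
    try:
        store.compact(["f1", "f2"], "f3")
    except WalClosedError:
        return store.files == {"f1", "f2"}
    return False
```

In this model a compaction against a closed WAL fails before any file is removed, which is the property the attached UT is described as confirming for the real code. Under that assumption, a reader that still finds a file missing would be seeing a genuine bug rather than a stale compaction.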