Date: Thu, 23 May 2013 22:15:21 +0000 (UTC)
From: "Sergey Shelukhin (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-8597) compaction record (probably) can block WAL cleanup forever if region is closed without edits

     [ https://issues.apache.org/jira/browse/HBASE-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-8597:
------------------------------------

    Attachment: HBASE-8597-v1.patch

The test failure was due to HDFS caching the FS object somewhere in the bowels of HDFS, and FileSystem.closeAll doesn't seem to help.
Adding the ability to create a minicluster with a custom number of datanodes; none of these tests really use 2 servers.

> compaction record (probably) can block WAL cleanup forever if region is closed without edits
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8597
>                 URL: https://issues.apache.org/jira/browse/HBASE-8597
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.95.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>             Fix For: 0.95.1
>
>         Attachments: HBASE-8597-v0.patch, HBASE-8597-v1.patch
>
>
> A region is opened by a server, a major compaction is performed, that triggers a split, and the region is closed and split. There's no indication of a memstore flush for this region.
> After that, LogRoller repeatedly tries to request a flush of this region due to the large number of HLogs, but fails to flush it for hours because the region is not in the online regions.
> It seems that what's happening is that when we append entries to the WAL, we add the first entry after we flush/open some region to the "earliest unflushed seqNums per region" map in FSHLog. However, compaction now adds a compaction record to the WAL, which also affects this map. If the compaction record is the first entry for this region to go into some WAL, and there are no writes to the region after that, there will be no memstore flush and the entry will never be removed.
> In fact, "flushing" for a compaction record doesn't make sense, as there's no preservation of the record outside the WAL; so, we probably should not add it to the "latest unflushed" map.
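
The bookkeeping described in the issue can be illustrated with a toy model. This is only a sketch and not HBase code: the class `ToyWal`, its methods, and the `trackCompactionRecords` switch are all hypothetical names invented for illustration, and the real FSHLog logic is considerably more involved. The sketch assumes a log file can be archived only when no region has an unflushed entry at or below the file's highest sequence number.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-region "earliest unflushed seqNum" tracking. Hypothetical; not the FSHLog API. */
class ToyWal {
    // Region name -> sequence number of the first entry appended since the last flush.
    private final Map<String, Long> earliestUnflushedSeqNum = new HashMap<>();
    // Assumption for illustration: true models the reported behavior, false the proposed change.
    private final boolean trackCompactionRecords;
    private long nextSeqNum = 1;

    ToyWal(boolean trackCompactionRecords) {
        this.trackCompactionRecords = trackCompactionRecords;
    }

    /** Appends an entry and returns its sequence number. */
    long append(String region, boolean isCompactionRecord) {
        long seq = nextSeqNum++;
        if (!isCompactionRecord || trackCompactionRecords) {
            // Only the first entry since the last flush is recorded.
            earliestUnflushedSeqNum.putIfAbsent(region, seq);
        }
        return seq;
    }

    /** A memstore flush is the only event that clears a region's entry. */
    void flush(String region) {
        earliestUnflushedSeqNum.remove(region);
    }

    /** A log file is archivable only if no region has an unflushed entry at or below its highest seqNum. */
    boolean canArchive(long maxSeqNumInFile) {
        for (long seq : earliestUnflushedSeqNum.values()) {
            if (seq <= maxSeqNumInFile) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Region receives only a compaction record, then is closed without any flush.
        ToyWal reported = new ToyWal(true);
        long seq = reported.append("region-a", true);
        System.out.println("tracking compaction records, can archive: " + reported.canArchive(seq));

        ToyWal proposed = new ToyWal(false);
        seq = proposed.append("region-a", true);
        System.out.println("ignoring compaction records, can archive: " + proposed.canArchive(seq));
    }
}
```

In this model, a region whose only entry is a compaction record pins the log file until a flush that never comes, whereas skipping compaction records when updating the map leaves the file archivable while ordinary edits still block cleanup until flushed.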