Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Thu, 23 May 2013 22:15:21 +0000 (UTC)
From: "Sergey Shelukhin (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12648988.1369263906832.14190.1369347321047@arcas>
In-Reply-To: <JIRA.12648988.1369263906832@arcas>
References: <JIRA.12648988.1369263906832@arcas>
Subject: [jira] [Updated] (HBASE-8597) compaction record (probably) can
 block WAL cleanup forever if region is closed without edits
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HBASE-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-8597:
------------------------------------

    Attachment: HBASE-8597-v1.patch

The test failure was due to HDFS caching the FS object somewhere in the bowels of HDFS; FileSystem.closeAll doesn't seem to help.
Adding the ability to create a minicluster with a custom number of datanodes, since none of these tests really use 2 servers.
                
> compaction record (probably) can block WAL cleanup forever if region is closed without edits
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8597
>                 URL: https://issues.apache.org/jira/browse/HBASE-8597
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.95.0
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>             Fix For: 0.95.1
>
>         Attachments: HBASE-8597-v0.patch, HBASE-8597-v1.patch
>
>
> A region is opened by a server, a major compaction is performed that triggers a split, and the region is closed and split. There's no indication of a memstore flush for this region.
> After that, LogRoller repeatedly tries to request a flush of this region due to the large number of HLogs, but fails to flush it for hours because the region is not among the online regions.
> It seems that what's happening is this: when we append entries to the WAL, we add the first entry appended after a region is flushed/opened to the "earliest unflushed seqNums per region" map in FSHLog. However, compaction now adds a compaction record to the WAL, which also affects this map. If the compaction record is the first entry for this region to go into some WAL, and there are no writes to the region after that, there will be no memstore flush and the entry will never be removed.
> In fact, "flushing" a compaction record doesn't make sense, since the record is not preserved anywhere outside the WAL; so we probably should not add it to the "earliest unflushed" map.
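
The mechanism described above can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the actual FSHLog code or the attached patch: the class WalSeqNumSketch, its methods, and the trackCompactionRecords switch are all invented for illustration. It assumes that a WAL file may be cleaned up only when no region has an "earliest unflushed" sequence number at or below the file's highest sequence number, and that only a memstore flush removes a region's entry from the map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of per-region "earliest unflushed seqNum" tracking. */
class WalSeqNumSketch {
    // region name -> sequence number of the first entry appended since the last flush/open
    private final Map<String, Long> earliestUnflushed = new HashMap<>();
    // true models the reported behavior; false models skipping compaction records
    private final boolean trackCompactionRecords;
    private long nextSeqNum = 1;

    WalSeqNumSketch(boolean trackCompactionRecords) {
        this.trackCompactionRecords = trackCompactionRecords;
    }

    /** Appends an entry and returns its sequence number. */
    long append(String region, boolean isCompactionRecord) {
        long seqNum = nextSeqNum++;
        if (!isCompactionRecord || trackCompactionRecords) {
            // Only the first entry after a flush/open is recorded for the region.
            earliestUnflushed.putIfAbsent(region, seqNum);
        }
        return seqNum;
    }

    /** A completed memstore flush is the only thing that clears a region's entry. */
    void completeFlush(String region) {
        earliestUnflushed.remove(region);
    }

    /** A WAL file can be cleaned up only if no region still has unflushed entries in it. */
    boolean canCleanUp(long maxSeqNumInFile) {
        for (long seqNum : earliestUnflushed.values()) {
            if (seqNum <= maxSeqNumInFile) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        WalSeqNumSketch wal = new WalSeqNumSketch(true);
        long seq = wal.append("regionA", true); // compaction record is the only entry
        // regionA is then closed and split with an empty memstore, so no flush ever happens.
        System.out.println("cleanup possible: " + wal.canCleanUp(seq));
    }
}
```

With trackCompactionRecords set to true, a region whose only WAL entry is a compaction record keeps its map entry after it is closed, so canCleanUp never returns true for that file. With the flag set to false, the compaction record leaves the map untouched and cleanup is not blocked, which is the behavior the last paragraph of the description suggests.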
