hbase-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split
Date Mon, 19 Aug 2013 04:56:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13743522#comment-13743522 ]

Jerry He commented on HBASE-8760:

Hi, Matteo

Thank you for the time and effort you spent on this JIRA!  There has been more complexity
and more problems than anticipated.

I applied HBASE-9207, HBASE-9233, and then the HBASE-8760-0.94-v8.patch on my 0.94 cluster.

I ran through the test steps outlined in my previous comment a few times, sometimes with
minor changes to the steps.

There is one more issue. (Hopefully this is the last one!)
We should not include the offline regions' ServerName in the online snapshot procedure. Otherwise,
the snapshot procedure will time out while waiting for the obsolete ServerName if the ServerName
has changed, e.g. after a restart.
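
To illustrate the idea only (this is not the code in the addendum patch), here is a minimal
sketch of skipping offline regions when collecting the servers that must take part in the
online snapshot. RegionEntry and serversForOnlineSnapshot are hypothetical names made up for
this example, not HBase APIs, and the server names are made-up values in the usual
host,port,startcode form.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SnapshotServerFilter {
    // Hypothetical stand-in for a region's catalog entry; not the real HRegionInfo.
    public static final class RegionEntry {
        public final String regionName;
        public final boolean offline;    // e.g. a split parent that is no longer served
        public final String serverName;  // may be stale if the region is offline

        public RegionEntry(String regionName, boolean offline, String serverName) {
            this.regionName = regionName;
            this.offline = offline;
            this.serverName = serverName;
        }
    }

    // Collect only the servers hosting online regions. An offline region may still
    // carry an old ServerName (for example, the start code from before a restart),
    // and nobody would ever answer for that name.
    public static Set<String> serversForOnlineSnapshot(List<RegionEntry> regions) {
        Set<String> servers = new TreeSet<String>();
        for (RegionEntry region : regions) {
            if (region.offline || region.serverName == null) {
                continue; // skip: not served by any live region server
            }
            servers.add(region.serverName);
        }
        return servers;
    }

    public static void main(String[] args) {
        List<RegionEntry> regions = Arrays.asList(
            // Split parent, offline, still pointing at the pre-restart start code.
            new RegionEntry("parent", true, "rs1.example.com,60020,1376800000000"),
            new RegionEntry("daughterA", false, "rs1.example.com,60020,1376890000000"),
            new RegionEntry("daughterB", false, "rs2.example.com,60020,1376890000001"));
        System.out.println(serversForOnlineSnapshot(regions));
    }
}
```

With the offline parent filtered out, the procedure would only wait on the two servers that
are actually hosting the daughter regions.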

I attached a 0.94-v8-addendum patch. It applies on top of HBASE-8760-0.94-v8.patch.

After this, I have not seen any failures or exceptions during the testing.
The row counts always match. The logs are clean, without errors or exceptions.
> possible loss of data in snapshot taken after region split
> ----------------------------------------------------------
>                 Key: HBASE-8760
>                 URL: https://issues.apache.org/jira/browse/HBASE-8760
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 0.94.8, 0.95.1
>            Reporter: Jerry He
>             Fix For: 0.98.0, 0.94.12, 0.96.0
>         Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, HBASE-8760-0.94-v4.patch,
HBASE-8760-0.94-v5.patch, HBASE-8760-0.94-v6.patch, HBASE-8760-0.94-v7.patch, HBASE-8760-0.94-v8-addendum.patch,
HBASE-8760-0.94-v8.patch, HBASE-8760-thz-v0.patch, HBASE-8760-trunk-v8.patch, HBASE-8760-v4.patch,
v4-patch-testing-0.94.zip, v4-patch-testing-0.95.2.zip
> Right after a region split but before the daughter regions are compacted, we have two
daughter regions containing Reference files to the parent hfiles.
> If we take a snapshot right at that moment, the snapshot will succeed, but it will only
contain the daughter Reference files. Since there is no hold on the parent hfiles, they will
be deleted by the HFile Cleaner soon after they are no longer needed by the daughter regions.
> At a minimum, we need to keep these parent hfiles from being deleted.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
