hbase-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8760) possible loss of data in snapshot taken after region split
Date Sun, 11 Aug 2013 23:22:48 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736468#comment-13736468 ]

Jerry He commented on HBASE-8760:
---------------------------------

Some exceptions were seen against 0.95.2 during step 14. This differs from 0.94, but it
could just be due to random timing.
Step 13 was ok.
Step 14 was successful as well, but there were errors in the logs:
{code}
2013-08-10 23:00:16,463 ERROR [RS_OPEN_REGION-hdtest009:60021-1] handler.OpenRegionHandler:
Failed open of region=TestTable_clone_clone,,1376197826879.c3ea5fba0fe4a49a9e93102d133b99fd.,
starting to roll back the global memstore size.
...
Caused by: java.io.IOException: java.io.FileNotFoundException: Unable to open link: org.apache.hadoop.hbase.io.HFileLink
locations=[hdfs://hdtest009.svl.ibm.com:9000/hbase95/.data/default/TestTable_clone/9d76f97c231b0ffa4f9ecbe73bfc2acd/info/9af07c31650045d28aa13d8b37251690,
hdfs://hdtest009.svl.ibm.com:9000/hbase95/.tmp/.data/default/TestTable_clone/9d76f97c231b0ffa4f9ecbe73bfc2acd/info/9af07c31650045d28aa13d8b37251690,
hdfs://hdtest009.svl.ibm.com:9000/hbase95/.archive/.data/default/TestTable_clone/9d76f97c231b0ffa4f9ecbe73bfc2acd/info/9af07c31650045d28aa13d8b37251690]
        at org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:448)
        at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:241)
        at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3122)
{code}
Based on the logs, the failed region was a parent region. The daughter regions were ok, so
the final row count was correct.
Again, if you need the relevant logs, I can send them to you or attach them here.
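
For readers unfamiliar with the error above: the exception lists three candidate locations for the same hfile (the table's data directory, a temporary directory, and the archive) and reports that none of them could be opened. The fallback idea can be sketched roughly as below. This is a toy model against the local filesystem, not HBase's HFileLink code; the function name `open_link` and the assumption that candidates are tried in the order listed are illustrative only.

```python
import os


def open_link(candidates):
    """Return the first candidate path that exists.

    Toy model only: 'candidates' stands in for the data, temp, and
    archive locations listed in the exception. The try-in-order
    behavior is an assumption made for illustration.
    """
    for path in candidates:
        if os.path.exists(path):
            return path
    # None of the locations holds the file: mirror the failure in the log.
    raise FileNotFoundError("Unable to open link: locations=%s" % list(candidates))
```

If the file has been moved to the archive, the lookup still succeeds; only when it is gone from every location does the open fail, which is what the parent region hit here.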
                
> possible loss of data in snapshot taken after region split
> ----------------------------------------------------------
>
>                 Key: HBASE-8760
>                 URL: https://issues.apache.org/jira/browse/HBASE-8760
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 0.94.8, 0.95.1
>            Reporter: Jerry He
>             Fix For: 0.98.0, 0.95.2, 0.94.12
>
>         Attachments: HBase-8760-0.94.8.patch, HBase-8760-0.94.8-v1.patch, HBASE-8760-0.94-v4.patch,
HBASE-8760-thz-v0.patch, HBASE-8760-thz-v1.patch, HBASE-8760-thz-v2.patch, HBASE-8760-thz-v3.patch,
HBASE-8760-v4.patch
>
>
> Right after a region split but before the daughter regions are compacted, we have two
daughter regions containing Reference files to the parent hfiles.
> If we take a snapshot at that moment, the snapshot will succeed, but it will only
contain the daughter Reference files. Since there is no hold on the parent hfiles, they will
be deleted by the HFile Cleaner soon after they are no longer needed by the daughter
regions.
> At a minimum, we need to keep these parent hfiles from being deleted.
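
The scenario described in the issue can be illustrated with a small toy model. It is not HBase's cleaner logic; the function `clean_parent_hfiles` and the `honor_snapshot_refs` flag are hypothetical names used only to show why references held solely by a snapshot need to count as a hold on the parent hfiles.

```python
def clean_parent_hfiles(parent_hfiles, daughter_refs, snapshot_refs,
                        honor_snapshot_refs):
    """Return the set of parent hfiles that survive a cleaner pass.

    Toy model: a parent hfile is kept only if something still refers to
    it. 'honor_snapshot_refs' (hypothetical) controls whether references
    recorded in a snapshot count as such a hold.
    """
    live = set(daughter_refs)
    if honor_snapshot_refs:
        live |= set(snapshot_refs)
    return {f for f in parent_hfiles if f in live}


# A split has just happened: the daughters reference the parent hfile.
parent_hfiles = {"parent-hfile-1"}
# A snapshot taken now records only the daughters' Reference files.
snapshot_refs = {"parent-hfile-1"}
# Later the daughters are compacted and drop their references.
daughter_refs = set()

# Without a hold from the snapshot, the parent hfile is cleaned up and
# the snapshot's reference is left dangling.
lost = clean_parent_hfiles(parent_hfiles, daughter_refs, snapshot_refs, False)
# With the hold, the parent hfile survives for the snapshot to use.
kept = clean_parent_hfiles(parent_hfiles, daughter_refs, snapshot_refs, True)
```

In the first call the result is empty, which corresponds to the data loss reported here; in the second the parent hfile is retained.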

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
