hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6586) Quarantine Corrupted HFiles
Date Wed, 15 Aug 2012 00:27:38 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434693#comment-13434693
] 

Jonathan Hsieh commented on HBASE-6586:
---------------------------------------

I basically agree with the dangerous default argument -- which is why I'm suggesting a mode
for it.  (Similar to upgrade mode in a hdfs nn).

Another alternate with what Andy suggested -- making it part of some hbck mode where all hfiles
are checked.  Here is it is definitely admin initiated.

Another suggestion was to recover data from truncated HFiles -- which is something to consider
(but likely won't come until we have an directed need for it).

WRT HDFS-3731 - I've done a few combos of running job that loads data, kill hbase/hdfs in
safe and unsafe ways, and then upgrade but still haven't been able to duplicate same HFile
error.  In these scenarios, I've had both log files and hfiles with block-being-written state
problems.    I believe Jimmy has dealt with the hlog problems with a dist log splitting fix.
 


                
> Quarantine Corrupted HFiles
> ---------------------------
>
>                 Key: HBASE-6586
>                 URL: https://issues.apache.org/jira/browse/HBASE-6586
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jonathan Hsieh
>
> We've encountered a few upgrades from 0.90 hbases + 20.2/1.x hdfs to 0.92 hbases + hdfs
2.x that get stuck.  I haven't been able to duplicate the problem in my dev environment but
we suspect this may be related to HDFS-3731.  On the HBase side, it seems reasonable to quarantine
what are most likely truncated hfiles, so that can could later be recovered.
> Here's an example of the exception we've encountered:
> {code}
> 2012-07-18 05:55:01,152 ERROR handler.OpenRegionHandler (OpenRegionHandler.java:openRegion(346))
- Failed open of region=user_mappings,080112102AA76EF98197605D341B9E6C5824D2BC|1001,1317824890618.eaed0e7abc6d27d28ff0e5a9b49c4c
> 0d. 
> java.io.IOException: java.lang.IllegalArgumentException: Invalid HFile version: 842220600
(expected to be between 1 and 2) 
> at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:306)

> at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:371) 
> at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:387) 
> at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1026)

> at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:485) 
> at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:566) 
> at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:286) 
> at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:223) 
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2534)

> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:454) 
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3282) 
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3230) 
> at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:331)
> at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:107)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) 
> at java.lang.Thread.run(Thread.java:619) 
> Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 842220600 (expected
to be between 1 and 2) 
> at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:515) 
> at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:303)

> ... 17 more
> {code}
> Specifically -- the FixedFileTrailer are incorrect, and seemingly missing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message