hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Wondering what hbck should do in this situation
Date Wed, 18 Jul 2012 17:17:47 GMT
Adding a check on whether the referenced files can be found would help.
If any of the referenced files isn't found, hbck should report the region and not repair it.
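
A minimal sketch of that check, written in Python against a local directory tree rather than HDFS. It assumes a reference file is named <hfile>.<parent encoded region> and points at <table>/<parent region>/<family>/<hfile>, which is consistent with the path in the stack trace quoted below. The function names are hypothetical and are not part of hbck.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumption: reference files are named "<hfile>.<parentEncodedRegion>",
# while plain store files have no ".<parent>" suffix.
REFERENCE_NAME = re.compile(r"^([0-9a-f]+)\.(.+)$")


def referenced_path(store_file: Path) -> Optional[Path]:
    """Return the file a reference points at, or None for a plain store file.

    Maps <table>/<region>/<family>/<hfile>.<parent>
    to   <table>/<parent>/<family>/<hfile>.
    """
    match = REFERENCE_NAME.match(store_file.name)
    if match is None:
        return None
    hfile, parent_region = match.groups()
    family_dir = store_file.parent
    table_dir = family_dir.parent.parent
    return table_dir / parent_region / family_dir.name / hfile


def missing_references(region_dir: Path) -> List[Path]:
    """List every referenced file of this region that cannot be found."""
    missing = []
    for family_dir in sorted(region_dir.iterdir()):
        # Skip .regioninfo and other dot-entries; only walk family directories.
        if not family_dir.is_dir() or family_dir.name.startswith("."):
            continue
        for store_file in sorted(family_dir.iterdir()):
            target = referenced_path(store_file)
            if target is not None and not target.is_file():
                missing.append(target)
    return missing


def should_patch_meta(region_dir: Path) -> bool:
    """Report dangling references; allow the repair only if there are none."""
    missing = missing_references(region_dir)
    for target in missing:
        print(f"ERROR: region {region_dir.name} references missing file {target}")
    return not missing
```

With this shape, the repair step would be skipped whenever should_patch_meta returns False, so the region is only reported and never assigned.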

Cheers

On Wed, Jul 18, 2012 at 8:53 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> Hey devs,
>
> I encountered an "interesting" situation with hbck in 0.94. We had
> a region that was on HDFS but wasn't in .META., and hbck decided
> to add it back:
>
> ERROR: Region { meta => null, hdfs =>
> hdfs://sfor3s24:10101/hbase/url_stumble_summary/159952764, deployed =>
>  } on HDFS, but not listed in META or deployed on any region server
> 12/07/17 23:46:03 INFO util.HBaseFsck: Patching .META. with
> .regioninfo: {NAME =>
> 'url_stumble_summary,25467315:2009-12-28,1271922074820', STARTKEY =>
> '25467315:2009-12-28', ENDKEY => '25821137:2010-03-08', ENCODED =>
> 159952764,}
>
> Then when it tried to assign the region it got bounced between region
> servers:
>
> Trying to reassign region...
> 12/07/17 23:46:04 INFO util.HBaseFsckRepair: Region still in
> transition, waiting for it to become assigned: {NAME =>
> 'url_stumble_summary,25467315:2009-12-28,1271922074820', STARTKEY =>
> '25467315:2009-12-28', ENDKEY => '25821137:2010-03-08', ENCODED =>
> 159952764,}
> 12/07/17 23:46:05 INFO util.HBaseFsckRepair: Region still in
> transition, waiting for it to become assigned: {NAME =>
> 'url_stumble_summary,25467315:2009-12-28,1271922074820', STARTKEY =>
> '25467315:2009-12-28', ENDKEY => '25821137:2010-03-08', ENCODED =>
> 159952764,}
> etc
>
> It turns out that this region contained only references (as in
> post-split references) to a region that no longer existed, so opening
> the region failed when it tried to open the referenced files:
>
> 2012-07-18 00:00:27,454 ERROR
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed
> open of
> region=url_stumble_summary,25467315:2009-12-28,1271922074820.159952764,
> starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException:
> java.io.FileNotFoundException: File does not exist:
> /hbase/url_stumble_summary/208247386/default/2354161894779228084
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:550)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:463)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3729)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3677)
> ...
> Caused by: java.io.IOException: java.io.FileNotFoundException: File
> does not exist:
> /hbase/url_stumble_summary/208247386/default/2354161894779228084
>         at
> org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:405)
>         at
> org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:258)
>         at
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2918)
> ...
> Caused by: java.io.FileNotFoundException: File does not exist:
> /hbase/url_stumble_summary/208247386/default/2354161894779228084
>         at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
>         at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
>         at
> org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:102)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>         at
> org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:547)
>         at
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1252)
>         at
> org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:66)
> ...
>
>
> At first I was confused about why it was looking in another region,
> until I saw the HalfStoreFileReader :)
>
> So this is a case where hbck made the cluster worse because the only
> way to get rid of this region is to force unassign it, delete it from
> .META. and then possibly also delete it from HDFS.
>
> I'm wondering how this could be done better. Should we do more checks
> when including that sort of region? Like, at least make sure we can
> open it? And then what? Just report it?
>
> Thx for reading this far,
>
> J-D
>
