hbase-dev mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Wondering what hbck should do in this situation
Date Wed, 18 Jul 2012 15:53:13 GMT
Hey devs,

I ran into an "interesting" situation with hbck in 0.94. We had a
region that was on HDFS but wasn't in .META., and hbck decided to
add it back:

ERROR: Region { meta => null, hdfs =>
hdfs://sfor3s24:10101/hbase/url_stumble_summary/159952764, deployed =>
 } on HDFS, but not listed in META or deployed on any region server
12/07/17 23:46:03 INFO util.HBaseFsck: Patching .META. with
.regioninfo: {NAME =>
'url_stumble_summary,25467315:2009-12-28,1271922074820', STARTKEY =>
'25467315:2009-12-28', ENDKEY => '25821137:2010-03-08', ENCODED =>

Then when it tried to assign the region it got bounced between region servers:

Trying to reassign region...
12/07/17 23:46:04 INFO util.HBaseFsckRepair: Region still in
transition, waiting for it to become assigned: {NAME =>
'url_stumble_summary,25467315:2009-12-28,1271922074820', STARTKEY =>
'25467315:2009-12-28', ENDKEY => '25821137:2010-03-08', ENCODED =>
12/07/17 23:46:05 INFO util.HBaseFsckRepair: Region still in
transition, waiting for it to become assigned: {NAME =>
'url_stumble_summary,25467315:2009-12-28,1271922074820', STARTKEY =>
'25467315:2009-12-28', ENDKEY => '25821137:2010-03-08', ENCODED =>

It turns out that this region only contained references (as in
post-split references) to a region that no longer existed, so opening
the region failed when it tried to open the referenced files:

2012-07-18 00:00:27,454 ERROR
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed
open of region=url_stumble_summary,25467315:2009-12-28,1271922074820.159952764,
starting to roll back the global memstore size.
java.io.IOException: java.io.IOException:
java.io.FileNotFoundException: File does not exist:
	at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:550)
	at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:463)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3729)
	at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3677)
Caused by: java.io.IOException: java.io.FileNotFoundException: File
does not exist:
	at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:405)
	at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:258)
	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2918)
Caused by: java.io.FileNotFoundException: File does not exist:
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1813)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
	at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:102)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
	at org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:547)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1252)
	at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:66)

At first I was confused about why it was looking for another region,
until I saw the HalfStoreFileReader :)
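
To illustrate why opening this region ends up reading from a different
one, here is a rough sketch of how a post-split reference file name maps
to a file in the parent region. It assumes the layout where a reference
sits at <table>/<daughter>/<family>/<hfile>.<parent encoded name> and
points at <table>/<parent encoded name>/<family>/<hfile>. The class and
method names are hypothetical, and it uses java.nio paths instead of the
HBase and HDFS classes, so treat it as an illustration of the naming
convention rather than the actual code path:

```java
import java.nio.file.Path;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: mirrors the naming convention, not HBase's own code. */
final class ReferenceNames {
    // Assumed shape of a reference file name: <hfile name>.<encoded name of parent region>
    private static final Pattern REF_NAME = Pattern.compile("^([0-9a-f]+)\\.(.+)$");

    private ReferenceNames() {}

    /**
     * Takes a store file path of the form table/region/family/file and returns
     * the file in the parent region it refers to, or empty if the name does not
     * look like a reference (or the path is too shallow to have a table dir).
     */
    static Optional<Path> referredFile(Path storeFile) {
        Matcher m = REF_NAME.matcher(storeFile.getFileName().toString());
        Path familyDir = storeFile.getParent();
        Path regionDir = familyDir == null ? null : familyDir.getParent();
        Path tableDir = regionDir == null ? null : regionDir.getParent();
        if (!m.matches() || tableDir == null) {
            return Optional.empty();
        }
        // Same table, same family, but under the parent region's directory.
        return Optional.of(tableDir.resolve(m.group(2))
                                   .resolve(familyDir.getFileName())
                                   .resolve(m.group(1)));
    }
}
```

If the parent region's directory has since been removed, the path this
returns no longer exists, which would match the FileNotFoundException
above.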

So this is a case where hbck made the cluster worse because the only
way to get rid of this region is to force unassign it, delete it from
.META. and then possibly also delete it from HDFS.

I'm wondering how this could be done better. Should we do more checks
before including that sort of region? Like, at least make sure we can
open it? And then what? Just report it?
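
One possible shape for such a check, as a minimal sketch only: before
patching .META., walk the region directory and collect any reference
files whose target in the parent region is missing, then report instead
of including the region if there are any. Everything here is an
assumption for illustration: the class and method names are made up, the
reference naming is the <hfile>.<parent encoded name> convention, and it
runs against a local directory tree with java.nio rather than HDFS, so
it is not what hbck does today:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-check; a local-filesystem stand-in for an HDFS scan. */
final class OrphanRegionPrecheck {
    // Assumed reference file name shape: <hfile name>.<encoded name of parent region>
    private static final Pattern REF_NAME = Pattern.compile("^([0-9a-f]+)\\.(.+)$");

    private OrphanRegionPrecheck() {}

    /** Returns the reference files under regionDir whose referred-to file is missing. */
    static List<Path> danglingReferences(Path regionDir) throws IOException {
        Path tableDir = regionDir.toAbsolutePath().getParent();
        if (tableDir == null) {
            throw new IllegalArgumentException("region dir has no table dir: " + regionDir);
        }
        List<Path> dangling = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(regionDir)) {
            for (Path family : entries) {
                String familyName = family.getFileName().toString();
                // Skip plain files and dot-directories; only family dirs are scanned.
                if (!Files.isDirectory(family) || familyName.startsWith(".")) {
                    continue;
                }
                try (DirectoryStream<Path> files = Files.newDirectoryStream(family)) {
                    for (Path file : files) {
                        Matcher m = REF_NAME.matcher(file.getFileName().toString());
                        if (!m.matches()) {
                            continue; // not a reference, nothing to resolve
                        }
                        Path target = tableDir.resolve(m.group(2))
                                              .resolve(familyName)
                                              .resolve(m.group(1));
                        if (!Files.exists(target)) {
                            dangling.add(file);
                        }
                    }
                }
            }
        }
        return dangling;
    }

    /** True when no reference in the region points at a missing file. */
    static boolean safeToPatchMeta(Path regionDir) throws IOException {
        return danglingReferences(regionDir).isEmpty();
    }
}
```

With something like this, a region such as the one above would come
back with a non-empty list, and the repair could stop at reporting it.
What to do after that (sideline it, delete it) is still the open
question.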

Thx for reading this far,

