hbase-user mailing list archives

From Brennon Church <bren...@getjar.com>
Subject Lost regions question
Date Fri, 12 Apr 2013 05:50:50 GMT

I had an interesting problem come up recently.  We have a few thousand 
regions across 8 datanode/regionserver nodes.  I increased the Hadoop 
heap size from 128M to 2048M, which brought the cluster to a complete 
halt after about an hour.  I reverted to 128M and restarted everything, 
but didn't realize at the time that the cluster had come back with 9 
fewer regions than it started with.  Upon further 
investigation, I found that all 9 missing regions were from splits that 
occurred while the cluster was running after making the heap change and 
before it came to a halt.  A 10th region (5 splits were involved in 
total) did manage to get recovered.  The really odd thing is that 
in the case of the other 9 regions, the original parent regions, which 
as far as I can tell in the logs were deleted, were re-opened upon 
restarting things once again.  The daughter regions were gone.  
Interestingly, I found the orphaned datablocks still intact, and in at 
least some cases have been able to extract the data from them and will 
hopefully re-add it to the tables.

My question is this.  Does anyone know, based on the rather muddled 
description I've given above, what could have happened here?  My best 
guess is that the bad state HDFS was in caused some critical step of 
the split process to be missed, which left the references to the parent 
regions in place and lost the references to the daughter regions.

Thanks for any insight you can provide.

