hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Table with 80 regions having nearly no data in it
Date Wed, 17 Dec 2008 20:49:03 GMT
Thibaut_ wrote:
> As a side note, it would be helpful if it were possible not only to
> insert or change rows with BatchUpdate, but also to delete rows (So I can
> delete more rows at the end when I'm executing the other batched requests as
> well).
That is HBASE-880.  Maybe add a vote to it?

> But something I have noticed is that I have a table (at least one)
> tobeprocessed {NAME => 'tobeprocessed', IS_ROOT => 'false', IS_META =>
> 'false', FAMILIES => [{NAME => 'data', BLOOMFILTER => 'false', COMPRESSION
> => 'NONE', VERSIONS => '1', LENGTH => '2147483647', TTL => '-1', IN_MEMORY
> => 'false', BLOCKCACHE => 'false'}], INDEXES => []}
> which spans over 70 regions, but only has about 117 rows in it (just a few
> MByte). 

Did it only ever have 117 rows in it?  Or was it once many more than 
this and other rows were deleted?

> These entries are all in the last region (as I used a timestamp as
> key and I just checked with a mapreduce job). On the webstatus page, there
> are also 2 regions with an empty end key which seems very strange. 

Not 'strange' but 'broken'.  Where do you see that exactly?  Can you 
scan this table successfully?

Scan your .META. and paste in the info:regioninfo cell for each of these 
regions so we can take a look.
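
Once the start and end keys from those info:regioninfo cells are in hand, 
the check itself is simple: sorted by start key, each region's end key 
should equal the next region's start key, and only the last region should 
have an empty end key.  A minimal sketch of that check follows.  It is not 
HBase code: the RegionKeys class and findProblems method are made up for 
illustration, and keys are modelled as plain ASCII strings rather than the 
byte arrays HBase uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: not part of HBase. Keys are plain strings here.
final class RegionChainCheck {

    // Hypothetical holder for the keys read out of an info:regioninfo cell.
    static final class RegionKeys {
        final String startKey; // "" means start of table
        final String endKey;   // "" means end of table

        RegionKeys(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Returns a description of each inconsistency; an empty list means the chain looks sane.
    static List<String> findProblems(List<RegionKeys> regions) {
        List<String> problems = new ArrayList<>();
        if (regions.isEmpty()) {
            problems.add("table has no regions");
            return problems;
        }
        List<RegionKeys> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> a.startKey.compareTo(b.startKey));

        if (!sorted.get(0).startKey.isEmpty()) {
            problems.add("first region does not start at the beginning of the table");
        }
        for (int i = 0; i < sorted.size() - 1; i++) {
            RegionKeys current = sorted.get(i);
            RegionKeys next = sorted.get(i + 1);
            if (current.endKey.isEmpty()) {
                problems.add("region starting at '" + current.startKey
                        + "' has an empty end key but is not the last region");
            } else if (!current.endKey.equals(next.startKey)) {
                problems.add("gap or overlap between end key '" + current.endKey
                        + "' and next start key '" + next.startKey + "'");
            }
        }
        if (!sorted.get(sorted.size() - 1).endKey.isEmpty()) {
            problems.add("last region does not have an empty end key");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Two regions with an empty end key, one of them in the middle of the table.
        List<RegionKeys> regions = Arrays.asList(
                new RegionKeys("", "b"),
                new RegionKeys("b", ""),
                new RegionKeys("m", "z"),
                new RegionKeys("z", ""));
        for (String problem : findProblems(regions)) {
            System.out.println(problem);
        }
    }
}
```

A table showing two regions with an empty end key, as described above, 
would fail this check at the region in the middle.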

> One at
> the end and one near the middle. When I ran a mapreduce job over this
> table, however, the region split start key was set to the start key of the
> next region (for the first region with an empty end key in the web interface).
> (As a side note, stopping hbase sometimes took very long, so I manually
> killed the processes a few times before, which could have led to this...)
OK.  That might have damaged it, though 0.19.0 should be more resilient 
than past versions.

> Shouldn't the regions be deleted when no data is present? (as I have set
> versions to 1 and deleted the keys through HTable.deleteAll() function). 

Not currently.  Once made, a region remains even if, after deletes, it 
has no data.

You could merge up all of these empty regions but you'd have to shut down 
hbase and run the merge tool (we should add it to the new UI as an 
option under the new manual split/force-compaction feature).
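
Before running any merge it helps to know which empty regions sit next to 
each other, since only neighbours can be merged.  The sketch below is one 
way to plan that, under the assumption that a per-region row count has 
already been gathered some other way (for example with a mapreduce job) 
and listed in start-key order.  The EmptyRegionRuns class and its findRuns 
method are made up for illustration; this does not call the merge tool or 
any HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plans merge candidates from externally gathered row counts.
final class EmptyRegionRuns {

    // rowCounts[i] is the row count of the i-th region in start-key order.
    // Returns {firstIndex, lastIndex} for every run of two or more adjacent empty regions.
    static List<int[]> findRuns(long[] rowCounts) {
        List<int[]> runs = new ArrayList<>();
        int runStart = -1; // -1 means "not currently inside a run of empty regions"
        for (int i = 0; i <= rowCounts.length; i++) {
            boolean empty = i < rowCounts.length && rowCounts[i] == 0;
            if (empty) {
                if (runStart < 0) {
                    runStart = i;
                }
            } else if (runStart >= 0) {
                // The run ended at i - 1; keep it only if it spans at least two regions.
                if (i - runStart >= 2) {
                    runs.add(new int[] {runStart, i - 1});
                }
                runStart = -1;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Hypothetical counts: every region empty except the last, which holds 117 rows.
        long[] rowCounts = new long[70];
        rowCounts[69] = 117;
        for (int[] run : findRuns(rowCounts)) {
            System.out.println("regions " + run[0] + " through " + run[1] + " are empty and adjacent");
        }
    }
}
```

The sketch only groups empty regions with each other.  An empty region 
could equally be merged into a non-empty neighbour, which this does not 
attempt.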

> Also the startup phase is a lot longer than in hadoop 0.18.1. I have about
> 1500 regions over 7 servers, and it can take up to 5 minutes until all
> regions are loaded. (Hbase doesn't even start to load regions until I
> make a first request to it.) But this could also be related to corrupt
> regions?
This is being looked into JBA.  Hopefully we can improve here before the 
release.  Study your master log with DEBUG enabled.  What's it up to?  
There is a new 'safe mode' in hbase.  Maybe this goes on too long.  Is 
it assigning regions?  Are they taking a long time to open?

> Hbase settings:
>  <property>
>     <name>hbase.master.lease.period</name>
>     <value>720000</value>
>     <description>HMaster server lease period in milliseconds. Default is
>     120 seconds.  Region servers must report in within this period else
>     they are considered dead.  On loaded cluster, may need to up this
>     period.</description>
>   </property>

Did you find that you needed it to be this long?
