hbase-dev mailing list archives

From "Thibaut (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-420) Adjacent small regions should be automatically merged
Date Mon, 22 Dec 2008 12:06:44 GMT

     [ https://issues.apache.org/jira/browse/HBASE-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Thibaut updated HBASE-420:

This sounds reasonable.

I have a table whose data needs to be processed, with current timestamps used as keys. When
the data in the table is processed (after 30 minutes), all entries are processed and
then deleted.

When the data in the table grows beyond two regions, all regions except the last one will never
be used again for future data and will stay empty forever, because rows with new timestamps
will never be added to those regions.
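
To illustrate the effect, here is a minimal sketch (not HBase code) of how a row key maps to a region by start key. The function `region_index` and the integer start keys are hypothetical stand-ins for real region boundaries; the point is only that keys which always increase fall past the last start key, so every new row lands in the final region.

```python
import bisect

def region_index(start_keys, row_key):
    """Return the index of the region that would hold row_key.

    start_keys is a sorted list of region start keys (hypothetical integers
    standing in for timestamp row keys); a region covers keys from its start
    key up to, but not including, the next region's start key.
    """
    return bisect.bisect_right(start_keys, row_key) - 1

# Three regions created by earlier splits (made-up boundaries).
start_keys = [0, 1_000, 2_000]

# New rows are keyed by the current timestamp, so they are always larger
# than every existing boundary.
new_keys = [2_500, 2_600, 2_700]
targets = {region_index(start_keys, k) for k in new_keys}
print(targets)  # only the last region index appears
```

Once the older rows have been processed and deleted, the regions at indexes 0 and 1 in this sketch hold no data and receive no further writes.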

> Adjacent small regions should be automatically merged
> -----------------------------------------------------
>                 Key: HBASE-420
>                 URL: https://issues.apache.org/jira/browse/HBASE-420
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>            Reporter: Bryan Duxbury
>            Priority: Minor
> Region merge functionality exists in HBase today, but merges are triggered manually (in
> theory only, because there is no admin tool for doing so). Instead of relying on an admin
> to note and merge regions, the Master should detect adjacent undersized regions and automatically
> merge them.
> Other than the case when a table has exactly one region, region sizes should always be
> between 1/2x and 1x the split size. For instance, if the max file size is 256MB, steady-state,
> regions will be between 128 and 256MB. If we find two regions near each other that are less
> than some threshold when summed together, they are candidates for merging. For instance, we
> could set the threshold to 1/2x max file size, so if one region was 50MB and the other was
> 16MB, they would be mergeable.
> The only time that regions small enough to merge should exist is when there have been
> significant deletions. Otherwise, regions will always stay in the 1/2 to 1x range.
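
The threshold rule in the description can be sketched as follows. This is only an illustration under stated assumptions, not the Master's actual logic: `merge_candidates` and its parameters are hypothetical names, regions are represented simply as a list of sizes in key order, and the default threshold of 1/2x the max file size is the example value from the description.

```python
MB = 1024 * 1024

def merge_candidates(region_sizes, max_file_size, threshold_fraction=0.5):
    """Return index pairs (i, i + 1) of adjacent regions that could be merged.

    region_sizes lists region sizes in bytes, in key order (hypothetical
    input; a real implementation would read sizes from region metadata).
    A pair qualifies when the combined size is below
    threshold_fraction * max_file_size.
    """
    threshold = max_file_size * threshold_fraction
    candidates = []
    for i in range(len(region_sizes) - 1):
        if region_sizes[i] + region_sizes[i + 1] < threshold:
            candidates.append((i, i + 1))
    return candidates

# Example from the description: max file size 256MB, threshold 128MB,
# and a 50MB region next to a 16MB region.
sizes = [200 * MB, 50 * MB, 16 * MB, 180 * MB]
print(merge_candidates(sizes, 256 * MB))
```

Pairs returned by this sketch may overlap when three or more small regions sit next to each other, so a caller would still have to decide which merges to perform first.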

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
