hbase-dev mailing list archives

From Zhou Shuaifeng <zhoushuaif...@huawei.com>
Subject Re: Can region be merged with others automatically when all data in the region has expired and removed ?
Date Thu, 10 Feb 2011 02:50:44 GMT
My HBase version is 0.90, and we have gzip compression enabled.
What do you mean by "making an issue"? I would really like to, but I'm
new to HBase dev. Previously I sent mail to the user list, but found that
this should be a dev issue.
If this is really something that should be done, can you help?

Zhou Shuaifeng(Frank)


From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: February 10, 2011 2:13
To: dev@hbase.apache.org
Cc: yanlijun@huawei.com
Subject: Re: Reply: Can region be merged with others automatically when all data
in the region has expired and removed ?

On Wed, Feb 9, 2011 at 12:59 AM, Zhou Shuaifeng
<zhoushuaifeng@huawei.com> wrote:
> We have tested a cluster which has more than 30,000 regions; the max size
> of a region is 512MB. In this situation the data is no longer growing, but
> we remove some old data and insert new data, and the number of regions
> keeps increasing.
> This occupies too much heap, and will occupy more if regions cannot be
> merged. It also takes too long to take the table offline.

I've seen this before where the region size chosen at the start turns
out to be inappropriate -- or the initial config. was missing LZO --
and then at the end of the loading, or during, an adjustment needs to
be made to keep an upper bound on region count.

For example, in Zhou's case above, it sounds like the regions could
have been bigger.
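
As a rough illustration of why bigger regions help, here is a minimal
back-of-the-envelope sketch. It assumes, purely for the arithmetic, that all
30,000 regions are filled to the 512MB maximum (an upper bound, not a measured
figure), and the alternative region sizes are hypothetical examples.

```python
import math


def min_region_count(total_mb: int, max_region_mb: int) -> int:
    """Lower bound on region count if every region were filled to the max size."""
    return math.ceil(total_mb / max_region_mb)


# Assumption: every one of the 30,000 regions holds the full 512 MB.
total_mb = 30_000 * 512

# Hypothetical alternative maximum region sizes, in MB.
for max_region_mb in (512, 1024, 4096):
    print(max_region_mb, min_region_count(total_mb, max_region_mb))
```

Under that assumption, a 4096MB maximum would need at least 3,750 regions
for the same data instead of 30,000.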

With 30k regions, we can't do manual merges.  A script that surveys the
table to pick out adjacent small regions and then merges them online
seems like it would be useful.
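
The survey half of such a script could look roughly like the sketch below.
It is only an illustration: the `Region` record, its fields, and the
`max_merged_mb` threshold are hypothetical, and it does not call any HBase
API. How the region list is obtained and how the merge itself is performed
are left out.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """Hypothetical record describing one region of a table."""
    name: str
    start_key: bytes  # empty for the first region of the table
    end_key: bytes    # empty for the last region of the table
    size_mb: int


def pick_merge_pairs(regions: List[Region],
                     max_merged_mb: int) -> List[Tuple[str, str]]:
    """Pick adjacent region pairs whose combined size fits under the threshold."""
    ordered = sorted(regions, key=lambda r: r.start_key)
    pairs = []
    i = 0
    while i + 1 < len(ordered):
        left, right = ordered[i], ordered[i + 1]
        # Two regions are adjacent when one ends exactly where the next begins.
        adjacent = left.end_key == right.start_key
        if adjacent and left.size_mb + right.size_mb <= max_merged_mb:
            pairs.append((left.name, right.name))
            i += 2  # both regions are used up; skip past the pair
        else:
            i += 1
    return pairs


regions = [
    Region("r1", b"", b"b", 10),
    Region("r2", b"b", b"d", 0),     # all data expired and removed
    Region("r3", b"d", b"f", 400),
    Region("r4", b"f", b"", 200),
]
print(pick_merge_pairs(regions, max_merged_mb=512))
```

Each region appears in at most one pair per pass, so the survey would have
to be repeated after each round of merges to keep shrinking the region count.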

We also need the ability to edit the schema online, changing region
size and compression, without having to take down the table.

Would you mind making an issue, Zhou?  It'd be of type umbrella, since
there is already an effort to do features such as online schema edits.


P.S. What version of hbase Zhou?  Did you have compression enabled?
