hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Reply: Can region be merged with others automatically when all data in the region has expired and removed ?
Date Wed, 09 Feb 2011 18:13:09 GMT
On Wed, Feb 9, 2011 at 12:59 AM, Zhou Shuaifeng
<zhoushuaifeng@huawei.com> wrote:
> We have tested a cluster that has more than 30,000 regions; the max size of a region is 512MB.
> In this situation the data is no longer growing, but as we remove old data and insert new
> data, the number of regions keeps increasing.
> This occupies too much heap, and will occupy more if regions cannot be merged. It also
> takes too long to take the table offline.
>

I've seen this before where the region size chosen at the start turns
out to be inappropriate -- or the initial config. was missing LZO --
and then at the end of the loading, or during, an adjustment needs to
be made to keep an upper bound on region count.

For example, in Zhou's case above, it sounds like the regions could
have been bigger.
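
As a rough illustration of why bigger regions help: holding a given amount
of data needs at least ceil(data / max region size) regions. The sketch
below only does that arithmetic. The 2048MB figure is a hypothetical
alternative setting, not a recommendation, and 30,000 x 512MB is only an
upper bound on Zhou's data, since regions are rarely full.

```python
def min_region_count(total_data_mb: int, max_region_mb: int) -> int:
    """Lower bound on the number of regions needed to hold total_data_mb
    when no region may exceed max_region_mb (integer ceiling division)."""
    if max_region_mb <= 0:
        raise ValueError("max_region_mb must be positive")
    return (total_data_mb + max_region_mb - 1) // max_region_mb


# Upper bound on the data held by 30,000 regions capped at 512MB each.
upper_bound_mb = 30_000 * 512

# Hypothetical: the same data with a 2048MB cap needs at least this many regions.
print(min_region_count(upper_bound_mb, 2048))
```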

With 30k regions, we can't do manual merges.  A script that surveys
the table to pick out adjacent small regions and then merges them
online seems like it would be useful.
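
The survey half of such a script might look roughly like the sketch below.
It is only the selection logic: RegionInfo, its fields, and the
max_merged_mb threshold are hypothetical names made up for illustration,
not HBase client classes, and the online merge itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RegionInfo:
    """Hypothetical summary of one region (not an HBase class)."""
    name: str
    start_key: bytes  # b"" for the first region of the table
    end_key: bytes    # b"" for the last region of the table
    size_mb: int


def pick_merge_candidates(
    regions: List[RegionInfo], max_merged_mb: int
) -> List[Tuple[RegionInfo, RegionInfo]]:
    """Return pairs of adjacent regions whose combined size fits the cap."""
    # Order regions by start key so that neighbours sit next to each other.
    ordered = sorted(regions, key=lambda r: r.start_key)
    pairs = []
    i = 0
    while i + 1 < len(ordered):
        left, right = ordered[i], ordered[i + 1]
        # Two regions are adjacent when one ends exactly where the next starts.
        adjacent = left.end_key == right.start_key
        if adjacent and left.size_mb + right.size_mb <= max_merged_mb:
            pairs.append((left, right))
            i += 2  # each region takes part in at most one merge per pass
        else:
            i += 1
    return pairs
```

Running the survey repeatedly, one pass after each round of merges, would
keep folding small neighbours together until no pair fits under the cap.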

We also need the ability to do online schema edits that set region
size and compression, without having to take down the table.

Would you mind making an issue, Zhou?   It'd be of type umbrella since
there is already an effort underway on features such as online schema edits.

Thanks,
St.Ack

P.S. What version of hbase, Zhou?  Did you have compression enabled?
