hbase-dev mailing list archives

From John Heitmann <jheitm...@gmail.com>
Subject Re: Region "auto-split"
Date Mon, 11 Jul 2011 18:07:12 GMT
Apologies for the "me too" message, but: me too. I've been seeing this for a month or two on
my trunk-based setup. I looked for log messages, but things looked pretty silent, and when
I looked at the code my impression was identical to Michael's in that I didn't see a plausible
path for it to work outside of extreme circumstances. 

Now that I look at it again, it seems auto-splitting is initiated only if 1) there are
too many store files per region and 2) the max blocking time hasn't been exceeded. It looks
like the requestSplit call should be hoisted a bit in MemStoreFlusher.flushRegion().
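
To make the flow concrete, here is a simplified sketch of the control flow as I read it. It is not the actual HBase source: the Region class, both thresholds, and every method name other than isTooManyStoreFiles are hypothetical stand-ins, and the blocking-store-files value of 7 is an assumption.

```java
public class FlushSplitSketch {
    // Hypothetical stand-in for a region; NOT the HBase HRegion class.
    static class Region {
        final int storeFileCount;
        final long largestStoreFileBytes;

        Region(int storeFileCount, long largestStoreFileBytes) {
            this.storeFileCount = storeFileCount;
            this.largestStoreFileBytes = largestStoreFileBytes;
        }
    }

    // Assumed thresholds, chosen only for illustration.
    static final int BLOCKING_STORE_FILES = 7;
    static final long MAX_FILE_SIZE_BYTES = 256L * 1024 * 1024;

    static boolean isTooManyStoreFiles(Region region) {
        return region.storeFileCount > BLOCKING_STORE_FILES;
    }

    // Hypothetical size check standing in for the real split decision.
    static boolean exceedsMaxFileSize(Region region) {
        return region.largestStoreFileBytes > MAX_FILE_SIZE_BYTES;
    }

    // Flow as described above: a split is only considered when there are
    // too many store files AND the max blocking time has not been exceeded.
    static boolean currentFlowRequestsSplit(Region region, boolean blockingTimeExceeded) {
        if (isTooManyStoreFiles(region)) {
            if (!blockingTimeExceeded) {
                return exceedsMaxFileSize(region);
            }
        }
        return false;
    }

    // Suggested change: hoist the split check out of the nested branch so
    // that it no longer depends on the store file count.
    static boolean hoistedFlowRequestsSplit(Region region) {
        return exceedsMaxFileSize(region);
    }

    public static void main(String[] args) {
        // One region holding a single 1 GB store file, as in Michael's test.
        Region region = new Region(1, 1024L * 1024 * 1024);
        System.out.println("current flow requests split: "
                + currentFlowRequestsSplit(region, false));
        System.out.println("hoisted flow requests split: "
                + hoistedFlowRequestsSplit(region));
    }
}
```

With a single large store file the nested check is never reached, which would match the silence in the logs.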


On Jul 11, 2011, at 9:17 AM, Stack wrote:

> If you have an individual storefile > 256M and it's not splitting,
> something is wrong.  Check the logs on the regionserver carrying your
> single region.  See if any clue therein.
> St.Ack
> On Mon, Jul 11, 2011 at 5:54 AM, Michael Morello
> <michael.morello@gmail.com> wrote:
>> Hi,
>> I use the default values, so hbase.hregion.max.filesize should be 256
>> MB as defined in HConstants.
>> Regards,
>> 2011/7/11 Zhoushuaifeng <zhoushuaifeng@huawei.com>:
>>> What is your setting for hbase.hregion.max.filesize?
>>> If your setting is more than 1 GB and the total size of the region is less
>>> than that, your region will not split.
>>> -----Original Message-----
>>> From: Michael Morello [mailto:michael.morello@gmail.com]
>>> Sent: Monday, July 11, 2011 8:28 PM
>>> To: dev@hbase.apache.org
>>> Subject: Region "auto-split"
>>> Hello,
>>> I'm testing the trunk revision of HBase and there is something that
>>> seems strange to me regarding the "auto-split" region feature, here is
>>> my test case:
>>> I generate a lot of data with a simple client into a newly created
>>> table with no predefined region. Everything is stored in the first
>>> "default" region, but this region is never split (I currently have a
>>> region with a single store file of more than 1 GB).
>>> It seems that it is the responsibility of the MemStoreFlusher to call
>>> the CompactSplitThread (line 359 of MemStoreFlusher) and schedule a
>>> split, but this part is never called.
>>> More precisely, one of the conditions that must be true in
>>> order to schedule a split is that isTooManyStoreFiles(region) is true,
>>> and this could never happen.
>>> Is this expected behaviour, or is it a known problem?
