hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase Region splitting many times.
Date Fri, 16 Mar 2012 17:52:02 GMT
On Fri, Mar 16, 2012 at 10:40 AM, hdev ml <hdevml@gmail.com> wrote:
> Does anybody have an answer to this?

Is there a hurry? Have you tried gathering more data about it?

>
>>> > I created a test table with one column family "cf" with 2 columns
>>> > "a" and "b", each having value of a 3000 character long string.
>>> > Maximum versions allowed is 3 and maxfilesize is at default 256M.
>>> >
>>> > In a loop, I put 100000 rows into it, with 3000 character long
>>> > values for both a and b. Row key is incremental like row00000000
>>> > to row00099999.
>>> >
>>> > I applied an outer loop which will run the above 100000 row put
>>> > loop, 10 times.
>>> >
>>> > After running it 10 times, I found that it split into following
>>> > number of regions for every run.
>>> >
>>> > Run     Regions
>>> > 1            4
>>> > 2            5
>>> > 3            7
>>> > 4           10
>>> > 5           13
>>> > 6           19
>>> > 7           19
>>> > 8           19
>>> > 9           19
>>> > 10          19
>>> >
>>> > Question is, why did it stabilize after the 6th run? Shouldn't it
>>> > stabilize

If you let it settle down, does it split later? It might just be that
it was falling behind on compactions.

>>> > after 3 runs, because number of versions is 3? After 3 runs, It
>>> > should not split further, because new versions are being added but
>>> > old version should be purged/deleted. Is that a correct statement?

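No, unless you got lucky and major compactions ran during the import.
By default a major compaction only runs 24h after a region is created,
so until then the old versions are still on disk and still count
towards the region size.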

J-D
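
As a rough yardstick for the numbers in the quoted table, the sketch
below estimates how many raw value bytes each run adds and how many
regions that would need if every region were filled right up to the
256M maxfilesize. It is only back-of-the-envelope arithmetic, not a
model of how HBase actually splits: it assumes one byte per character,
ignores row key and per-cell overhead, compression and unflushed
memstore data, and assumes nothing has been purged yet. The names
raw_bytes and min_regions are made up for this illustration.

```python
# Back-of-the-envelope estimate only; figures come from the quoted test setup.
ROWS_PER_RUN = 100_000
COLUMNS = 2                         # "a" and "b"
VALUE_BYTES = 3_000                 # assumption: one byte per character
MAX_FILESIZE = 256 * 1024 * 1024    # 256M, the maxfilesize quoted above


def raw_bytes(runs: int) -> int:
    """Raw value bytes written after `runs` passes (no key or cell overhead)."""
    return runs * ROWS_PER_RUN * COLUMNS * VALUE_BYTES


def min_regions(total_bytes: int) -> int:
    """Regions needed if each one held exactly MAX_FILESIZE bytes."""
    return -(-total_bytes // MAX_FILESIZE)  # ceiling division


if __name__ == "__main__":
    for runs in range(1, 11):
        total = raw_bytes(runs)
        print(f"run {runs:2d}: {total / 1e9:.1f} GB raw, "
              f"at least {min_regions(total)} regions if nothing is purged")
```

Each run writes about 0.6 GB of raw values, so three retained versions
alone are well past a handful of 256M regions, and anything beyond that
depends on when compactions actually get to drop the older versions.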
