hbase-dev mailing list archives

From Weihua JIANG <weihua.ji...@gmail.com>
Subject MemStore flush and region split
Date Thu, 23 Jun 2011 12:29:29 GMT
Hi all,

I am studying the region server source code. As far as I can tell, in HBase
0.92 each memstore flush proceeds as follows (in
MemStoreFlusher.flushRegion()):
1. If the region has too many store files, the flush is held back and a
region compaction or split is triggered. The flush then waits until either
    a) the compaction or split is done, or
    b) a certain period has passed (default 90s).
2. Otherwise, the flush is performed.
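
The decision described above can be sketched roughly as follows. This is a simplified model for discussion, not the actual MemStoreFlusher code: the class, enum and method names are hypothetical, and the threshold of 7 store files is an assumed value (only the 90s wait comes from the description above).

```java
// Simplified model of the flush decision described above.
// Hypothetical names; this is NOT the real HBase implementation.
final class FlushDecisionSketch {

    enum Action { FLUSH_NOW, DELAY_AND_REQUEST_COMPACTION_OR_SPLIT }

    // Assumed threshold for "too many store files" (illustrative value).
    static final int BLOCKING_STORE_FILES = 7;
    // Maximum time a flush may be held back: 90s, as stated in the message.
    static final long BLOCKING_WAIT_MS = 90_000L;

    /**
     * @param storeFileCount current number of store files in the region
     * @param waitedMs       how long this flush request has already been held
     */
    static Action decide(int storeFileCount, long waitedMs) {
        boolean tooManyFiles = storeFileCount > BLOCKING_STORE_FILES;
        boolean waitExpired = waitedMs >= BLOCKING_WAIT_MS;
        if (tooManyFiles && !waitExpired) {
            // Step 1: hold the flush and ask for a compaction or split.
            return Action.DELAY_AND_REQUEST_COMPACTION_OR_SPLIT;
        }
        // Step 2, or step 1b once the wait period has passed: flush.
        return Action.FLUSH_NOW;
    }
}
```

In this model, a region with 10 store files is delayed at first and flushed once the 90s have elapsed, while a region with 3 store files is flushed immediately.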


My questions are:
1. If the flush is held and a region split happens, two new
daughter regions (or even more, if another split happens between this
split and the flush) will replace this region. The old region is then
invalid (I guess), and the daughter regions do not have the content of
its memstore. How does this flush on the old region affect the new daughter
regions?
2. If the maximum wait time (90s) has passed and the split is still in
progress, the flush is performed on the old region, while the new daughter
regions do not know about the newly flushed store file. How
is this store file merged into the daughter regions?
3. 0.92 introduces parallel compaction/split. It seems the only two
ways to trigger a region split are a manual call and
MemStoreFlusher.flushRegion(). But in flushRegion(), the
prerequisite is too many store files. However, if there are enough
minor compaction threads, the store file count may stay
low. Thus, even though the region size greatly exceeds the max file
size, the region may still never be split. In my experiment, a
region reached 7G (max region size is 1G) under a heavy update load and
was not split. Has anyone else noticed this danger?
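
To make the concern in question 3 concrete, here is a toy simulation. It is only a sketch under my own assumptions, not HBase code: all names are hypothetical, the flush size (64 MB), the compaction trigger (3 files) and the blocking threshold (7 files) are illustrative values, and it assumes minor compactions always keep up and that region size only grows.

```java
// Toy model of question 3: if the split check is reachable only from the
// "too many store files" branch, fast minor compactions can starve it.
// Hypothetical names and values; this is NOT the real HBase implementation.
final class SplitStarvationSketch {

    static final int BLOCKING_STORE_FILES = 7;        // assumed threshold
    static final int MINOR_COMPACTION_THRESHOLD = 3;  // assumed trigger
    static final long MAX_REGION_SIZE_MB = 1024L;     // 1G, as in the message
    static final long FLUSH_SIZE_MB = 64L;            // assumed flush size

    /** Region size after the given number of flushes (no data is dropped). */
    static long regionSizeMbAfter(int flushes) {
        return flushes * FLUSH_SIZE_MB;
    }

    /** Number of split requests issued while performing the given flushes. */
    static int splitRequestsAfter(int flushes) {
        int storeFiles = 0;
        long regionSizeMb = 0L;
        int splitRequests = 0;
        for (int i = 0; i < flushes; i++) {
            // The split check is modelled as guarded by the store file count.
            if (storeFiles > BLOCKING_STORE_FILES
                    && regionSizeMb > MAX_REGION_SIZE_MB) {
                splitRequests++;
            }
            // The flush adds one store file.
            storeFiles++;
            regionSizeMb += FLUSH_SIZE_MB;
            // A minor compaction merges the files; total size is unchanged.
            if (storeFiles >= MINOR_COMPACTION_THRESHOLD) {
                storeFiles = 1;
            }
        }
        return splitRequests;
    }
}
```

With these values, 112 flushes bring the modelled region to 7168 MB (7G), yet the store file count never exceeds 2, so no split is ever requested.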

Thanks
Weihua
