hbase-issues mailing list archives

From "zhangduo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
Date Tue, 09 Dec 2014 07:30:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239084#comment-14239084 ]

zhangduo commented on HBASE-10201:

[~stack] I followed RegionSplitPolicy to write FlushPolicy, except that [~tedyu] suggested
using FlushPolicyFactory and placing the factory method in it instead of in FlushPolicy. Maybe
the code of RegionSplitPolicy is old and needs refactoring too...

The FlushPolicy API is a little odd. It implements Configured but where do you do a setConf
on it? Then in the configureForRegion method, you take a Region but all it is used for is
to emit the region name in Strings and to get an instance of HTableDescriptor. The flush takes a
list of stores. Can't it get them from the region it was given in configureForRegion? This
is a nit comment. Ignore for now.

ReflectionUtils.newInstance(clazz, conf) will call setConf. And I agree that if we implement
configureForRegion, then the list of stores is not necessary when doing selection. This can be
fixed later.
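
To illustrate the setConf point, here is a minimal, self-contained sketch of the pattern: a reflective factory creates the policy and hands it the configuration, so the policy never calls setConf itself. Everything in it (Conf, Configurable, SimpleFlushPolicy, newInstance, the "flush.lower.bound" key) is a hypothetical stand-in written for this example, not the Hadoop or HBase classes and not the code in the patch.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

class FlushPolicySketch {
    // Hypothetical stand-in for a configuration object.
    static class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key, String def) { return values.getOrDefault(key, def); }
    }

    // Hypothetical stand-in for a "configurable" contract.
    interface Configurable {
        void setConf(Conf conf);
        Conf getConf();
    }

    // Hypothetical policy: it only stores what it is given.
    static class SimpleFlushPolicy implements Configurable {
        private Conf conf;
        private String regionName;

        SimpleFlushPolicy() {}

        @Override public void setConf(Conf conf) { this.conf = conf; }
        @Override public Conf getConf() { return conf; }

        // Remember the region once, so later calls need no extra arguments.
        void configureForRegion(String regionName) { this.regionName = regionName; }

        String describe() {
            return regionName + " lowerBound=" + conf.get("flush.lower.bound", "unset");
        }
    }

    // Creates an instance reflectively and calls setConf if the type supports it.
    static <T> T newInstance(Class<T> clazz, Conf conf) throws ReflectiveOperationException {
        Constructor<T> ctor = clazz.getDeclaredConstructor();
        ctor.setAccessible(true);
        T instance = ctor.newInstance();
        if (instance instanceof Configurable) {
            ((Configurable) instance).setConf(conf);
        }
        return instance;
    }

    public static void main(String[] args) throws Exception {
        Conf conf = new Conf();
        conf.set("flush.lower.bound", "16777216");
        SimpleFlushPolicy policy = newInstance(SimpleFlushPolicy.class, conf);
        policy.configureForRegion("region-1");
        System.out.println(policy.describe());
    }
}
```

Because the policy keeps the region it was configured with, a selection method on it would not need the list of stores passed in again.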

[~jeffreyz] I think the biggest problem is that this patch changes the flushSeqId generation.
flushSeqId will not be bumped if we do not flush all stores. I think the flushSeqId should
be called "highestFlushedToDiskSeqId" in this patch. And actually I do not know where we
use FlushMarker, so I do not know the meaning of flushSeqId in the Marker...


> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.9
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch,
HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch,
HBASE-10201_10.patch, HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, HBASE-10201_13.patch,
HBASE-10201_14.patch, HBASE-10201_15.patch, HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch,
HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch,
HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
> Currently the flush decision is made using the aggregate size of all column families.
When large and small column families co-exist, this causes many small flushes of the smaller
CF. We need to make per-CF flush decisions.
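
As a rough illustration of a per-CF decision, the sketch below picks only the column families whose memstore size exceeds a lower bound, and falls back to flushing all of them when none qualifies. The class name, the method name, the lowerBound parameter and the fallback rule are assumptions made for this example; they are not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerFamilyFlushSketch {
    // Returns the families to flush, given each family's memstore size in bytes.
    // Assumption: a family qualifies when its size is strictly above lowerBound.
    static List<String> selectFamiliesToFlush(Map<String, Long> memstoreSizes, long lowerBound) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : memstoreSizes.entrySet()) {
            if (entry.getValue() > lowerBound) {
                selected.add(entry.getKey());
            }
        }
        // Assumed fallback: if no family is large enough, flush every family.
        if (selected.isEmpty()) {
            selected.addAll(memstoreSizes.keySet());
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("large_cf", 120L * 1024 * 1024);
        sizes.put("small_cf", 2L * 1024 * 1024);
        // With a 16 MB lower bound only the large family is selected.
        System.out.println(selectFamiliesToFlush(sizes, 16L * 1024 * 1024));
    }
}
```

With an aggregate-size decision both families would be flushed together, which is what produces the many small flushes of the smaller CF.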

This message was sent by Atlassian JIRA