hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
Date Tue, 09 Dec 2014 06:53:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239062#comment-14239062
] 

stack commented on HBASE-10201:
-------------------------------

Here is some feedback on reading through latest version of the patch. Lets address these minor
items after we are sure the most important part is working, the sequenceid handling (I'm running
tests here but its taking a while -- first I need to prove that hbase 1.0 branch is healthy,
then I intro your patch... and [~jeffreyz], Mr SequenceId is going to take a look too).  So
hold on making a patch till testing and Jeffrey's review are done.

Sorry this is taking so long to get in.  On the one hand, you can tell we are excited about
getting this patch in because the improvement is really nice but it touches a very sensitive
part of hbase, the region sequenceid'ing, so we need to exercise extra caution.  Thanks for
you patience [~Apache9]

Nits to be addressed on commit or if you make a new version of the patch (you've done enough
as it is -- smile -- and I could do below on commit np).

I can go over the javadoc on commit. Small edit would fix it all up nicely.

Below is a nit that can be addressed in a follow-on:

This config is not general.  It belongs to a particular policy ("If FlushLargeStoresPolicy
is used..."): hbase.hregion.percolumnfamilyflush.size.lower.bound  Should probably have the
policy it is for in its name.  Maybe just don't mention is in hbase-default.xml.  Let uses
find it if they need it (16MB is a nice default low-bound).

It is odd that this is public:

public static Class<? extends FlushPolicy> getFlushPolicyClass(

It is nice that the master tests that we can load a policy but it does not even use flush
policy (if we fail to load fall back to default with big warning?)  And flush policies are
over in regionserver package so here we have master reaching over and into the regionserver
package. Would be good to avoid doing this x-package reach especially when it does not seem
to be needed.

I would think this would be an internal method for the factory to use?

Also in HTD, you call it getFlushPolicyClassName but here you call it getFlushPolicyClass...
would be good to be same.

This policy stuff you've added is nicer than what was here previous.  Good one.

Should these two strings just be the same?

FLUSH_SIZE_LOWER_BOUND_KEY and DEFAULT_FLUSH_SIZE_LOWER_BOUND even though they are read from
different places?  No harm the key being the same especially since in HTD, you hide the key
by providing getter/setters.

The FlushPolicy api is a little odd.  It implements Configured but where do you do a setConf
on it? Then in the configureForRegion method, you take a Region but all it is used for is
to emit region name on Strings and to get instance of HTableDescriptor.  The flush takes a
list of stores.  Can't it get them from the region it was given when configuredForRegion?
 This is a nit comment.  Ignore for now.

... Stopped at sequence id  changes.... will be back. Thanks.

> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.9
>
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch,
HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch,
HBASE-10201_10.patch, HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, HBASE-10201_13.patch,
HBASE-10201_14.patch, HBASE-10201_15.patch, HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch,
HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch,
HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
>
>
> Currently the flush decision is made using the aggregate size of all column families.
When large and small column families co-exist, this causes many small flushes of the smaller
CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message