hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload
Date Sat, 07 Sep 2013 01:08:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760836#comment-13760836
] 

stack commented on HBASE-5349:
------------------------------

I can see making the Interface public but do the implementations have to be public too?

-public class LruBlockCache implements BlockCache, HeapSize {
+public class LruBlockCache implements ResizableBlockCache, HeapSize {

They are not instantiated outside of the hfile package?

Call it DefaultHeapMemoryBalancer instead of DefaultHeapMemoryBalancerImpl I would say.

It looks like you change configuration names but you handle the case where users still have
the old names; that is good.

Nit: does the context need both blocked and unblocked? +    boolean memstoreSufficient = blockedFlushCount
== 0 && unblockedFlushCount == 0;

You have this comment twice '// Increase the block cache size and corresponding decrease in
memstore size'  One must be wrong (smile)

Reading DefaultHeapMemoryBalancerImpl, we keep stepping w/o regard for a max. Is max enforced
elsewhere?  If so, will DefaultHeapMemoryBalancerImpl keep asking to step though we are against
the max?  Maybe this is fine.  It makes the implementation 'easier' which is good.  Should
we log when we want to 'step'?  Or is that done outside of DefaultHeapMemoryBalancerImpl (which
would probably be better... again keep it simple)

Does the HeapMemoryManager have to have these two members?

+  private final MemStoreFlusher memStoreFlusher;
+  private final HRegionServer hrs;

Can it not take Interface that has just what it needs?  Else makes it hard to test this new
code in isolation.  At a minimum the HRS can be replaces by Service or RegionServerService
Interface?  Is that possible?  And FlushRequester instead of MemStoreFlusher?

Perhaps drop the 'Auto' prefix from AutoTunerContext and AutoTunerResult.  The users of these
Interfaces don't care if it Auto or not.  Ditto  here: HeapMemoryAutoTuner... call it HeapMemoryTuner.

These should be private since server side?

+@InterfaceAudience.Public
+@InterfaceStability.Evolving
+public interface HeapMemoryBalancer


Should there be a kill switch for the tuner?  You pass in absolute values and once set, it
stops balancing (for the case were the tuner turns pathological and an operator wants to turn
it off while the server is online).  We could do that in another issue np.

Should there be a 'damping' facility?  That is, should we run the check more often and only
make changes if we have been called ten times and on 8 or the 10 times, we judged we should
'step'?  That could be a different implementation I suppose?  Or conversely, do you think
there should be an 'emergency' chain that can be pulled when we need to change the configs
now?  (This latter is probably not a good idea -- at least not yet).

We need to get rid of Chore and have one thread only that does all these tasks -- we have
a load of them running now on each server -- or do them via executor... but that is out of
scope of this issue.

These do not need to be public classes +  public static final class AutoTunerContext { and
AutoTunerResult?

This seems like informational rather than a warn?

+          LOG.warn("Current heap configuration from HeapMemoryBalancer exceeds "
+              + "the threshold required for successful cluster operation. "
+              + "The combined value cannot exceed 0.8. " + MemStoreFlusher.MEMSTORE_SIZE_KEY
+              + " is " + memstoreSize + " and " + HConstants.HFILE_BLOCK_CACHE_SIZE_KEY +
" is "
+              + blockCacheSize);

Make one log rather than two since they change in lockstep:

+          LOG.info("Setting block cache heap size to " + newBlockCacheSize);

...

+          LOG.info("Setting memstore heap size to " + newMemstoreSize);

Do you need to shut it down?  +      startHeapMemoryManager();  Or it doesn't matter?

Does the interface need to be public

+public interface MemstoreFlushListener {

(Lars Francke went through and cleaned the public from all our Interface methods... etc.,
so would be nice not to undo his work).

Needs tests.  Hopefully you can change the above references to MemstoreFlusher and RegionServer
to be Interfaces so you do not need to spin up a cluster to test (you are almost there).

I am a big fan of this patch.  Good work Anoop.  Thanks for doing this.


                
> Automagically tweak global memstore and block cache sizes based on workload
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5349
>                 URL: https://issues.apache.org/jira/browse/HBASE-5349
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Anoop Sam John
>         Attachments: WIP_HBASE-5349.patch
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores)
and Block Cache based on the workload. If you need an image, scroll down at the bottom of
this link: http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message