hbase-issues mailing list archives

From "Enis Soztutar (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload
Date Wed, 18 Apr 2012 23:32:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257085#comment-13257085 ]

Enis Soztutar commented on HBASE-5349:
--------------------------------------

I have been thinking about this, and I think we can have a shot at a simple implementation.
Let me summarize what I have in mind before starting the implementation: 
Goals: 
 - Provide min and max heap percentages for the block cache (the memstore more or less has
these already). I think we should keep the min/max sanity bounds, and if they are equal,
disable auto-tuning.
 - Optimize the use of the available memory for changing workloads (mostly writes during the
day, a lot of reads once an MR job starts, etc). For example, about 10 minutes after a large
write job starts, region servers should have tuned themselves for a write workload.
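
To make the bounds idea concrete, here is a rough sketch. The class and method names are
hypothetical (nothing like this exists in HBase as far as this comment goes), and the
percentages in main are made-up examples, not proposed defaults.

```java
// Hypothetical sketch: names and values are illustrative, not HBase APIs.
final class HeapFractionBounds {
    private final float min;
    private final float max;

    HeapFractionBounds(float min, float max) {
        if (min < 0f || max > 1f || min > max) {
            throw new IllegalArgumentException(
                "need 0 <= min <= max <= 1, got min=" + min + ", max=" + max);
        }
        this.min = min;
        this.max = max;
    }

    /** Equal bounds leave no room to move, so auto-tuning is treated as disabled. */
    boolean autoTuningEnabled() {
        return min < max;
    }

    /** Pulls a proposed heap fraction back into [min, max]. */
    float clamp(float proposed) {
        return Math.max(min, Math.min(max, proposed));
    }

    public static void main(String[] args) {
        // Assumed example: block cache may use between 10% and 40% of the heap.
        HeapFractionBounds blockCache = new HeapFractionBounds(0.10f, 0.40f);
        System.out.println("auto-tuning enabled: " + blockCache.autoTuningEnabled());
        System.out.println("clamped request: " + blockCache.clamp(0.55f));
    }
}
```

With min == max the same object doubles as the "off switch", so no extra enable/disable
parameter would be needed.
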
Non-goals: 
 - find the optimum mem-utilization algorithm
 - introduce a bunch of other parameters, to get rid of the current ones
 - make it very experimental so that nobody enables it in production. 

Ideally, to optimize the usage of the available memory, we should predict the future workload
(possibly from past workload), and devise a model capturing all the costs associated with
block cache hits / misses, flushes, compactions, etc. But such a model would be very complex
to get right.

I have checked Hypertable's implementation, and it seems that they decide whether the load
is read-heavy or write-heavy by comparing counters against some hard-coded values, and then
increment/decrement the mem limits, much like what Zhihong proposes above. I also want to
start with something similar.
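
A minimal sketch of that kind of counter check might look like the following. The threshold
values and all names are my assumptions for illustration; they are not taken from
Hypertable's code or from HBase metrics.

```java
// Hypothetical sketch: thresholds and names are assumptions, not Hypertable's values.
final class WorkloadClassifier {
    enum Workload { READ_HEAVY, WRITE_HEAVY, BALANCED }

    // Below this many operations in a window, there is too little signal to act on.
    static final long MIN_OPS = 1_000;
    // Share of operations that must be writes (or reads) to call the window "heavy".
    static final double HEAVY_SHARE = 0.75;

    /** Classifies one observation window from its read and write counters. */
    static Workload classify(long reads, long writes) {
        long total = reads + writes;
        if (total < MIN_OPS) {
            return Workload.BALANCED;
        }
        double writeShare = (double) writes / total;
        if (writeShare >= HEAVY_SHARE) {
            return Workload.WRITE_HEAVY;
        }
        if (writeShare <= 1.0 - HEAVY_SHARE) {
            return Workload.READ_HEAVY;
        }
        return Workload.BALANCED;
    }

    public static void main(String[] args) {
        System.out.println(classify(2_000, 18_000));
        System.out.println(classify(18_000, 2_000));
        System.out.println(classify(10_000, 10_000));
    }
}
```

The BALANCED case matters: without it the limits would move on every check, even when the
counters give no clear signal.
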


Implementation layer: 
 - Currently the global memstore limit is a soft limit; we may have to make it a hard limit
(blocking writes).
 - we should enable incrementing / decrementing and setting global memstore and block cache
maximum limits. We do not have live configuration changes, but regardless of auto-tuning,
we should be able to manually set those online. 
 - Periodically we should check the recent workload (e.g. the past 10 min) and, depending on
whether it is write-heavy or read-heavy (from metrics), adjust the mem limits in small
increments.
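
The three points above could fit together roughly as below. This is only a sketch under
assumptions: the class, the 2% step, and the single shared min/max bound for both components
are all hypothetical simplifications, and the direction passed to tune() would come from
whatever inspects the metrics for the last window.

```java
// Hypothetical sketch: not existing HBase code; step size and bounds are assumptions.
final class MemoryLimitTuner {
    static final int STEP_PCT = 2; // assumed size of one small adjustment

    private final int minPct; // shared sanity bounds, in percent of heap
    private final int maxPct;
    private int memstorePct;
    private int blockCachePct;

    MemoryLimitTuner(int memstorePct, int blockCachePct, int minPct, int maxPct) {
        this.minPct = minPct;
        this.maxPct = maxPct;
        setLimits(memstorePct, blockCachePct);
    }

    private boolean inBounds(int pct) {
        return pct >= minPct && pct <= maxPct;
    }

    /** Manual online override; useful with or without auto-tuning. */
    synchronized void setLimits(int newMemstorePct, int newBlockCachePct) {
        if (!inBounds(newMemstorePct) || !inBounds(newBlockCachePct)) {
            throw new IllegalArgumentException("limit outside sanity bounds");
        }
        memstorePct = newMemstorePct;
        blockCachePct = newBlockCachePct;
    }

    /** One periodic step: direction > 0 favors writes, < 0 favors reads, 0 does nothing. */
    synchronized void tune(int direction) {
        int delta = Integer.signum(direction) * STEP_PCT;
        int newMemstore = memstorePct + delta;
        int newBlockCache = blockCachePct - delta;
        // Skip the step entirely if either side would leave its bounds.
        if (delta != 0 && inBounds(newMemstore) && inBounds(newBlockCache)) {
            memstorePct = newMemstore;
            blockCachePct = newBlockCache;
        }
    }

    /** Hard limit: writes block once memstore usage reaches the current limit. */
    synchronized boolean shouldBlockWrites(long memstoreBytes, long heapBytes) {
        return memstoreBytes * 100 >= heapBytes * memstorePct;
    }

    synchronized int memstorePct() { return memstorePct; }
    synchronized int blockCachePct() { return blockCachePct; }

    public static void main(String[] args) {
        MemoryLimitTuner tuner = new MemoryLimitTuner(40, 25, 10, 50);
        tuner.tune(+1); // pretend the last window was write-heavy
        System.out.println(tuner.memstorePct() + "% memstore, "
            + tuner.blockCachePct() + "% block cache");
    }
}
```

Moving the same amount out of one limit and into the other keeps the combined share of the
heap constant, so tuning by itself never increases total memory pressure.
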


What do you guys think? Still worth pursuing?
                
> Automagically tweak global memstore and block cache sizes based on workload
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5349
>                 URL: https://issues.apache.org/jira/browse/HBASE-5349
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.96.0
>
>
> Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores)
> and Block Cache based on the workload. If you need an image, scroll down at the bottom of
> this link: http://www.hypertable.com/documentation/architecture/
> That'd be one less thing to configure.


        
