Abhilash created HBASE-14058:

Summary: Stabilize heap memory tuner
Key: HBASE-14058
URL: https://issues.apache.org/jira/browse/HBASE-14058
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Abhilash
Assignee: Abhilash

The memory tuner works well in general, but under a workload that is both read heavy and
write heavy it performs too many tuning operations. We should limit the number of tuner
operations and stabilize the tuner. The main problem is that the tuner considers itself to be
in steady state after seeing just one neutral tuner period. As a result it performs too many
tuning operations and too many reverts, and does so with large step sizes (the step size was
reset to its maximum after even a single neutral period). To stop this, I propose these steps:
1) The band between μ - δ/2 and μ + δ/2 is too narrow. Statistically, ~62% of periods
will lie outside this range, which means 62% of the data points are considered either high
or low, which is too many. Use μ - δ*0.8 and μ + δ*0.8 instead. In expectation this will
decrease the number of tuner operations per 100 periods from 19 to just 10. With δ/2,
31% of data values are considered high and 31% are considered low (2*0.31*0.31 ~ 0.19);
with δ*0.8, 22% are considered low and 22% are considered high (2*0.22*0.22 ~ 0.10).
2) Define steady state properly by looking at the past few periods (the number of periods is
given by hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just the last
tuner operation. We say the tuner is in steady state when the last few tuner periods were all
NEUTRAL. We keep decreasing the step size until it is extremely low, then leave the system in
that state for some time.
3) Rather than decreasing the step size only when reverting the last period, decrease the
magnitude of the step size whenever we are trying to revert the tuning done in the last few
periods (sum the changes of the last few periods and compare with the current step). When its
magnitude gets too low, make the tuner steps NEUTRAL (no operation). This will cause the step
size to decrease continuously until we reach steady state. After that the tuning process
restarts (the tuner step size resets when we reach steady state).
4) The tuning done in the last few periods is computed as a decaying sum of past tuner steps,
with sign. This value is positive for an increase in memstore and negative for an increase in
block cache. We use this rather than the arithmetic mean in order to give more priority to
recent tuner steps.
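
The percentages in step 1 can be checked with a short calculation. This is only a sketch: it
assumes the per-period statistic is normally distributed with mean μ and standard deviation δ,
which the description does not state explicitly, and the function names are illustrative.
It reuses the 2*p*p expression from step 1 unchanged.

```python
import math


def upper_tail(k: float) -> float:
    """Fraction of samples above mu + k*delta, ASSUMING a normal distribution."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def expected_tunes_per_100(k: float) -> float:
    """Expected tuner operations per 100 periods, using the 2*p*p formula from step 1."""
    p = upper_tail(k)  # by symmetry the same fraction falls below mu - k*delta
    return 100.0 * 2.0 * p * p


for k in (0.5, 0.8):
    p = upper_tail(k)
    print(f"k={k}: high={p:.3f} low={p:.3f} outside={2 * p:.3f} "
          f"tunes/100={expected_tunes_per_100(k):.1f}")
```

Under the normality assumption the δ/2 band reproduces the 31%, 62% and 19 figures. For
δ*0.8 the exact tail is closer to 21% than 22%, so the expected count comes out near 9 rather
than 10; the conclusion that the wider band roughly halves the number of operations is the same.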

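Steps 2 to 4 can be sketched together as a small state machine. This is not the HBase
implementation: the class, its parameter names, and the default values (decay, shrink,
min_step, max_step) are hypothetical and chosen only for illustration. The lookup_periods
argument stands in for hbase.regionserver.heapmemory.autotuner.lookup.periods.

```python
from collections import deque

NEUTRAL = 0
INCREASE_MEMSTORE = 1       # positive sign, as in step 4
INCREASE_BLOCK_CACHE = -1   # negative sign, as in step 4


class TunerSketch:
    """Hypothetical model of the proposed step-size logic (all names are assumptions)."""

    def __init__(self, lookup_periods=5, max_step=0.08, min_step=0.005,
                 decay=0.5, shrink=0.5):
        self.max_step = max_step
        self.min_step = min_step
        self.decay = decay      # weight applied to the previous decaying sum
        self.shrink = shrink    # factor applied to the step size on a revert
        self.step = max_step
        self.decayed_sum = 0.0  # signed, decaying sum of past steps (step 4)
        self.recent = deque(maxlen=lookup_periods)

    def in_steady_state(self):
        # Step 2: steady only when ALL of the last lookup_periods were NEUTRAL.
        return (len(self.recent) == self.recent.maxlen
                and all(d == NEUTRAL for d in self.recent))

    def tune(self, direction):
        """Record one tuner period and return the signed step that was applied."""
        # Step 3: a request that opposes the recent net tuning is a revert,
        # so shrink the step size.
        if direction != NEUTRAL and direction * self.decayed_sum < 0:
            self.step *= self.shrink
        # Step 3: once the step is too small, do nothing until steady state.
        if self.step < self.min_step:
            direction = NEUTRAL
        applied = direction * self.step
        # Step 4: recent steps weigh more than older ones.
        self.decayed_sum = self.decay * self.decayed_sum + applied
        self.recent.append(direction)
        # Steps 2 and 3: reaching steady state resets the step size.
        if self.in_steady_state():
            self.step = self.max_step
        return applied
```

For example, with lookup_periods=3 and min_step=0.015, the request sequence +1, +1, -1, +1,
-1, -1 shrinks the step from 0.08 to 0.04, 0.02 and then 0.01, at which point the tuner goes
NEUTRAL. After three NEUTRAL periods the step size is restored to 0.08 and tuning restarts.
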
