Date: Tue, 14 Jul 2015 17:59:05 +0000 (UTC)
From: "Lars Hofhansl (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-14058) Stabilizing default heap memory tuner

    [ https://issues.apache.org/jira/browse/HBASE-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626773#comment-14626773 ]

Lars Hofhansl commented on HBASE-14058:
---------------------------------------

bq. IMO it won't be a good idea to use it for stat calculation as it will tamper with the actual trend and stats.

Hmm... We want to capture the trend and not the current state. I.e., if we trend towards write load for a while then we tune for write loads; if we trend towards read loads we tune for that, but only after a bit.

Maybe I'm missing something. Lemme actually look at the patch.
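To make the trend-versus-current-state point concrete, here is a minimal, illustrative sketch (not HBase code; the class name, method names, and decay factor are all invented for this example) of a decaying, signed statistic: a single read- or write-heavy period barely moves it, while a sustained run does.

{code:java}
/**
 * Illustration only: a decaying, signed trend statistic. Recent periods
 * count more than older ones, so one isolated spike fades quickly while a
 * sustained read- or write-heavy run builds up a clear trend.
 */
public class DecayingTrend {
  private final double decay;  // e.g. 0.5: each older period counts half as much
  private double trend = 0.0;  // positive ~ write pressure, negative ~ read pressure

  public DecayingTrend(double decay) {
    this.decay = decay;
  }

  /** signal > 0 for a write-heavy period, < 0 for a read-heavy one, 0 for neutral. */
  public void addPeriod(double signal) {
    trend = decay * trend + signal;
  }

  public double getTrend() {
    return trend;
  }

  public static void main(String[] args) {
    DecayingTrend t = new DecayingTrend(0.5);
    t.addPeriod(1.0);  // one isolated write-heavy period...
    t.addPeriod(0.0);
    t.addPeriod(0.0);
    System.out.println("after a single spike: " + t.getTrend());   // 0.25, mostly faded
    for (int i = 0; i < 5; i++) {
      t.addPeriod(1.0); // ...versus a sustained write-heavy run
    }
    System.out.println("after a sustained run: " + t.getTrend());  // ~1.95, close to the limit of 2.0
  }
}
{code}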
> Stabilizing default heap memory tuner
> -------------------------------------
>
>                 Key: HBASE-14058
>                 URL: https://issues.apache.org/jira/browse/HBASE-14058
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 2.0.0, 1.2.0, 1.3.0
>            Reporter: Abhilash
>            Assignee: Abhilash
>        Attachments: HBASE-14058-v1.patch, HBASE-14058.patch, after_modifications.png, before_modifications.png
>
>
> The memory tuner works well in general cases, but when the workload is both read heavy and write heavy the tuner does too much tuning. We should try to control the number of tuner operations and stabilize it. The main problem was that the tuner considers itself in steady state even if it sees just one neutral tuner period, and thus does too many tuning operations and too many reverts, with large step sizes at that (the step size was set to the maximum even after one neutral period). To stop this I have thought of these steps:
> 1) The band created by μ + δ/2 and μ - δ/2 is too narrow. Statistically ~62% of periods will lie outside this range, which means 62% of the data points are considered either high or low, which is too much. Use μ + δ*0.8 and μ - δ*0.8 instead. In expectation this decreases the number of tuner operations per 100 periods from 19 to just 10. If we use δ/2 then 31% of the data values will be considered high and 31% low (2*0.31*0.31 = 0.19); if we use δ*0.8 then 22% will be low and 22% will be high (2*0.22*0.22 ~ 0.10). A numeric check of these fractions is sketched after the quoted description.
> 2) Define a proper steady state by looking at the past few periods (equal to hbase.regionserver.heapmemory.autotuner.lookup.periods) rather than just the last tuner operation. We say the tuner is in steady state when the last few tuner periods were NEUTRAL. We keep decreasing the step size unless it is extremely low, then leave the system in that state for some time.
> 3) Rather than decreasing the step size only while reverting, decrease its magnitude whenever we are trying to revert tuning done in the last few periods (sum the changes of the last few periods and compare to the current step) rather than just looking at the last period. When its magnitude gets too low, make the tuner step NEUTRAL (no operation). This causes the step size to decrease continuously until we reach steady state. After that the tuning process restarts (the tuner step size resets when we reach steady state).
> 4) The tuning done in the last few periods is tracked as a decaying sum of past tuner steps with sign. This parameter is positive for an increase in memstore and negative for an increase in block cache. We use this rather than an arithmetic mean to give more weight to recent tuner steps. A sketch of this step-size and steady-state logic also follows below.
> Please see the attachments. One shows the size of the memstore (green) and of the block cache (blue) as adjusted by the tuner without these modifications, the other with the modifications. The x-axis is time and the y-axis is the fraction of heap memory available to the memstore and block cache at that time (it always sums to 80%). I configured the min/max ranges for both components to 0.1 and 0.7 respectively (so in the plots the y-axis min and max are 0.1 and 0.7). In both cases the tuner tries to distribute memory by giving ~15% to the memstore and ~65% to the block cache, but the modified tuner does it much more smoothly.
> I got these results from a YCSB test doing approximately 5000 inserts and 500 reads per second (for one region server). The results can be fine tuned further, and the number of tuner operations reduced, with these configuration changes (shown as code at the end of this message):
> For more fine tuning:
> a) lower the max step size (suggested = 4%)
> b) lower the min step size (the default is also fine)
> To further decrease the frequency of tuning operations:
> c) increase the number of lookup periods (in the tests it was just 10, the default is 60)
> d) increase the tuner period (in the tests it was just 20 secs, the default is 60 secs)
> I used a smaller tuner period / fewer lookup periods to get more data points.
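A quick numeric check of the tail fractions used in step 1), assuming the per-period statistic is roughly normal with mean μ and standard deviation δ (which is the reading the quoted percentages imply). Apache Commons Math is used here only to make the arithmetic reproducible; the issue itself does not reference it.

{code:java}
import org.apache.commons.math3.distribution.NormalDistribution;

/**
 * Reproduces the percentages from step 1), assuming the per-period statistic
 * is approximately N(mu, delta^2). Commons Math supplies the normal CDF; it
 * is not implied to be used by the patch itself.
 */
public class TunerThresholdCheck {
  public static void main(String[] args) {
    NormalDistribution stdNormal = new NormalDistribution(); // standard normal

    for (double k : new double[] {0.5, 0.8}) {
      double tail = 1.0 - stdNormal.cumulativeProbability(k); // P(X > mu + k*delta)
      double outside = 2.0 * tail;                            // P(|X - mu| > k*delta)
      System.out.printf("k = %.1f: %.0f%% high, %.0f%% low, %.0f%% outside, 2*p*p = %.2f%n",
          k, 100 * tail, 100 * tail, 100 * outside, 2 * tail * tail);
    }
    // Prints (rounded):
    //   k = 0.5: 31% high, 31% low, 62% outside, 2*p*p = 0.19
    //   k = 0.8: 21% high, 21% low, 42% outside, 2*p*p = 0.09
    // i.e. essentially the ~62%/19 and ~22%/~0.10 figures quoted above.
  }
}
{code}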
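The step-size and steady-state rules in steps 2)-4) can be summarized in a few lines. This is an illustrative sketch only: all names and constants are invented here, and the actual patch against the default heap memory tuner differs in detail.

{code:java}
/**
 * Illustrative sketch of steps 2)-4): shrink the step when it reverts recent
 * tuning, go NEUTRAL when the step becomes negligible, and reset once a run
 * of NEUTRAL periods signals steady state. Names and constants are invented.
 */
public class StepSizeSketch {
  private static final double MAX_STEP = 0.04;   // 4%, the suggested max step size
  private static final double MIN_STEP = 0.005;  // below this the step is treated as NEUTRAL

  private final int lookupPeriods;               // hbase.regionserver.heapmemory.autotuner.lookup.periods
  private double stepSize = MAX_STEP;
  private double decayingSumOfSteps = 0.0;       // step 4: positive = memstore grew recently
  private int consecutiveNeutralPeriods = 0;

  public StepSizeSketch(int lookupPeriods) {
    this.lookupPeriods = lookupPeriods;
  }

  /**
   * direction: +1 to grow the memstore, -1 to grow the block cache, 0 for a
   * neutral period. Returns the signed step to apply (0 means no tuning).
   */
  public double nextStep(int direction) {
    if (direction == 0) {
      return neutralPeriod();
    }
    consecutiveNeutralPeriods = 0;
    // Step 3: if this step would undo what the last few periods did, shrink it.
    if (direction * decayingSumOfSteps < 0) {
      stepSize /= 2;
    }
    if (stepSize < MIN_STEP) {
      return neutralPeriod();  // magnitude too low to matter: treat as NEUTRAL
    }
    double step = direction * stepSize;
    // Step 4: decaying, signed memory of recent steps (recent steps dominate).
    decayingSumOfSteps = decayingSumOfSteps / 2 + step;
    return step;
  }

  private double neutralPeriod() {
    consecutiveNeutralPeriods++;
    // Step 2: only a run of NEUTRAL periods counts as steady state; once there,
    // the step size resets so the next sustained imbalance is handled quickly.
    if (consecutiveNeutralPeriods >= lookupPeriods) {
      stepSize = MAX_STEP;
      decayingSumOfSteps = 0.0;
      consecutiveNeutralPeriods = 0;
    }
    return 0.0;
  }
}
{code}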
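For completeness, the fine-tuning suggestions a)-d) expressed as configuration overrides. Only the lookup-periods key is named in the issue text; the other three property names (and the assumption that the tuner period is set in milliseconds) are guesses for illustration and should be checked against the HBase version in use.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

/**
 * Fine-tuning suggestions a)-d) as configuration overrides. Keys marked
 * "assumed" are not named in the issue and may differ between versions.
 */
public class TunerConfigExample {
  public static Configuration tunedConf() {
    Configuration conf = HBaseConfiguration.create();
    // a) lower the maximum tuner step size to the suggested 4% (assumed key)
    conf.setFloat("hbase.regionserver.heapmemory.autotuner.step.max", 0.04f);
    // b) the minimum step size default is fine; shown only for completeness (assumed key and value)
    conf.setFloat("hbase.regionserver.heapmemory.autotuner.step.min", 0.005f);
    // c) more lookup periods = slower but steadier reaction (default 60 per the issue text)
    conf.setInt("hbase.regionserver.heapmemory.autotuner.lookup.periods", 60);
    // d) longer tuner period, default 60 seconds per the issue text (assumed key, assumed millis)
    conf.setLong("hbase.regionserver.heapmemory.tuner.period", 60000L);
    return conf;
  }
}
{code}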