Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Thu, 9 Nov 2017 01:24:00 +0000 (UTC)
From: "Andrew Purtell (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12992573.1469537512000.189313.1510190640781@Atlassian.JIRA>
In-Reply-To: <JIRA.12992573.1469537512000@Atlassian.JIRA>
References: <JIRA.12992573.1469537512000@Atlassian.JIRA> <JIRA.12992573.1469537512913@jira-lw-us.apache.org>
Subject: [jira] [Updated] (HBASE-16287) LruBlockCache size should not exceed
 acceptableSize too many
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
archived-at: Thu, 09 Nov 2017 01:24:08 -0000


     [ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-16287:
-----------------------------------
    Fix Version/s:     (was: 1.4.0)

> LruBlockCache size should not exceed acceptableSize too many
> ------------------------------------------------------------
>
>                 Key: HBASE-16287
>                 URL: https://issues.apache.org/jira/browse/HBASE-16287
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache
>            Reporter: Yu Sun
>            Assignee: Yu Sun
>             Fix For: 2.0.0, 1.3.0, 1.2.3
>
>         Attachments: HBASE-16287-v1.patch, HBASE-16287-v2.patch, HBASE-16287-v3.patch, HBASE-16287-v4.patch, HBASE-16287-v5.patch, HBASE-16287-v6.patch, HBASE-16287-v7.patch, HBASE-16287-v8.patch, HBASE-16287-v9.patch
>
>
> Our regionserver has the following configuration:
>   -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We also use only the blockcache, and set hfile.block.cache.size = 0.3 in hbase-site.xml, so under this configuration the LRU block cache size will be (32g-1g)*0.3=9.3g. But in some scenarios, some of the regionservers run continuous FullGC for hours and, most importantly, after a FullGC most of the objects in the old generation are not collected. So we dumped the heap and analysed it with MAT, and we observed an obvious memory leak in LruBlockCache, which occupies about 16g of memory. We then set the log level of class LruBlockCache to TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, , cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375{quote}
> We can see that the blockcache size has far exceeded acceptableSize, which makes the FullGC problem more serious.
> After some investigation, I found that in this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>       final boolean cacheDataInL1) {
> {code}
> No matter how much of the blockcache is already in use, the block is simply put into it. But if the eviction thread is not fast enough, the blockcache size will grow significantly.
> So I think we should add a check here: for example, if the blockcache size > 1.2 * acceptableSize(), just return and don't put the block into the cache until the blockcache size is back under the watermark. If this is reasonable, I can make a small patch for this.

