Date: Wed, 27 Jul 2016 08:33:20 +0000 (UTC)
From: "Yu Sun (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-16287) BlockCache size should not exceed acceptableSize too many


    [ https://issues.apache.org/jira/browse/HBASE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395233#comment-15395233 ]

Yu Sun commented on HBASE-16287:
--------------------------------

{quote}
Why -1g? We calc the BC size by conf xmx value * BC percentage.
{quote}
Under this JVM configuration (-Xmn4g -XX:SurvivorRatio=2), each survivor space is 4g/(2+1+1) = 1g. At any time (except during a young GC and during some full GCs that are not CMS), at least one of the two survivor spaces is empty and contains no objects. So when we ask the JVM for the maximum heap size, it returns Xmx minus the size of one survivor space.
{code:borderStyle=solid}
  public static synchronized BlockCache instantiateBlockCache(Configuration conf) {
    if (GLOBAL_BLOCK_CACHE_INSTANCE != null) return GLOBAL_BLOCK_CACHE_INSTANCE;
    if (blockCacheDisabled) return null;
    MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    LruBlockCache l1 = getL1(conf, mu);
{code}
{code:borderStyle=solid}
  static long getLruCacheSize(final Configuration conf, final MemoryUsage mu) {
    float cachePercentage = conf.getFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
      HConstants.HFILE_BLOCK_CACHE_SIZE_DEFAULT);
    if (cachePercentage <= 0.0001f) {
      blockCacheDisabled = true;
      return -1;
    }
    if (cachePercentage > 1.0) {
      throw new IllegalArgumentException(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY +
        " must be between 0.0 and 1.0, and not > 1.0");
    }

    // Calculate the amount of heap to give the heap.
    return (long) (mu.getMax() * cachePercentage);
  }
{code}
The code above is how HBase computes the block cache size. The key point is how mu.getMax() is calculated.
mu itself is returned by the following JNI call:
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/file/58e586f18da6/src/share/native/sun/management/MemoryImpl.c
{code:borderStyle=solid}
JNIEXPORT jobject JNICALL Java_sun_management_MemoryImpl_getMemoryUsage0
  (JNIEnv *env, jobject dummy, jboolean heap) {
    return jmm_interface->GetMemoryUsage(env, heap);
}
{code}
GetMemoryUsage(env, heap) is implemented in the JVM, in the file:
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/services/management.cpp
Part of this function's implementation is listed below:
{code:borderStyle=solid}
// Returns a java/lang/management/MemoryUsage object representing
// the memory usage for the heap or non-heap memory.
JVM_ENTRY(jobject, jmm_GetMemoryUsage(JNIEnv* env, jboolean heap))
  ResourceMark rm(THREAD);

  // Calculate the memory usage
  size_t total_init = 0;
  size_t total_used = 0;
  size_t total_committed = 0;
  size_t total_max = 0;
  bool has_undefined_init_size = false;
  bool has_undefined_max_size = false;
  ..................................
  ..................................
  MemoryUsage usage((heap ? InitialHeapSize : total_init),
                    total_used,
                    total_committed,
                    (heap ?
                      Universe::heap()->max_capacity() : total_max));

  Handle obj = MemoryService::create_MemoryUsage_obj(usage, CHECK_NULL);
  return JNIHandles::make_local(env, obj());
JVM_END
{code}
According to the constructor of MemoryUsage, the _maxSize field is initialized from Universe::heap()->max_capacity(), which is also implemented in the JVM. Taking the CMS collector as an example (PS and G1 are almost the same):
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/genCollectedHeap.cpp
{code:borderStyle=solid}
size_t GenCollectedHeap::max_capacity() const {
  size_t res = 0;
  for (int i = 0; i < _n_gens; i++) {
    res += _gens[i]->max_capacity();
  }
  return res;
}
{code}
In the code above, _n_gens is 2, representing the two generations (young and old), and max_capacity() is a virtual call. For the young generation under CMS, max_capacity() is implemented in:
http://hg.openjdk.java.net/jdk7u/jdk7u/hotspot/file/b9b4bc1e05e2/src/share/vm/memory/defNewGeneration.cpp
{code:borderStyle=solid}
size_t DefNewGeneration::max_capacity() const {
  const size_t alignment = GenCollectedHeap::heap()->collector_policy()->min_alignment();
  const size_t reserved_bytes = reserved().byte_size();
  return reserved_bytes - compute_survivor_size(reserved_bytes, alignment);
}
{code}
reserved_bytes is just the Xmn we set, so here we can see that the JVM calculates the young generation's max_capacity as Xmn minus the size of one survivor space.
Also, under CMS the adaptive size policy is explicitly disabled in the JVM, so the two survivor spaces are always the same size.

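
To make the arithmetic above concrete, here is a minimal Java sketch. It is not HBase or JVM code: the class HeapMaxSketch and its method names are hypothetical, and it assumes the survivor size is simply Xmn / (SurvivorRatio + 2), ignoring the alignment that compute_survivor_size applies. The main method also prints what the running JVM reports, so the two can be compared on a real regionserver configuration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper (not part of HBase): reproduces the heap-size arithmetic discussed above. */
final class HeapMaxSketch {
    static final long GB = 1024L * 1024L * 1024L;

    /** Assumed size of one survivor space: Xmn / (SurvivorRatio + 2), alignment ignored. */
    static long survivorSize(long xmnBytes, int survivorRatio) {
        return xmnBytes / (survivorRatio + 2);
    }

    /** Max heap as a generational heap would report it: Xmx minus one survivor space. */
    static long reportedMaxHeap(long xmxBytes, long xmnBytes, int survivorRatio) {
        return xmxBytes - survivorSize(xmnBytes, survivorRatio);
    }

    /** Same formula as getLruCacheSize: reported max heap * cache percentage (float math). */
    static long lruCacheSize(long reportedMaxBytes, float cachePercentage) {
        return (long) (reportedMaxBytes * cachePercentage);
    }

    public static void main(String[] args) {
        // Values from the issue: -Xmx32g -Xmn4g -XX:SurvivorRatio=2, hfile.block.cache.size = 0.3
        long expectedMax = reportedMaxHeap(32 * GB, 4 * GB, 2);
        System.out.println("expected max heap (bytes): " + expectedMax);
        System.out.println("expected LRU cache size (bytes): " + lruCacheSize(expectedMax, 0.3f));

        // What the JVM running this sketch actually reports.
        MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("mu.getMax() on this JVM: " + mu.getMax());
        System.out.println("Runtime.maxMemory() on this JVM: " + Runtime.getRuntime().maxMemory());
    }
}
```

Running it with the same -Xmx, -Xmn and -XX:SurvivorRatio flags as the regionserver should show whether mu.getMax() matches the Xmx-minus-one-survivor estimate for the collector in use.
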
> BlockCache size should not exceed acceptableSize too many
> ---------------------------------------------------------
>
>                 Key: HBASE-16287
>                 URL: https://issues.apache.org/jira/browse/HBASE-16287
>             Project: HBase
>          Issue Type: Improvement
>          Components: BlockCache
>            Reporter: Yu Sun
>
> Our regionserver is configured as below:
> -Xmn4g -Xms32g -Xmx32g -XX:SurvivorRatio=2 -XX:+UseConcMarkSweepGC
> We use only the block cache and set hfile.block.cache.size = 0.3 in hbase-site.xml, so under this configuration the LRU block cache size will be (32g-1g)*0.3 = 9.3g. But in some scenarios, some of the regionservers run continuous full GCs for hours and, most importantly, after a full GC most of the objects in the old generation are not collected. So we dumped the heap and analysed it with MAT, and observed an obvious memory leak in LruBlockCache, which occupied about 16g of memory. We then set the log level of class LruBlockCache to TRACE and observed this in the log:
> {quote}
> 2016-07-22 12:17:58,158 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=15.29 GB, freeSize=-5.99 GB, max=9.30 GB, blockCount=628182, accesses=101799469125, hits=93517800259, hitRatio=91.86%, , cachingAccesses=99462650031, cachingHits=93468334621, cachingHitsRatio=93.97%, evictions=238199, evicted=4776350518, evictedPerRun=20051.93359375{quote}
> We can see that the block cache size has exceeded acceptableSize by far too much, which makes the full GCs even worse.
> After some investigation, I found that in this function:
> {code:borderStyle=solid}
>   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory,
>       final boolean cacheDataInL1) {
> {code}
> the block is put into the cache no matter how much of the block cache is already used. If the eviction thread is not fast enough, the block cache size will increase significantly.
> So I think we should add a check here: for example, if the block cache size > 1.2 * acceptableSize(), just return and do not cache the block until the block cache size is back under the watermark. If this is reasonable, I can make a small patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
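
The check proposed in the issue description could look roughly like the sketch below. This is only an illustration under stated assumptions, not the real LruBlockCache and not the eventual patch: BoundedCacheSketch, HARD_LIMIT_FACTOR and the String/byte[] types are hypothetical stand-ins, and evict() stands in for the eviction thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified cache; it only illustrates the proposed hard-limit check. */
final class BoundedCacheSketch {
    /** Assumed factor, taken from the "1.2 * acceptableSize" suggestion in the issue. */
    static final float HARD_LIMIT_FACTOR = 1.2f;

    private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();
    private final AtomicLong size = new AtomicLong();
    private final long acceptableSize;

    BoundedCacheSketch(long acceptableSize) {
        this.acceptableSize = acceptableSize;
    }

    /** Returns true if the call was accepted, false if it was rejected by the hard limit. */
    boolean cacheBlock(String key, byte[] block) {
        long hardLimit = (long) (acceptableSize * HARD_LIMIT_FACTOR);
        if (size.get() > hardLimit) {
            // Eviction has fallen behind: skip caching instead of growing further.
            return false;
        }
        if (map.putIfAbsent(key, block) == null) {
            size.addAndGet(block.length);
        }
        return true;
    }

    /** Stand-in for the eviction thread: removes one entry and releases its size. */
    void evict(String key) {
        byte[] removed = map.remove(key);
        if (removed != null) {
            size.addAndGet(-removed.length);
        }
    }

    long size() {
        return size.get();
    }

    public static void main(String[] args) {
        BoundedCacheSketch cache = new BoundedCacheSketch(100);
        for (String key : new String[] {"a", "b", "c", "d"}) {
            boolean accepted = cache.cacheBlock(key, new byte[50]);
            System.out.println(key + " accepted=" + accepted + " size=" + cache.size());
        }
    }
}
```

Because the size is checked before the insert, the cache can still overshoot the hard limit by up to one block per concurrent caller; the sketch bounds growth rather than enforcing an exact cap.
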