Date: Tue, 21 Mar 2017 20:09:41 +0000 (UTC)
From: "Vladimir Rodionov (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17739) BucketCache is inefficient/wasteful/dumb in its bucket allocations

[ https://issues.apache.org/jira/browse/HBASE-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935224#comment-15935224 ]

Vladimir Rodionov commented on HBASE-17739:
-------------------------------------------

{quote}
What was ur math V?
{quote}

BucketEntry is actually 80 bytes and BucketCacheKey is 48. Besides that, you need to add the ConcurrentHashMap Map.Entry overhead for every block in the backing map, which is 48 bytes more => for backingMap alone we have 168 bytes of overhead.

blocksByHFile, which is a ConcurrentSkipListSet, adds another 48 bytes (for its Map.Entry).

Total: 216 bytes already.

That is not the 500 I posted above (I had miscalculated some overhead in IdReadWriteLock), but it is nevertheless quite substantial.
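For a sense of scale, here is a minimal back-of-envelope sketch (an editorial addition, not from the ticket) of what that per-block bookkeeping adds up to. The 216-byte constant is simply the total quoted above, the block counts are illustrative examples, and real per-object sizes depend on the JVM, compressed oops, and the HBase version.

{code:java}
// Back-of-envelope sketch (not HBase code): multiply the ~216 bytes of
// per-block metadata quoted above by a hypothetical number of cached blocks
// to see the total on-heap cost of the bookkeeping structures.
public class BucketCacheMetadataEstimate {

  // Total quoted above: BucketEntry + cache key + backingMap entry +
  // blocksByHFile entry. Actual sizes depend on JVM layout and HBase version.
  static final long PER_BLOCK_METADATA_BYTES = 216;

  public static void main(String[] args) {
    // Example block counts: a 100G bucket cache holds ~1.6M blocks at 64K
    // per block, or ~20M blocks at 5K per block.
    long[] blockCounts = {1_600_000L, 20_000_000L};
    for (long blocks : blockCounts) {
      double gb = blocks * PER_BLOCK_METADATA_BYTES / (1024.0 * 1024 * 1024);
      System.out.printf("%,d cached blocks -> ~%.2f GB of on-heap metadata%n", blocks, gb);
    }
  }
}
{code}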
> BucketCache is inefficient/wasteful/dumb in its bucket allocations
> ------------------------------------------------------------------
>
>                 Key: HBASE-17739
>                 URL: https://issues.apache.org/jira/browse/HBASE-17739
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BucketCache
>            Reporter: stack
>
> By default we allocate 14 buckets with sizes from 5K to 513K. If lots of heap is given over to the bucketcache and, say, no allocations are made for a particular bucket size, then a bunch of the bucketcache just goes idle/unused.
> For example, say the heap is 100G. We'll divide it up among the sizes. If we only ever do 5K records, then most of the cache will go unused while the allocation for 5K objects will see churn.
> Here is an old note of [~anoop.hbase]'s from an offlist conversation on bucket cache that describes the issue:
> "By default we have those 14 buckets with a size range of 5K to 513K. All sizes will have one bucket (with size 513*4) each except the last size, i.e. there will be many 513K-sized buckets. If we keep writing only same-sized blocks, we may lose all the in-between sized buckets. Say we write only 4K-sized blocks. We will first fill the one 5K bucket. Once it is filled, we will try to grab a completely free bucket from the other sizes. But we cannot take it from the 9K ... 385K sized ones, as there is only ONE bucket for each of those sizes. We will take only from the 513K size; there are many of those. So we will eventually take all the buckets from 513K except the last one. Yes, it has to keep at least one in every size, so we lose that much space; it is of no use."
> We should set the size type on the fly as the records come in.
> Or better, we should choose the record size on the fly. Here is another comment from [~anoop.hbase]:
> "The second is the biggest contributor. Suppose instead of 4K-sized blocks, the user has 2K-sized blocks. When we write a block to a bucket slot, we reserve space equal to the slot size allocated for that block. So when we write 2K-sized blocks (the actual size may be a bit more than 2K) we take 5K for each block. So you can see that we are losing ~3K with every block. That means we are losing more than half."
> He goes on: "If I am 100% sure that all my tables have a 2K HFile block size, I need to give this config a value of 3 * 1024 (if I give exactly 2K there may again be a problem! That is another story; we need to see how we can give a stronger guarantee for the block size restriction, HBASE-15248). So here also we lose ~1K for every 2K, something like a 30% loss !!! :-("
> So, we should figure out the record sizes ourselves on the fly.
> Anything less has us wasting loads of cache space, never mind the inefficiencies lost because of how we serialize base types to cache.
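To make the slot-waste arithmetic above concrete, here is a small sketch (an editorial addition, not from the ticket). A block is placed in the smallest configured bucket size that fits it and the rest of the slot is lost; the size array below mirrors the 14 defaults the issue describes (5K up to 513K), but the authoritative list lives in BucketAllocator, so treat it as illustrative.

{code:java}
// Sketch of the slot waste described above (not HBase code). The bucket sizes
// mirror the 14 defaults the issue describes (5K..513K); check BucketAllocator
// for the authoritative values.
public class BucketSlotWaste {

  static final int[] BUCKET_SIZES_BYTES = {
      5 * 1024, 9 * 1024, 17 * 1024, 33 * 1024, 41 * 1024, 49 * 1024, 57 * 1024,
      65 * 1024, 97 * 1024, 129 * 1024, 193 * 1024, 257 * 1024, 385 * 1024, 513 * 1024
  };

  /** Smallest bucket size that can hold a block of the given length. */
  static int slotFor(int blockBytes) {
    for (int size : BUCKET_SIZES_BYTES) {
      if (size >= blockBytes) {
        return size;
      }
    }
    throw new IllegalArgumentException("block larger than the largest bucket: " + blockBytes);
  }

  public static void main(String[] args) {
    // The 2K-block example from the discussion: it lands in a 5K slot,
    // so ~3K of every slot is wasted (more than half).
    int block = 2 * 1024;
    int slot = slotFor(block);
    System.out.printf("block=%dB slot=%dB wasted=%dB (%.0f%% of the slot)%n",
        block, slot, slot - block, 100.0 * (slot - block) / slot);
  }
}
{code}

Running this prints block=2048B slot=5120B wasted=3072B (60% of the slot), which matches the "losing more than half" figure from the quoted discussion.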