hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Questions about BucketCache
Date Fri, 27 Feb 2015 18:09:12 GMT
On Fri, Feb 27, 2015 at 1:25 AM, 郝东 <donhoff_h@163.com> wrote:

> Hi,
> I am learning BucketCache with HBase0.98 and have a few questions about
> it. Could anyone help me ?
> 1.Since this kind of Cache divides the memory into many buckets, what is
> the default size of a Bucket? And how to config the size of a Bucket ?

From CacheConfig, you would use "hbase.bucketcache.bucket.sizes" to specify
bucket sizes. By default this config is unset, so we fall back to the
DEFAULT_BUCKET_SIZES from BucketAllocator:

  // Default block size is 64K, so we choose more sizes near 64K, you'd
  // reset it according to your cluster's block size distribution
  // TODO Support the view of block size distribution statistics
  private static final int DEFAULT_BUCKET_SIZES[] = { 4 * 1024 + 1024, 8 * 1024 + 1024,
      16 * 1024 + 1024, 32 * 1024 + 1024, 40 * 1024 + 1024, 48 * 1024 + 1024,
      56 * 1024 + 1024, 64 * 1024 + 1024, 96 * 1024 + 1024, 128 * 1024 + 1024,
      192 * 1024 + 1024, 256 * 1024 + 1024, 384 * 1024 + 1024,
      512 * 1024 + 1024 };
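
To make the sizes above concrete, here is a minimal standalone sketch (not
the BucketAllocator code itself) of how a block could be matched to the
smallest bucket size that holds it. The class and method names are
hypothetical, and parseSizes assumes the property value is written as a
comma-separated list of byte counts; check that against your HBase version.

```java
import java.util.Arrays;

public class BucketSizeSketch {
    // Same values as the DEFAULT_BUCKET_SIZES quoted above (block size + 1K).
    static final int[] DEFAULT_SIZES = {
        4 * 1024 + 1024, 8 * 1024 + 1024, 16 * 1024 + 1024, 32 * 1024 + 1024,
        40 * 1024 + 1024, 48 * 1024 + 1024, 56 * 1024 + 1024, 64 * 1024 + 1024,
        96 * 1024 + 1024, 128 * 1024 + 1024, 192 * 1024 + 1024,
        256 * 1024 + 1024, 384 * 1024 + 1024, 512 * 1024 + 1024 };

    /** Returns the smallest size in sortedSizes that fits blockBytes, or -1 if none does. */
    public static int smallestFit(int[] sortedSizes, int blockBytes) {
        for (int size : sortedSizes) {
            if (blockBytes <= size) {
                return size;
            }
        }
        return -1; // block is larger than the largest bucket size
    }

    /** Hypothetical helper: parses "5120, 66560" style text into a sorted int array. */
    public static int[] parseSizes(String csv) {
        int[] sizes = Arrays.stream(csv.split(","))
            .map(String::trim)
            .mapToInt(Integer::parseInt)
            .toArray();
        Arrays.sort(sizes);
        return sizes;
    }

    public static void main(String[] args) {
        // A 64K block plus a little overhead still fits the 64K+1K size.
        System.out.println(smallestFit(DEFAULT_SIZES, 64 * 1024 + 100));
        // A custom list, as it might be written in the config value (assumed format).
        System.out.println(Arrays.toString(parseSizes("66560, 5120, 9216")));
    }
}
```

The extra 1024 on each default leaves headroom so that a block slightly
over its nominal size still lands in the bucket size you would expect.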

> 2.How to config the total size of the BucketCache?

From HConstants:


  /**
   * When using bucket cache, this is a float that EITHER represents a percentage of total heap
   * memory size to give to the cache (if < 1.0) OR, it is the capacity in megabytes of the cache.
   */
  public static final String BUCKET_CACHE_SIZE_KEY = "hbase.bucketcache.size";
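
The either/or rule in that comment can be sketched as below. This is only
an illustration of the rule as the comment states it, not the HBase code;
the class and method names are made up for the example.

```java
public class BucketCacheSizeSketch {
    /**
     * Interprets a configured size the way the HConstants comment describes:
     * below 1.0 it is a fraction of the maximum heap, otherwise it is megabytes.
     */
    public static long cacheBytes(float configured, long maxHeapBytes) {
        if (configured < 1.0f) {
            return (long) (maxHeapBytes * (double) configured);
        }
        return (long) (configured * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("0.25 of heap -> " + cacheBytes(0.25f, heap) + " bytes");
        System.out.println("4096 (MB)    -> " + cacheBytes(4096f, heap) + " bytes");
    }
}
```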

> 3.Since each Bucket serves for specific size of blocks, and different
> Buckets can serve for different size of blocks, how to setup the sizes that
> they serve for? And what is the default sizes that they serve?

Hmm. Not sure I follow. Please add a little more detail, or maybe better,
try it and watch the UI detail on BucketCache (L2) as it operates? I think
this will answer the above.

> 4.I see some properties from the reference guide of the Apache HBase
> website , they are hbase.bucketcache.size, hbase.bucketcache.sizes,
> hfile.block.cache.sizes,hfile.block.cache.size, I am totally confused with
> them.  Could you tell me their meaning ?

Sorry they are confusing.  hbase.bucketcache.size is explained above.
hbase.bucketcache.sizes is a doc mistake, as is hfile.block.cache.sizes.

hfile.block.cache.size is defined below in our hbase-default.xml (see
refguide also).

  <property>
    <name>hfile.block.cache.size</name>
    <value>0.4</value>
    <description>Percentage of maximum heap (-Xmx setting) to allocate to block cache
        used by HFile/StoreFile. Default of 0.4 means allocate 40%.
        Set to 0 to disable but it's not recommended; you need at least
        enough cache to hold the storefile indices.</description>
  </property>

The above config is straightforward when we are running the default on-heap
LRU block cache: it is how much of the allotted heap to give over to the
blockcache. When BucketCache is enabled, you will have two tiers of caching.
This config continues to pertain to the on-heap LRU block cache only. You'll
usually want to shrink it down significantly since it now holds only a
small portion of the cached data (the index and bloom blocks; the data
blocks will be managed by the BucketCache).
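
As a rough picture of the two tiers just described, the toy routing
function below sends index and bloom blocks to the on-heap LRU cache and
data blocks to the BucketCache. The enums and the method are hypothetical
names for illustration only; they are not HBase classes.

```java
public class TwoTierSketch {
    public enum BlockKind { DATA, INDEX, BLOOM }
    public enum Tier { ON_HEAP_LRU, BUCKET_CACHE }

    /** Picks a tier for a block, following the split described in the text. */
    public static Tier tierFor(BlockKind kind, boolean bucketCacheEnabled) {
        if (!bucketCacheEnabled) {
            return Tier.ON_HEAP_LRU; // single tier: everything is on heap
        }
        return kind == BlockKind.DATA ? Tier.BUCKET_CACHE : Tier.ON_HEAP_LRU;
    }

    public static void main(String[] args) {
        for (BlockKind kind : BlockKind.values()) {
            System.out.println(kind + " -> " + tierFor(kind, true));
        }
    }
}
```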

> 5.Since BucketCache is usually not on heap, when meeting a crash of
> RegionServer, how does this part of memory is evicted?

If the regionserver exits, this offheap allocation is reclaimed by the
OS... but perhaps you are asking a more subtle question (e.g. what happens
on a partial crash...)

> 6.When BucketCache is nearly full and needs to evict some parts, how does
> it choose which part should be evicted? Does it evict a bucket or a block
> once a time?

It evicts entries in a bucket (not a whole bucket, and not an hfile block).
There are two triggers. On file close, it evicts all bucket entries that
pertain to a file we are no longer serving from (the file was replaced by a
compacted file, or a region is being closed). Otherwise, when we are trying
to cache an entry and have run out of bucket entries, we run the
BucketCache#freeSpace method to return no-longer-used entries to the free
pool, enough so the allocation can proceed. The cleanup is a little
involved because we stripe the BucketCache with priority tiers. If you
would like me to write out the BucketCache#freeSpace method machinations as
narrative, just say.
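
In the meantime, here is a deliberately simplified sketch of the two
eviction triggers. It is NOT BucketCache#freeSpace: it ignores the priority
tiers entirely and just evicts in least-recently-used order, and every name
in it (including the "file@offset" key format) is invented for the example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class FreeSpaceSketch {
    private final long capacityBytes;
    private long usedBytes = 0;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, Integer> entries =
        new LinkedHashMap<>(16, 0.75f, true);

    public FreeSpaceSketch(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Caches an entry, freeing older entries first if there is no room. */
    public boolean cache(String key, int sizeBytes) {
        if (sizeBytes > capacityBytes) {
            return false; // can never fit
        }
        Integer old = entries.remove(key);
        if (old != null) {
            usedBytes -= old;
        }
        long shortfall = usedBytes + sizeBytes - capacityBytes;
        if (shortfall > 0) {
            freeSpace(shortfall);
        }
        entries.put(key, sizeBytes);
        usedBytes += sizeBytes;
        return true;
    }

    /** Evicts least-recently-used entries until at least bytesNeeded are freed. */
    private void freeSpace(long bytesNeeded) {
        long freed = 0;
        Iterator<Map.Entry<String, Integer>> it = entries.entrySet().iterator();
        while (freed < bytesNeeded && it.hasNext()) {
            freed += it.next().getValue();
            it.remove();
        }
        usedBytes -= freed;
    }

    /** Evicts every entry of a closed file; keys are assumed to look like "file@offset". */
    public int evictFile(String fileName) {
        int removed = 0;
        Iterator<Map.Entry<String, Integer>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            if (e.getKey().startsWith(fileName + "@")) {
                usedBytes -= e.getValue();
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public boolean contains(String key) { return entries.containsKey(key); }

    public long usedBytes() { return usedBytes; }

    public static void main(String[] args) {
        FreeSpaceSketch cache = new FreeSpaceSketch(100);
        cache.cache("a@0", 40);
        cache.cache("b@0", 40);
        cache.cache("a@1", 40); // no room: the oldest entry is evicted first
        System.out.println("a@0 cached? " + cache.contains("a@0"));
        System.out.println("entries dropped for file b: " + cache.evictFile("b"));
    }
}
```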


> Many Thanks!
