hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11323) BucketCache all the time!
Date Sat, 14 Jun 2014 21:55:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031716#comment-14031716 ]

stack commented on HBASE-11323:

Hm.  Maybe I am making this harder than it needs to be (Lars and now Andrew ask that it
be possible to have tables keep their DATA blocks in LruBlockCache):

The HCD#setInMemory javadoc says:

   * @param inMemory True if we are to keep all values in the HRegionServer cache

So, if I am allowed to extrapolate: if a CF has IN_MEMORY set and we are using CombinedBlockCache
-- i.e. LruBC and BucketCache -- then let's just cache the CF's DATA blocks in LruBC too?

This small change is all that is needed:

diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java
index 7564cc2..23cdf83 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java
@@ -56,7 +56,7 @@ public class CombinedBlockCache implements BlockCache, HeapSize {
   public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory) {
     boolean isMetaBlock = buf.getBlockType().getCategory() != BlockCategory.DATA;
-    if (isMetaBlock) {
+    if (isMetaBlock || inMemory) {
       lruCache.cacheBlock(cacheKey, buf, inMemory);
     } else {
       bucketCache.cacheBlock(cacheKey, buf, inMemory);
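
To see the routing rule of the patched method in isolation, here is a minimal, self-contained sketch. It is not the HBase code: CombinedCacheSketch, its simplified BlockCategory enum, and the two maps standing in for LruBlockCache and BucketCache are hypothetical names invented for illustration. Only the condition `isMetaBlock || inMemory` is taken from the diff above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CombinedBlockCache; illustrates the routing rule only.
class CombinedCacheSketch {
    // Simplified block categories (assumption: anything that is not DATA counts as "meta").
    enum BlockCategory { DATA, INDEX, BLOOM, META }

    // Plain maps standing in for the on-heap LruBlockCache and the BucketCache.
    private final Map<String, BlockCategory> lruCache = new HashMap<>();
    private final Map<String, BlockCategory> bucketCache = new HashMap<>();

    // Same decision as the patched cacheBlock: non-DATA blocks and blocks from
    // IN_MEMORY column families go to the LRU cache, everything else to the bucket cache.
    void cacheBlock(String cacheKey, BlockCategory category, boolean inMemory) {
        boolean isMetaBlock = category != BlockCategory.DATA;
        if (isMetaBlock || inMemory) {
            lruCache.put(cacheKey, category);
        } else {
            bucketCache.put(cacheKey, category);
        }
    }

    boolean inLru(String cacheKey) { return lruCache.containsKey(cacheKey); }

    boolean inBucket(String cacheKey) { return bucketCache.containsKey(cacheKey); }

    public static void main(String[] args) {
        CombinedCacheSketch cache = new CombinedCacheSketch();
        cache.cacheBlock("index-block", BlockCategory.INDEX, false);
        cache.cacheBlock("data-block", BlockCategory.DATA, false);
        cache.cacheBlock("in-memory-data-block", BlockCategory.DATA, true);

        System.out.println("index-block in LRU: " + cache.inLru("index-block"));
        System.out.println("data-block in bucket cache: " + cache.inBucket("data-block"));
        System.out.println("in-memory-data-block in LRU: " + cache.inLru("in-memory-data-block"));
    }
}
```

In this sketch the only DATA blocks that reach the LRU map are those cached with inMemory set to true, which mirrors the behavior the one-line change is meant to produce.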

Running with it, I was able to check it was actually working by looking at the block cache
dump by files.  It reports counts of blocks and whether they are DATA blocks.  A table created
with IN_MEMORY set gets its DATA blocks into LruBC.

Of note, hbase:meta and other system tables will now have their DATA blocks up in LruBC too,
since they are marked as IN_MEMORY.

If the above is allowed, I'll go through and amend all references to IN_MEMORY to make note
of this expanded definition of its meaning.

> BucketCache all the time!
> -------------------------
>                 Key: HBASE-11323
>                 URL: https://issues.apache.org/jira/browse/HBASE-11323
>             Project: HBase
>          Issue Type: Sub-task
>          Components: io
>            Reporter: stack
>             Fix For: 0.99.0
>         Attachments: ReportBlockCache.pdf
> One way to realize the parent issue is to just enable bucket cache all the time; i.e.
> always have offheap enabled.  Would have to do some work to make it drop-dead simple on initial
> setup (I think it doable).
> So, upside would be the offheap upsides (less GC, less likely to go away and never come
> back because of full GC when heap is large, etc.).
> Downside is higher latency.   In Nick's BlockCache 101 there is little to no difference
> between onheap and offheap.  In a basic compare doing scans and gets -- details to follow
> -- I have BucketCache deploy about 20% less ops than LRUBC when all incache and maybe 10%
> less ops when falling out of cache.   I can't tell difference in means and 95th and 99th are
> roughly same (more stable with BucketCache).  GC profile is much better with BucketCache --
> way less.  BucketCache uses about 7% more user CPU.
> More detail on comparison to follow.
> I think the numbers disagree enough we should probably do the [~lhofhansl] suggestion,
> that we allow you to have a table sit in LRUBC, something the current bucket cache layout
> does not do.

This message was sent by Atlassian JIRA
