hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11323) BucketCache all the time!
Date Sat, 14 Jun 2014 16:41:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031616#comment-14031616 ]

Andrew Purtell commented on HBASE-11323:
----------------------------------------

We have various results that all indicate _some_ penalty for using the bucket cache, with a
commensurate improvement in GC-related metrics. It can be a trade-off well worth making, but
it should not be a global decision. I think we should make the combined LRU cache + bucket
cache the default, but only if the default placement for data blocks remains on heap and we
plumb down schema-level selection of off-heap block placement. Then you can have, on a
per-CF basis, such strategies as large warm data off heap with block encoding (trading scanning
CPU for serde/copying costs) and smaller hot data on heap with no encoding. At a future time
we could have a few caching strategies like this managed automatically by ergonomics.
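
For readers unfamiliar with the setup under discussion, an off-heap bucket cache is
typically switched on in hbase-site.xml. The fragment below is a minimal sketch and not
part of the original comment: the size value is an arbitrary illustration, and the per-CF
block placement proposed above is not shown because the comment describes it as work
still to be plumbed down.

```xml
<!-- hbase-site.xml (sketch; values are illustrative assumptions) -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <!-- keep bucket cache blocks in direct (off-heap) memory -->
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <!-- bucket cache capacity in megabytes; 4096 is an arbitrary example -->
  <value>4096</value>
</property>
```

The JVM's direct memory limit also has to be large enough to hold the configured cache.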

> BucketCache all the time!
> -------------------------
>
>                 Key: HBASE-11323
>                 URL: https://issues.apache.org/jira/browse/HBASE-11323
>             Project: HBase
>          Issue Type: Sub-task
>          Components: io
>            Reporter: stack
>             Fix For: 0.99.0
>
>         Attachments: ReportBlockCache.pdf
>
>
> One way to realize the parent issue is to just enable bucket cache all the time; i.e.
> always have offheap enabled.  We would have to do some work to make it drop-dead simple on
> initial setup (I think it is doable).
> So, the upside would be the offheap upsides (less GC, less likely to go away and never come
> back because of full GC when heap is large, etc.).
> The downside is higher latency.  In Nick's BlockCache 101 there is little to no difference
> between onheap and offheap.  In a basic compare doing scans and gets -- details to follow
> -- I have BucketCache doing about 20% fewer ops than LRUBC when all in cache and maybe 10%
> fewer ops when falling out of cache.  I can't tell a difference in means, and the 95th and
> 99th percentiles are roughly the same (more stable with BucketCache).  GC profile is much
> better with BucketCache -- way less.  BucketCache uses about 7% more user CPU.
> More detail on comparison to follow.
> I think the numbers disagree enough that we should probably do the [~lhofhansl] suggestion,
> that we allow you to have a table sit in LRUBC, something the current bucket cache layout
> does not do.



