accumulo-notifications mailing list archives

From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4177) TinyLFU-based BlockCache
Date Tue, 13 Sep 2016 03:07:20 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486074#comment-15486074
] 

Ben Manes commented on ACCUMULO-4177:
-------------------------------------

I only have [JMH benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks] and [simulations|https://github.com/ben-manes/caffeine/wiki/Efficiency]
(hit rates) from published traces. YCSB was difficult because the Zipf distribution is very
easy for any frequency policy to optimize for, so we couldn't prove a benefit on a simple
HBase run.

Feedback from Druid (columnar store) has been very positive. Metamarkets [reported|https://github.com/druid-io/druid/pull/3028]
improved query times, where they were previously unable to use local caching due to high lock
contention. Another user enabled the cache after performance problems with the default implementation
(LinkedHashMap) and [reported|https://groups.google.com/forum/#!search/caffeine$20cache|sort:date/druid-user/J-YMqt8wc5s/jSN8AYrOBwAJ]
major improvements. Specifically he stated,

{quote}
Just to tie up this thread, the caffeine cache extension turned out to work very well for
us, even under high eviction rates.

We saw about 50% less GC time while under our peak load, healthier heap usage, and lower overall
query latency across the whole cluster.

Recommended, +1
{quote}

In the HBase patch, TinyLFU is an option, which makes evaluation easier. I think we should
do that here too if someone is willing to canary a node. The current cache is a good specialized
implementation (SLRU with async reaper), so it should do well on the low-hanging fruit of
concurrency and frequency. TinyLFU should do better in large workloads, especially with scans,
and might have better GC characteristics by more aggressively discarding recent entries with
a low probability of reuse.
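
To illustrate the admission idea, here is a minimal sketch, not Caffeine's actual implementation. It assumes exact per-key counts in a HashMap where TinyLFU uses a compact, periodically aged frequency sketch, and it omits the admission window that Caffeine's W-TinyLFU variant adds. The class and method names are hypothetical. A new entry is admitted only if it has been seen more often than the LRU victim it would replace.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class TinyLfuAdmissionSketch<K, V> {
  private final int capacity;
  // Access-ordered map: iteration starts at the least recently used entry.
  private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
  // Assumption: exact counts stand in for TinyLFU's approximate, aged frequency sketch.
  private final Map<K, Integer> frequency = new HashMap<>();

  public TinyLfuAdmissionSketch(int capacity) {
    this.capacity = capacity;
  }

  /** Records the access (hit or miss) and returns the cached value, or null. */
  public V get(K key) {
    frequency.merge(key, 1, Integer::sum);
    return entries.get(key);
  }

  /** Inserts the entry, but when full only if it is more popular than the LRU victim. */
  public void put(K key, V value) {
    if (entries.containsKey(key) || entries.size() < capacity) {
      entries.put(key, value);
      return;
    }
    K victim = entries.keySet().iterator().next();
    int candidateFreq = frequency.getOrDefault(key, 0);
    int victimFreq = frequency.getOrDefault(victim, 0);
    if (candidateFreq > victimFreq) {
      entries.remove(victim);
      entries.put(key, value);
    }
    // Otherwise the candidate is rejected and the victim stays cached.
  }

  public boolean contains(K key) {
    return entries.containsKey(key);
  }

  /** Warms two hot keys, runs a one-pass scan, and reports whether the hot keys survived. */
  public static boolean hotKeysSurviveScan() {
    TinyLfuAdmissionSketch<String, String> cache = new TinyLfuAdmissionSketch<>(2);
    for (int round = 0; round < 3; round++) {
      for (String hot : new String[] {"a", "b"}) {
        if (cache.get(hot) == null) {
          cache.put(hot, "block-" + hot);
        }
      }
    }
    for (int i = 0; i < 100; i++) {
      String key = "scan-" + i;
      if (cache.get(key) == null) {
        cache.put(key, "block-" + key);
      }
    }
    return cache.contains("a") && cache.contains("b") && !cache.contains("scan-99");
  }

  public static void main(String[] args) {
    System.out.println("hot keys survive scan: " + hotKeysSurviveScan());
  }
}
```

In this sketch each scan key has been seen once while the resident victim has been seen three times, so the scan is rejected at admission. A plain LRU would have evicted both hot keys.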

If the cache API were refactored to accept a loading function, then it could protect against [dog piling|https://en.wikipedia.org/wiki/Cache_stampede].
Currently a racy get-compute-put approach
is used, which allows multiple callers to perform the same expensive load concurrently. This is
too invasive for me to offer in a patch, but it could have a good real-world impact.
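
The difference can be sketched with JDK classes alone. This is an illustration, not Accumulo's BlockCache API: ConcurrentHashMap.computeIfAbsent stands in for a loading-style lookup such as Caffeine's Cache.get(key, mappingFunction), and the class, the method names, and the simulated 50 ms load are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class LoadingCacheSketch {

  /** Racy get-compute-put: every caller that misses may run the loader. */
  public static <K, V> V racyGet(ConcurrentHashMap<K, V> cache, K key, Function<K, V> loader) {
    V value = cache.get(key);
    if (value == null) {
      value = loader.apply(key);
      cache.put(key, value);
    }
    return value;
  }

  /** Loading-style lookup: the loader runs at most once per absent key. */
  public static <K, V> V loadingGet(ConcurrentHashMap<K, V> cache, K key, Function<K, V> loader) {
    return cache.computeIfAbsent(key, loader);
  }

  /** Looks up one key from several threads at once and returns how often the loader ran. */
  public static int countLoads(boolean atomic, int threads) {
    ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();
    AtomicInteger loads = new AtomicInteger();
    Function<String, byte[]> loader = key -> {
      loads.incrementAndGet();
      try {
        Thread.sleep(50); // simulated expensive block read
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return new byte[16];
    };

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < threads; i++) {
      pool.submit(() -> {
        start.await(); // release all callers together to provoke the race
        return atomic
            ? loadingGet(cache, "block-1", loader)
            : racyGet(cache, "block-1", loader);
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("lookups did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return loads.get();
  }

  public static void main(String[] args) {
    System.out.println("racy loads:    " + countLoads(false, 8));
    System.out.println("loading loads: " + countLoads(true, 8));
  }
}
```

The loading variant runs the loader once regardless of how many callers arrive together. The racy variant typically runs it several times, though the exact count depends on thread timing.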

I know that many projects that use an LRU-style cache rely on heuristics to bypass the cache
in problematic situations. ElasticSearch's [filter cache|https://www.elastic.co/guide/en/elasticsearch/guide/current/filter-caching.html#_autocaching_behavior]
does this and Postgres has [buffer rings|https://2ndquadrant.com/media/pdfs/talks/InsideBufferCache.pdf]
so that scans don't invalidate the cache. I believe Cassandra bypasses the cache for cases
such as compaction. I'd wager that these types of workarounds could be removed with TinyLFU
due to its ability to make smart predictions. That would reduce complexity and might improve
performance in cases where the cache could be beneficial.

> TinyLFU-based BlockCache
> ------------------------
>
>                 Key: ACCUMULO-4177
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4177
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Ben Manes
>             Fix For: 2.0.0
>
>         Attachments: ACCUMULO-4177.patch
>
>
> [LruBlockCache|https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/file/blockfile/cache/LruBlockCache.java]
> appears to be based on HBase's. I currently have a patch being reviewed in [HBASE-15560|https://issues.apache.org/jira/browse/HBASE-15560]
> that replaces the pseudo Segmented LRU with the TinyLFU eviction policy. That should allow
> the cache to make [better predictions|https://github.com/ben-manes/caffeine/wiki/Efficiency]
> based on frequency and recency, such as improved scan resistance. The implementation uses
> [Caffeine|https://github.com/ben-manes/caffeine], the successor to Guava's cache, to provide
> concurrency and keep the patch small.
> Full details are in the JIRA ticket. I think it should be easy to port if there is interest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
