accumulo-notifications mailing list archives

From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4177) TinyLFU-based BlockCache
Date Tue, 13 Sep 2016 19:50:20 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488214#comment-15488214 ]

Ben Manes commented on ACCUMULO-4177:
-------------------------------------

One approach might be to capture a cache access trace in 1.8.0 and then use my simulator
to observe the hit rate curves. Since the block cache differs from an LRU / SLRU, we could
also look into adding it to the simulator as a new policy. That wouldn't tell us anything
about concurrency, GC, latencies, etc., but it would let us check whether cache efficiency
improves on a real workload.
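
To make the replay step concrete, here is a rough illustration (not code from Accumulo or
from the Caffeine simulator) of what producing a single point on a hit rate curve amounts to.
The real simulator runs many policies over the same trace at once; an access-ordered
LinkedHashMap standing in for LRU just shows the basic mechanics.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Illustrative only: replay logged key hashes against a fixed-size LRU and
    // report the hit rate. The Caffeine simulator does this for many policies.
    final class LruReplay {
      static double hitRate(long[] keyHashes, final int maximumSize) {
        Map<Long, Boolean> lru = new LinkedHashMap<Long, Boolean>(16, 0.75f, true) {
          @Override protected boolean removeEldestEntry(Map.Entry<Long, Boolean> eldest) {
            return size() > maximumSize;
          }
        };
        long hits = 0;
        for (long key : keyHashes) {
          if (lru.containsKey(key)) {   // containsKey does not reorder entries
            hits++;
          }
          lru.put(key, Boolean.TRUE);   // put() inserts the key or refreshes its recency
        }
        return keyHashes.length == 0 ? 0.0 : (double) hits / keyHashes.length;
      }
    }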

The trace could be captured as a log of the key hashes (32-bit, though 64-bit is preferred).
Adding a log statement on a get(k) and redirecting it to a dedicated appender is pretty
non-invasive. The simulator itself is easy to work with and fast.
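
For example, the instrumentation could look roughly like the sketch below. The class name,
logger name, and use of Guava's Hashing are all illustrative rather than taken from the
Accumulo code base; the dedicated logger would be routed to its own file appender in the
logging configuration.

    import com.google.common.hash.Hashing;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch: log a 64-bit hash of every requested block key on a
    // dedicated logger so the trace can be captured by its own appender and
    // later replayed in the simulator.
    class TracingBlockCache {
      private static final Logger TRACE = LoggerFactory.getLogger("blockcache.access-trace");
      private final Map<String, byte[]> backing = new ConcurrentHashMap<>();

      byte[] getBlock(String blockName) {
        long keyHash = Hashing.murmur3_128().hashUnencodedChars(blockName).asLong();
        TRACE.info("{}", keyHash);   // one hash per line, trivially parseable later
        return backing.get(blockName);
      }
    }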

I had forgotten that I wrote a patch here too. It makes the cache implementation pluggable
behind a feature flag, to ease evaluation and a gradual production rollout.

> TinyLFU-based BlockCache
> ------------------------
>
>                 Key: ACCUMULO-4177
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4177
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Ben Manes
>             Fix For: 2.0.0
>
>         Attachments: ACCUMULO-4177.patch
>
>
> [LruBlockCache|https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/file/blockfile/cache/LruBlockCache.java]
> appears to be based on HBase's. I currently have a patch being reviewed in [HBASE-15560|https://issues.apache.org/jira/browse/HBASE-15560]
> that replaces the pseudo Segmented LRU with the TinyLFU eviction policy. That should allow
> the cache to make [better predictions|https://github.com/ben-manes/caffeine/wiki/Efficiency]
> based on frequency and recency, such as improved scan resistance. The implementation uses
> [Caffeine|https://github.com/ben-manes/caffeine], the successor to Guava's cache, to provide
> concurrency and keep the patch small.
> Full details are in the JIRA ticket. I think it should be easy to port if there is interest.
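
As an aside for readers unfamiliar with Caffeine (this is not part of the quoted ticket text),
a minimal sketch of how a byte-weighted Caffeine cache is constructed follows; the weigher and
removal listener here are illustrative rather than taken from the HBASE-15560 or ACCUMULO-4177
patches. Caffeine's bounded caches use Window TinyLFU by default, so a size or weight bound is
all that is needed to get the frequency-and-recency behavior described above.

    import com.github.benmanes.caffeine.cache.Cache;
    import com.github.benmanes.caffeine.cache.Caffeine;
    import com.github.benmanes.caffeine.cache.RemovalCause;

    class CaffeineBlockCacheSketch {
      // Illustrative: bound the cache by total bytes; eviction decisions are made
      // by Caffeine's default Window TinyLFU policy.
      static Cache<String, byte[]> create(long maxSizeInBytes) {
        return Caffeine.newBuilder()
            .maximumWeight(maxSizeInBytes)
            .weigher((String name, byte[] block) -> block.length)
            .removalListener((String name, byte[] block, RemovalCause cause) -> {
              // hook for eviction metrics, if desired
            })
            .build();
      }
    }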



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
