accumulo-notifications mailing list archives

From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (ACCUMULO-4177) TinyLFU-based BlockCache
Date Fri, 16 Sep 2016 21:42:20 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497456#comment-15497456 ]

Ben Manes edited comment on ACCUMULO-4177 at 9/16/16 9:41 PM:
--------------------------------------------------------------

I agree the numbers are too close to judge and fall within the margin of error. The Lru cache
is quite good: it avoids lock contention, delegates the penalties to a background thread,
and is segmented to capture basic frequencies. The YCSB Zipf benchmarks are ideal for it,
as the policy can offer a perfect hit rate and concurrency. Caffeine can do similarly, with a
small additional overhead from spreading out the maintenance work for more flexibility and
to avoid O(n) operations.

So we can't argue for improved concurrency or an improved hit rate (which reduces latencies) on
the Zipf workloads. Instead we can claim to be on par, with little to no degradation.
The gain should come from an improved hit rate on real-world workloads, which can be quite
different from synthetic distributions. Unfortunately, this might require evaluating on a live cluster.
It might be interesting to capture real cluster traces and feed them through YCSB if we wanted
a more robust, repeatable comparison.
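
As a rough illustration of the trace-replay idea, the sketch below feeds a recorded sequence of
block keys through a plain LRU and reports the hit rate. It is only a stand-in: TraceReplay,
LruSim, and hitRate are hypothetical names, the LRU is a simple LinkedHashMap rather than
Accumulo's LruBlockCache or Caffeine, capacity is counted in entries rather than bytes, and no
YCSB integration is shown.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: replays a recorded key trace against a plain LRU and reports the hit rate. */
final class TraceReplay {

  /** Minimal access-ordered LRU built on LinkedHashMap; a stand-in, not Accumulo's LruBlockCache. */
  static final class LruSim<K> extends LinkedHashMap<K, Boolean> {
    private final int capacity;

    LruSim(int capacity) {
      super(16, 0.75f, true); // accessOrder = true keeps entries in least-recently-used order
      this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, Boolean> eldest) {
      return size() > capacity; // evict the least recently used entry once over capacity
    }
  }

  /** Returns hits divided by total accesses for the trace, or 0.0 for an empty trace. */
  static <K> double hitRate(List<K> trace, int capacity) {
    LruSim<K> cache = new LruSim<>(capacity);
    long hits = 0;
    for (K key : trace) {
      if (cache.get(key) != null) { // get() also refreshes the entry's recency
        hits++;
      } else {
        cache.put(key, Boolean.TRUE); // miss: simulate loading the block
      }
    }
    return trace.isEmpty() ? 0.0 : (double) hits / trace.size();
  }

  public static void main(String[] args) {
    // Toy trace for illustration only; a real trace would be captured from a cluster.
    List<String> trace = Arrays.asList("a", "b", "a", "c", "b", "a");
    System.out.println(hitRate(trace, 3)); // prints 0.5
  }
}
```

Running the same trace against each candidate policy at the same capacity would give a
repeatable comparison, since the access sequence is fixed rather than drawn from a distribution.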

Thanks for the help on this. You can add me with no org (since this is a hobby project) on
PST.


> TinyLFU-based BlockCache
> ------------------------
>
>                 Key: ACCUMULO-4177
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4177
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Ben Manes
>            Assignee: Ben Manes
>             Fix For: 2.0.0
>
>         Attachments: ACCUMULO-4177.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> [LruBlockCache|https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/file/blockfile/cache/LruBlockCache.java]
> appears to be based on HBase's. I currently have a patch being reviewed in [HBASE-15560|https://issues.apache.org/jira/browse/HBASE-15560]
> that replaces the pseudo Segmented LRU with the TinyLFU eviction policy. That should allow
> the cache to make [better predictions|https://github.com/ben-manes/caffeine/wiki/Efficiency]
> based on frequency and recency, such as improved scan resistance. The implementation uses
> [Caffeine|https://github.com/ben-manes/caffeine], the successor to Guava's cache, to provide
> concurrency and keep the patch small.
> Full details are in the JIRA ticket. I think it should be easy to port if there is interest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
