cassandra-commits mailing list archives

From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10855) Use Caffeine (W-TinyLFU) for on-heap caches
Date Sun, 06 Nov 2016 17:12:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15642095#comment-15642095
] 

Ben Manes commented on CASSANDRA-10855:
---------------------------------------

{quote}
I've added .executor(MoreExecutors.directExecutor()) - hope I got your suggestion right.
{quote}

This is good for unit tests to remove asynchronous behavior. My preference is not to use it
in production, especially where latencies matter, so that callers are not penalized with maintenance
or removal-notification work. Deferring that work to the ForkJoinPool instead should help minimize
response times, which I think would be your preference too. I'm not familiar enough with Cassandra's
testing to know whether it's trivial to flag the executor. Usually it's pretty trivial, especially
when DI like Guice is used.
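
For illustration, a minimal sketch of flagging the executor (assuming Caffeine's builder and Guava's
{{MoreExecutors}}; the class name and cache sizes are made up, not Cassandra's actual configuration):

{code:java}
import java.util.concurrent.ForkJoinPool;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.common.util.concurrent.MoreExecutors;

public final class CacheExecutorExample {

  // Test configuration: maintenance and removal notifications run on the calling
  // thread, so the cache behaves deterministically in unit tests.
  static Cache<String, String> testCache() {
    return Caffeine.newBuilder()
        .maximumSize(1_000)
        .executor(MoreExecutors.directExecutor())
        .build();
  }

  // Production configuration: leave the executor at its default
  // (ForkJoinPool.commonPool()), keeping that work off the caller's latency path.
  static Cache<String, String> productionCache() {
    return Caffeine.newBuilder()
        .maximumSize(1_000)
        .executor(ForkJoinPool.commonPool()) // the default; shown only for contrast
        .build();
  }
}
{code}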

{quote}
There are a couple of cache.asMap() calls. Would it be an option to eagerly create the AsMapView,
Values and EntrySet instances in LocalAsyncLoadingCache to get around the ternaries in asMap(),
values(), entrySet() and keySet()?
{quote}

{{LocalAsyncLoadingCache}} (a cache that returns {{CompletableFuture}}) isn't used by Cassandra.
Given that the ternaries are null checks to lazily create the views, as is common in the Java Collections,
I don't think keeping them adds a measurable penalty.
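
For reference, the lazy-view idiom the ternaries follow looks roughly like this (illustrative only,
not Caffeine's actual source):

{code:java}
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: the lazy-view idiom, mirroring how java.util collections
// cache their keySet/values/entrySet views on first use.
final class LazyViewExample<K, V> {
  private final Map<K, V> data = new ConcurrentHashMap<>();
  private Collection<V> values; // created lazily on first use, then reused

  Collection<V> values() {
    Collection<V> vs = values;
    return (vs == null) ? (values = data.values()) : vs;
  }
}
{code}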

{quote}
Do you have some micro-benchmarks in place to actually test against the previous implementation(s)?
{quote}

For concurrent throughput (JMH) see these [benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks].
They show refinements over CLHM, with the primary benefit on the write path. Since the cache now
supports memoization, the Cassandra APIs might benefit from using a computation instead of racy
_get-load-put_ calls.
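
A rough sketch of the difference (the key/value types and loader here are hypothetical, not
Cassandra's APIs):

{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public final class MemoizationExample {
  private final Cache<Integer, String> cache =
      Caffeine.newBuilder().maximumSize(10_000).build();

  // Racy pattern: two threads that miss concurrently both compute the value
  // and both store it, wasting work and possibly caching different results.
  String getLoadPut(Integer key) {
    String value = cache.getIfPresent(key);
    if (value == null) {
      value = expensiveLoad(key);
      cache.put(key, value);
    }
    return value;
  }

  // Memoized pattern: the mapping function runs at most once per key at a time,
  // and concurrent callers for the same key wait on that single computation.
  String getMemoized(Integer key) {
    return cache.get(key, this::expensiveLoad);
  }

  private String expensiveLoad(Integer key) {
    return "value-" + key; // placeholder for a real lookup
  }
}
{code}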

For hit rates see these [simulations|https://github.com/ben-manes/caffeine/wiki/Efficiency].
They show W-TinyLFU improves upon LRU by taking frequency into account.

I'll send a PR to your branch when I get a chance to go through the rest of the comments.
Thanks!

> Use Caffeine (W-TinyLFU) for on-heap caches
> -------------------------------------------
>
>                 Key: CASSANDRA-10855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10855
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ben Manes
>              Labels: performance
>         Attachments: CASSANDRA-10855.patch, CASSANDRA-10855.patch
>
>
> Cassandra currently uses [ConcurrentLinkedHashMap|https://code.google.com/p/concurrentlinkedhashmap]
> for performance-critical caches (key, counter) and Guava's cache for non-critical ones (auth,
> metrics, security). All of these usages have been replaced by [Caffeine|https://github.com/ben-manes/caffeine],
> written by the author of the previously mentioned libraries.
> The primary incentive is to switch from an LRU policy to W-TinyLFU, which provides [near
> optimal|https://github.com/ben-manes/caffeine/wiki/Efficiency] hit rates. It performs particularly
> well on database and search traces, is scan resistant, and adds only a very small time/space
> overhead over LRU.
> Secondarily, Guava's caches never obtained [performance|https://github.com/ben-manes/caffeine/wiki/Benchmarks]
> similar to CLHM because some optimizations were not ported over. This change results in faster
> reads without creating garbage as a side effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
