lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ben Manes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8241) Evaluate W-TinyLfu cache
Date Sat, 08 Oct 2016 18:52:20 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15558521#comment-15558521
] 

Ben Manes commented on SOLR-8241:
---------------------------------

I can take a stab at tests, but its unclear what to include other than basic operations. Otherwise
I'd defer to the library for deeper testing, e.g. scan resistance and efficiency. In those
areas writing tests is for the author to have assurance that the library does what it claims.
I'd prefer if someone obtained production traces instead, which I think would show you an
interesting hit rate curve for how the policies stack up.

I'm pretty sure the current warming, which populates with the hottest entries first, should
be good enough. Since reads dominate writes, the hot entries will quickly have a high frequency
by the time an eviction is triggered. We can try to give the first few hot entries a small
bump too, by adding a few accesses, to add an extra nudge.

> Evaluate W-TinyLfu cache
> ------------------------
>
>                 Key: SOLR-8241
>                 URL: https://issues.apache.org/jira/browse/SOLR-8241
>             Project: Solr
>          Issue Type: Wish
>          Components: search
>            Reporter: Ben Manes
>            Priority: Minor
>         Attachments: SOLR-8241.patch, SOLR-8241.patch
>
>
> SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1). The discussions
seem to indicate that the higher hit rate (vs LRU) is offset by the slower performance of
the implementation. An original goal appeared to be to introduce ARC, a patented algorithm
that uses ghost entries to retain history information.
> My analysis of Window TinyLfu indicates that it may be a better option. It uses a frequency
sketch to compactly estimate an entry's popularity. It uses LRU to capture recency and operate
in O(1) time. When using available academic traces the policy provides a near optimal hit
rate regardless of the workload.
> I'm getting ready to release the policy in Caffeine, which Solr already has a dependency
on. But, the code is fairly straightforward and a port into Solr's caches instead is a pragmatic
alternative. More interesting is what the impact would be in Solr's workloads and feedback
on the policy's design.
> https://github.com/ben-manes/caffeine/wiki/Efficiency



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message