cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache
Date Fri, 15 Apr 2016 00:03:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242175#comment-15242175 ]

Benedict commented on CASSANDRA-11452:
--------------------------------------

bq. I'd expect the collision would be flushed out by the eviction when we detect that the
victim's and candidate's hash codes are equal. To me the victim means the item that the eviction
policy selected, so the jittered LRU is selecting the guard. It might also be simpler code
that method is long to handle the various edge cases.

I think we may be suffering from the ambiguities of the written word.  I thought you meant
to change the jitter to select the victim rather than the guard, i.e. to remove something other
than the LRU.  If you just mean to calculate the guard earlier, then I was raising an invalid
contention.

I must admit that, since you specifically raise the hash comparison, I don't entirely follow
its logic (I apologise if this is my density; I've not put as much thought into it as I could).
 It seems to me that if the LRU and MRU are colliding, for instance, then we hit the problem
and the comparison does nothing to stop it.  Nor does it stop two colliding entries from entering
the map, unless the collision appears only when the LRU collides with the candidate on admission.
I haven't looked closely at the test cases, so I'm not sure what it's meant to be stopping, but
I suspect the jitter is a stronger, more general solution.
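
To make the idea of jitter concrete, here is a minimal sketch of one way randomness could enter an admission decision. It is an illustration under assumptions, not the code in the patch: the class {{JitteredAdmission}}, the method {{admit}} and the {{jitterOneIn}} parameter are hypothetical names, and the frequencies stand in for whatever the sketch would report.

```java
import java.util.Random;

public class JitteredAdmission {
    /**
     * Hypothetical admission check. The candidate normally has to beat the
     * victim's estimated frequency, but roughly one time in jitterOneIn it
     * is admitted anyway, so an inflated victim estimate (for example from
     * colliding hashes) cannot block admission indefinitely.
     */
    static boolean admit(int candidateFreq, int victimFreq, Random random, int jitterOneIn) {
        if (candidateFreq > victimFreq) {
            return true;
        }
        // Random escape hatch: independent of the hash values involved.
        return random.nextInt(jitterOneIn) == 0;
    }
}
```

Because the random branch never looks at the hash codes, it does not depend on which particular pair of entries happens to collide, which is the sense in which it is more general than an equality check on two hashes.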

bq. Sorry this is existing code in the sketch, as suggested by Thomas Mueller (H2). That was
to protect against hash collision attacks exploiting the hash function.

Ah. Personally I don't see any harm in regularising the bits over the address space with a
random seed - a bit of variance never hurt anybody, and since only tests have a reliable data
distribution, only our benchmarks are likely noticing it in any functional sense.
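
As a rough sketch of what regularising the bits with a random seed could look like (an assumption for illustration, not the existing sketch code): the class {{SeededSpread}} and its methods are hypothetical, and the mixing steps are the well-known 32-bit finalizer from MurmurHash3.

```java
import java.util.concurrent.ThreadLocalRandom;

public class SeededSpread {
    private final int seed;

    SeededSpread(int seed) {
        this.seed = seed;
    }

    /** Picks a fresh seed, so the bucket layout differs between instances. */
    static SeededSpread random() {
        return new SeededSpread(ThreadLocalRandom.current().nextInt());
    }

    /** XORs in the seed, then applies the MurmurHash3 32-bit finalizer. */
    int spread(int hashCode) {
        int h = hashCode ^ seed;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Maps a hash code to a slot; tableSize is assumed to be a power of two. */
    int index(int hashCode, int tableSize) {
        return spread(hashCode) & (tableSize - 1);
    }
}
```

Note that keys whose {{hashCode()}} values are already equal still land in the same slot; the seed only changes which distinct hash codes end up sharing one, which is what makes the layout hard to predict from outside.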

bq. Unfortunately good traces are also hard to find.

_Any_ traces are hard to find.  The main thing that stopped me exploring some of these ideas
myself over the past few years was the perceived impossibility of finding a suite of good
quality traces.  As much as I am impressed by the paper, when I first encountered W-TinyLFU
I was most excited to see a suite of readily available traces with a simulator.  Of course,
given that the bar has been raised for making use of the idea, the time investment for doing
something useful has gone up correspondingly, but at least on the fun side of the equation.

bq. I'm really interested to see what other avenues people take to exploit sketches in a cache
policy

Yup.  Me too.

> Cache implementation using LIRS eviction for in-process page cache
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-11452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid having to
explicitly mark compaction accesses as non-cacheable, we need a cache implementation that
uses an eviction algorithm that can better handle non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
