cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8673) Row cache follow-ups
Date Sun, 25 Jan 2015 12:23:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291076#comment-14291076 ]

Robert Stupp commented on CASSANDRA-8673:
-----------------------------------------

NIT: Row cache size is wrongly reported (reports # of entries instead of occupied capacity)
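
To illustrate the difference between the two numbers, here is a minimal sketch. It assumes a toy on-heap cache; {{ByteWeightedCache}} and its methods are hypothetical names and not the API of Cassandra or OHC. The point is only that the entry count and the occupied capacity are separate values, and a "size" metric should report the latter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy cache for illustration only; not Cassandra's or OHC's API.
final class ByteWeightedCache {
    private final Map<String, byte[]> entries = new LinkedHashMap<>();
    private long occupiedBytes; // sum of the lengths of all cached values

    void put(String key, byte[] value) {
        byte[] old = entries.put(key, value);
        if (old != null) {
            occupiedBytes -= old.length; // replaced value no longer counts
        }
        occupiedBytes += value.length;
    }

    // Number of entries: what the NIT says is currently reported as "size".
    int entryCount() {
        return entries.size();
    }

    // Occupied capacity in bytes: what a "size" metric should report.
    long occupiedBytes() {
        return occupiedBytes;
    }
}
```

With two cached values of 10 and 50 bytes, {{entryCount()}} is 2 while {{occupiedBytes()}} is 60, so reporting one in place of the other is misleading.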

> Row cache follow-ups
> --------------------
>
>                 Key: CASSANDRA-8673
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8673
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Robert Stupp
>             Fix For: 3.0, 3.1
>
>
> We (Benedict, Ariel and I) had an offline discussion about the next steps to further
improve the row cache committed for CASSANDRA-7438 and identified the following points.
> This ticket is basically a "note" not to forget these topics. The individual points should
be handled in separate (sub) tickets.
> # Permit access to off-heap data without deserialization. This should be the biggest
win to improve reads - effectively no more deserialization of the whole cached value from
off-heap. [OHC issue #2|https://github.com/snazy/ohc/issues/2]
> # Per-table knob that decides whether writes update the row cache.
Could be a win if you have a workload with frequent reads against a few "hot" partitions
but writes to many other partitions. Otherwise the row cache would fill up with useless data
and reduce the cache hit ratio.
> # Update {{cassandra.sh}} to preload jemalloc using {{LD_PRELOAD}} / {{DYLD_INSERT_LIBRARIES}}
and use {{Unsafe}} for memory allocation. This removes JNA from the call stack. Additionally,
we should make this change in existing C* code for the same reason. (Note: JNA adds some overhead
and has a synchronized block in each call. That is going to be fixed in a future version, but
it still isn't free.) Feels like an LHF.
> # Investigate whether key cache and counter cache can also use OHC. We could iterate
towards a single cache implementation and maybe remove some code and decrease the potential
number of configurations that can be run.
> # Investigate whether _RowCacheSentinel_ can be replaced with something better / "more
native". RowCacheSentinel seems to exist to avoid races with other update operations
that would invalidate the row before it is inserted into the cache. It's a workaround for
the cache not being write-through.
> # Implement an efficient off-heap memory allocator (see below).
> Not big wins:
> * Allow serialization of hot keys during auto save. Since saving of cached keys is a
task that only runs infrequently (if at all), the win would not be great. It feels like LHF,
but the win is low IMO.
> * Use another replacement strategy. We had some discussion about using something else instead
of LRU (timestamp, 2Q, LIRS, LRU+random). But either the overhead of managing these strategies
outweighs the benefit or the win would be too low.
> LHFs (should be fixed in the next few days)
> * don't use row cache in unit tests (currently enabled in test/conf/cassandra.yaml)
> * don't print whole class path when jemalloc is not available (prints >40k class path
on cassci for each unit test, since jemalloc is not available there - related to previous
point)
> bq. As to incorporating memory management, I think we can actually do this very simply
by merging it with our eviction strategy. If we allocate S arenas of 1/S (where S is the number
of Segments), and partition each arena into pages of size K, we can make our eviction strategy
operate over whole pages, instead of individual items. This probably won't have any significant
impact on eviction, especially with small-ish pages. The only slight complexity is dealing
with allocations spanning multiple pages, but that shouldn't be too tricky. The nice thing
about this approach is that, like our other decisions, it is very easily made obviously correct.
It also gives us great locality for operations, with a high likelihood of cache presence for
each allocation.
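
A rough sketch of the page-based eviction idea quoted above, for a single segment. Everything in it is an assumption made for illustration: {{PagedSegment}}, its fields and its methods are hypothetical names, on-heap {{byte[]}} pages stand in for off-heap memory, pages are evicted oldest-first (FIFO) rather than by any recency tracking, and values larger than one page are rejected instead of spanning pages.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of one segment's arena, split into fixed-size pages.
// Eviction drops a whole page at once instead of individual entries.
final class PagedSegment {
    private static final class Page {
        final byte[] data;                 // stand-in for an off-heap page
        int used;                          // bump-allocation offset
        final List<String> keys = new ArrayList<>();
        Page(int size) { data = new byte[size]; }
    }

    private static final class Ref {
        final Page page; final int offset; final int length;
        Ref(Page page, int offset, int length) {
            this.page = page; this.offset = offset; this.length = length;
        }
    }

    private final int pageSize;            // K in the description above
    private final int maxPages;            // arena capacity divided by K
    private final ArrayDeque<Page> pages = new ArrayDeque<>(); // oldest first
    private final Map<String, Ref> index = new HashMap<>();

    PagedSegment(int pageSize, int maxPages) {
        if (pageSize < 1 || maxPages < 1) throw new IllegalArgumentException();
        this.pageSize = pageSize;
        this.maxPages = maxPages;
    }

    boolean put(String key, byte[] value) {
        if (value.length > pageSize) return false; // multi-page values not handled here
        Page page = pages.peekLast();
        if (page == null || page.used + value.length > pageSize) {
            if (pages.size() == maxPages) evictOldestPage();
            page = new Page(pageSize);
            pages.addLast(page);
        }
        System.arraycopy(value, 0, page.data, page.used, value.length);
        index.put(key, new Ref(page, page.used, value.length));
        page.keys.add(key);
        page.used += value.length;
        return true;
    }

    byte[] get(String key) {
        Ref r = index.get(key);
        return r == null ? null : Arrays.copyOfRange(r.page.data, r.offset, r.offset + r.length);
    }

    int pageCount() { return pages.size(); }

    private void evictOldestPage() {
        Page victim = pages.pollFirst();
        for (String k : victim.keys) {
            Ref r = index.get(k);
            // Only drop the mapping if it still points into the evicted page;
            // the key may have been rewritten into a newer page since.
            if (r != null && r.page == victim) index.remove(k);
        }
    }
}
```

With 8-byte pages and a two-page arena, inserting five 4-byte values fills both pages and then evicts the first page as a unit, so the two oldest entries disappear together while the newer ones stay readable. A real implementation would also need the multi-page allocation handling mentioned in the quote.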



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
