cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
Date Fri, 28 Nov 2014 17:06:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228397#comment-14228397 ]

Benedict edited comment on CASSANDRA-7438 at 11/28/14 5:06 PM:
---------------------------------------------------------------

I suspect segmenting the table at a coarser granularity, so that each segment is maintained
with mutual exclusivity, would achieve better percentiles in both cases by keeping the
maximum resize cost down. We could even settle for a separate LRU-q per segment, to keep
the complexity of this code down significantly; it is unlikely that a global LRU-q is significantly
more accurate at predicting reuse than ~128 of them. It would also make it much easier to
improve the replacement strategy beyond LRU, which would likely yield a bigger win for performance
than any potential loss from reduced concurrency. By performing the deserialization outside
of it, the critical section for reads could be kept small enough that competition would be
very unlikely with the current state of C*. There's a good chance this would yield a net
positive performance impact, by reducing the cost per access without measurably increasing
the cost due to contention (because contention would be infrequent).

edit: coarser, not finer. i.e., a la j.u.c.CHM
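
The coarse segmentation described above can be sketched as a toy, assuming nothing about the actual patch: N independent segments, each a lock-protected access-ordered LinkedHashMap chosen by key hash, a la the classic j.u.c.ConcurrentHashMap design. The class and parameter names (SegmentedLruCache, capacityPerSegment) are invented for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: one LRU-q per segment instead of a global one.
final class SegmentedLruCache<K, V>
{
    private final LinkedHashMap<K, V>[] segments;
    private final int mask;

    @SuppressWarnings("unchecked")
    SegmentedLruCache(int segmentCount, final int capacityPerSegment)
    {
        // segmentCount must be a power of two so we can mask the hash
        segments = new LinkedHashMap[segmentCount];
        mask = segmentCount - 1;
        for (int i = 0; i < segmentCount; i++)
        {
            // accessOrder = true makes the map iterate least-recently-used first
            segments[i] = new LinkedHashMap<K, V>(16, 0.75f, true)
            {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest)
                {
                    return size() > capacityPerSegment; // per-segment LRU eviction
                }
            };
        }
    }

    private LinkedHashMap<K, V> segmentFor(K key)
    {
        int h = key.hashCode();
        h ^= (h >>> 16); // spread the hash before masking
        return segments[h & mask];
    }

    V get(K key)
    {
        LinkedHashMap<K, V> seg = segmentFor(key);
        synchronized (seg) // short critical section; contention only within one segment
        {
            return seg.get(key);
        }
    }

    void put(K key, V value)
    {
        LinkedHashMap<K, V> seg = segmentFor(key);
        synchronized (seg)
        {
            seg.put(key, value);
        }
    }
}
```

A resize or eviction here only ever locks one segment, which is the property claimed above: the maximum pause is bounded by segment size, not table size.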



> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is only partially off heap; keys are still stored on the JVM
> heap as ByteBuffers (BB).
> * There is a higher GC cost for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better results, but
> this requires careful tuning.
> * The in-memory overhead of the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off heap and
> use JNI to interact with the cache. We might want to ensure that the new implementation
> matches the existing APIs (ICache), and the implementation needs to have safe memory access,
> low memory overhead, and as few memcpys as possible.
> We might also want to make this cache configurable.
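
One way to keep a cached value's bytes off the Java heap, shown here purely as an illustration of what "off heap" means in this proposal, is a direct ByteBuffer. The class name OffHeapEntry is invented for this sketch; a fully off-heap implementation as the ticket proposes would also move the keys and the LRU bookkeeping into native memory (e.g. via JNI), which this toy does not attempt.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: the serialized bytes live outside the
// garbage-collected heap, so a large cache does not inflate GC work.
final class OffHeapEntry
{
    private final ByteBuffer buffer; // direct buffer: native memory

    OffHeapEntry(byte[] serialized)
    {
        buffer = ByteBuffer.allocateDirect(serialized.length);
        buffer.put(serialized);
        buffer.flip(); // make the written bytes readable
    }

    byte[] read()
    {
        // copy back on heap for deserialization (one unavoidable memcpy);
        // duplicate() so concurrent readers don't disturb position/limit
        byte[] out = new byte[buffer.remaining()];
        buffer.duplicate().get(out);
        return out;
    }
}
```
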



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
