cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
Date Wed, 31 Dec 2014 22:42:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262472#comment-14262472 ]

Ariel Weisberg commented on CASSANDRA-7438:
-------------------------------------------

I have an in-progress response to your earlier comment; I'll address the benchmark here.
I wouldn't sweat allocator performance. Ultimately we will have to have our own, if only to
accurately enforce memory utilization (the user asks for 200 megabytes, we use 400: not cool).
I think the blueprint for how to allocate and defragment already exists in something like
memcached; we just need to adapt it to our approach, where the cache is a pool of independently
locked hash tables.
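The "pool of independently locked hash tables" idea above can be sketched roughly as follows. This is a hypothetical illustration, not the actual patch: values are kept on heap here for simplicity, where the real implementation would hold off-heap addresses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of a cache split into independently locked segments, so threads
// whose keys hash to different segments never contend on the same lock.
final class StripedCache<K, V> {
    private final int segmentCount;
    private final Map<K, V>[] segments;
    private final ReentrantLock[] locks;

    @SuppressWarnings("unchecked")
    StripedCache(int segmentCount) {
        this.segmentCount = segmentCount;
        this.segments = new Map[segmentCount];
        this.locks = new ReentrantLock[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new HashMap<>();
            locks[i] = new ReentrantLock();
        }
    }

    private int segmentFor(K key) {
        // Spread the high bits down so nearby keys land in different segments.
        int h = key.hashCode();
        h ^= (h >>> 16);
        return (h & 0x7fffffff) % segmentCount;
    }

    V get(K key) {
        int s = segmentFor(key);
        locks[s].lock();
        try {
            return segments[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }

    void put(K key, V value) {
        int s = segmentFor(key);
        locks[s].lock();
        try {
            segments[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }
}
```

The number of stripes trades memory and lock objects for reduced contention, which is exactly what the scaling test discussed below would help tune.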

The overhead of copying is where zero-copy deserialization and ref-counting start to be a win,
since you don't have to copy at all. I wouldn't get worked up about optimizing for that yet,
since it requires upstream to be smarter about how it uses the cache. If upstream can parse the
cache value and extract a subset without copying the entire thing, it will handle larger values
more gracefully. At some point upstream might also hold partial rows.
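To make the ref-counting point concrete, here is a hypothetical sketch of a pinned cache entry: readers acquire a reference, parse only the bytes they need in place, then release, and the backing memory is reclaimed only when the last reference drops. A heap ByteBuffer stands in for the off-heap region.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a ref-counted cache entry that readers can parse in place,
// avoiding a defensive copy of the whole serialized value.
final class RefCountedEntry {
    private final ByteBuffer serialized;                      // stand-in for an off-heap region
    private final AtomicInteger refs = new AtomicInteger(1);  // the cache itself holds one ref

    RefCountedEntry(ByteBuffer serialized) {
        this.serialized = serialized;
    }

    // Pin the entry; returns false if the memory was already freed.
    boolean acquire() {
        for (;;) {
            int r = refs.get();
            if (r == 0) return false;
            if (refs.compareAndSet(r, r + 1)) return true;
        }
    }

    void release() {
        if (refs.decrementAndGet() == 0)
            free();
    }

    // Read one 4-byte field at a given offset without copying the value.
    int readInt(int offset) {
        return serialized.getInt(offset);
    }

    private void free() {
        // With real off-heap memory this would return the region to the allocator.
    }
}
```

The key property is that a reader holding a reference can extract a subset of the value while other threads concurrently evict the entry; the memory outlives the eviction until the last release.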

I would like to see the ability to spin all cores against the cache, at least for relatively
small values. Not being able to do that is a little concerning. Are threads blocking inside
the allocator? Do the utilization issues occur with large or small values?

I don't have a real baseline to say whether these numbers are good or bad. They sound okay, and
as you say, you would expect the allocator to be one of the slowest parts. I am not sure testing
with 500 threads is realistic, since threads then have a pretty good chance of being descheduled
while holding a lock, which isn't as likely to happen under real usage conditions. I would test
with, say, 30 threads on that hardware.

For, say, 16k values, measuring scaling from 1 to 30 threads would give us an idea of how well
things are going. That would also give you better feedback on whether different numbers of
stripes help or not.
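The scaling test suggested above could look roughly like the following. It is a sketch, not a rigorous benchmark (a real one would use JMH, warm-up, and the actual off-heap cache; a ConcurrentHashMap stands in here): hammer 16 KB values from 1 to 30 threads and report ops/sec per thread count.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a read-throughput scaling test: 16 KB values, 1..30 threads.
public class ScalingBench {
    static final int VALUE_SIZE = 16 * 1024;
    static final int KEYS = 1024;

    public static void main(String[] args) throws Exception {
        ConcurrentMap<Integer, byte[]> cache = new ConcurrentHashMap<>();
        byte[] value = new byte[VALUE_SIZE];
        for (int k = 0; k < KEYS; k++) cache.put(k, value);

        for (int threads = 1; threads <= 30; threads++) {
            long opsPerSec = run(cache, threads, 500);
            System.out.printf("%2d threads: %,d ops/sec%n", threads, opsPerSec);
        }
    }

    // Run `threads` readers against the cache for `millis` ms; return ops/sec.
    static long run(ConcurrentMap<Integer, byte[]> cache, int threads, long millis)
            throws InterruptedException {
        AtomicLong ops = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try { start.await(); } catch (InterruptedException e) { return; }
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                while (!Thread.currentThread().isInterrupted()) {
                    cache.get(rnd.nextInt(KEYS));
                    ops.incrementAndGet();
                }
            });
        }
        start.countDown();       // release all readers at once
        Thread.sleep(millis);
        pool.shutdownNow();      // interrupt readers to stop the measurement
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ops.get() * 1000 / millis;
    }
}
```

If throughput stops scaling well before 30 threads, that points at lock contention (or the allocator), and varying the stripe count should move the knee of the curve.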

> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap; keys are still stored in the JVM heap as ByteBuffers.
> * There is a higher GC cost for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with the cache. We might want to ensure that the new implementation matches the existing APIs (ICache), and the implementation needs to have safe memory access, low memory overhead, and as few memcpys as possible.
> We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
