cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
Date Sun, 07 Dec 2014 15:12:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237170#comment-14237170 ]

Robert Stupp commented on CASSANDRA-7438:
-----------------------------------------

Rehashing: hm, at {{o.a.c.db.ColumnFamilyStore#getThroughCache}} (or better, in {{RowCacheKey}})
we only have the token/key, but no (good) hash for the key.

Using a 32-bit hash would save about 8 bytes per cache entry (the reference-counter field
could then be reduced from 64 bits to 32 bits while still keeping key and value data on
8-byte boundaries). But this does not seem to have any measurable effect if, e.g., jemalloc
aligns allocated memory blocks on bigger page sizes depending on the whole cache entry size
(e.g. several kB or MB).
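
To illustrate why the 8 bytes may not matter, here is a minimal sketch under a deliberately simplified allocator model: every block is rounded up to a multiple of a fixed granularity. This is an assumption for illustration only (jemalloc's real size classes are more fine-grained), and the header sizes, payload size and the name {{EntrySizeSketch}} are hypothetical, not taken from OHC.

```java
public final class EntrySizeSketch {
    // Hypothetical allocator model: round each request up to a multiple of `granularity` bytes.
    static long allocatedSize(long requested, long granularity) {
        return ((requested + granularity - 1) / granularity) * granularity;
    }

    public static void main(String[] args) {
        long payload = 3000;            // assumed serialized key + value size in bytes
        long header64 = 32;             // assumed entry header size with a 64-bit hash
        long header32 = header64 - 8;   // the ~8 bytes saved with a 32-bit hash
        long granularity = 4096;        // assumed rounding granularity

        // Both requests land in the same rounded block size under this model.
        System.out.println(allocatedSize(header64 + payload, granularity));
        System.out.println(allocatedSize(header32 + payload, granularity));
    }
}
```

Under this model the saving only shows up when an entry happens to sit within 8 bytes of a rounding boundary.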

OHC always calculates its own murmur3 hash using the serialized cache key. I _hope_ to achieve
a better distribution across segments and buckets by using 64 bits, but I am not sure about this.
My preference for using 64 hash bits is basically "it feels better".
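
A rough sketch of the idea, not OHC's actual code: with 64 hash bits, the segment can be picked from the upper bits and the bucket from the lower bits without the two overlapping, even for large tables. The split shown here, the power-of-two counts, and all names ({{HashSplitSketch}}, {{segmentIndex}}, {{bucketIndex}}) are assumptions for illustration. {{hash64}} is a simple stand-in built on the MurmurHash3 64-bit finalizer, not the full murmur3 algorithm.

```java
import java.nio.charset.StandardCharsets;

public final class HashSplitSketch {
    // MurmurHash3 64-bit finalizer (fmix64), used here only as a bit mixer.
    static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Stand-in 64-bit hash over the serialized key bytes (NOT full murmur3).
    static long hash64(byte[] serializedKey) {
        long h = 0x9E3779B97F4A7C15L;
        for (byte b : serializedKey) {
            h = fmix64(h ^ (b & 0xffL));
        }
        return h;
    }

    // Assumption: segmentCount is a power of two; the upper hash bits select the segment.
    static int segmentIndex(long hash, int segmentCount) {
        int bits = Integer.numberOfTrailingZeros(segmentCount);
        return bits == 0 ? 0 : (int) (hash >>> (64 - bits));
    }

    // Assumption: bucketCount is a power of two; the lower hash bits select the bucket.
    static int bucketIndex(long hash, int bucketCount) {
        return (int) (hash & (bucketCount - 1));
    }

    public static void main(String[] args) {
        byte[] key = "example-row-key".getBytes(StandardCharsets.UTF_8);
        long hash = hash64(key);
        System.out.println("segment = " + segmentIndex(hash, 16));
        System.out.println("bucket  = " + bucketIndex(hash, 1 << 20));
    }
}
```

With a 32-bit hash the same scheme leaves fewer independent bits: for example, 4 segment bits plus 20 bucket bits already use 24 of the 32, while 64 bits leave plenty of headroom. Whether that yields a measurably better distribution is exactly the open question above.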

> Serializing Row cache alternative (Fully off heap)
> --------------------------------------------------
>
>                 Key: CASSANDRA-7438
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux
>            Reporter: Vijay
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap; keys are still stored in JVM heap as BB.
> * There is a higher GC cost for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better results, but
> this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off heap and
> use JNI to interact with the cache. We might want to ensure that the new implementation
> matches the existing APIs (ICache), and the implementation needs to have safe memory
> access, low memory overhead and fewer memcpys (as much as possible).
> We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
