cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions
Date Sat, 29 Aug 2015 09:47:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721034#comment-14721034 ]

Robert Stupp commented on CASSANDRA-9754:
-----------------------------------------

I think we definitely need better data structures, since RIE is a good fit for neither KC nor
index. That's the point of this ticket, CASSANDRA-8931 and CASSANDRA-9843, and the pitfall in
the current WIP in CASSANDRA-9738.
I don't fully agree with relying on the page cache, because its granularity (4 kB, I think) might
be too coarse for keys. But that depends on the actual data structure, e.g. grouping "hot"
keys per page, which conflicts with immutable sstables.
Another point is the effort to move to a thread-per-core model, with distinct and independent
data structures per thread and no barriers/locks/whatever, whereas the page cache is a shared resource.
The next thing is hot and cold data, i.e. we could use bigger intervals (column_index_size_in_kb
in current terminology) for cold data.
To be clear: I'm not against the page cache or anything, I just want to note what I think may influence new work.

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9754
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects are IndexInfo
and its ByteBuffers. This is especially bad on endpoints with large CQL partitions. If a CQL
partition is, say, 6.4 GB, it will have 100K IndexInfo objects and 200K ByteBuffers. This will
create a lot of churn for GC. Can this be improved by not creating so many objects?
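
The object counts quoted in the description can be reproduced with a rough back-of-the-envelope estimate. The sketch below is not Cassandra code; the function names are hypothetical, and it assumes one IndexInfo per column_index_size_in_kb of partition data (64 KB being the usual default) and two ByteBuffers per IndexInfo, which matches the 100K/200K ratio given in the ticket.

```python
# Rough estimate only; names and model are illustrative assumptions,
# not Cassandra's actual implementation.

def index_entries(partition_bytes: int, column_index_size_in_kb: int = 64) -> int:
    """Approximate number of IndexInfo objects for one partition.

    Assumes one index entry per `column_index_size_in_kb` of partition data.
    """
    interval_bytes = column_index_size_in_kb * 1024
    # Ceiling division without floating point.
    return -(-partition_bytes // interval_bytes)


def byte_buffers(partition_bytes: int, column_index_size_in_kb: int = 64) -> int:
    """Assumes two ByteBuffers per IndexInfo (the ratio quoted in the ticket)."""
    return 2 * index_entries(partition_bytes, column_index_size_in_kb)


if __name__ == "__main__":
    partition = 6_400_000_000  # 6.4 GB, decimal
    for interval_kb in (64, 256):
        print(interval_kb,
              index_entries(partition, interval_kb),
              byte_buffers(partition, interval_kb))
```

With the 64 KB interval this gives roughly 98K entries and 195K buffers for a 6.4 GB partition, in line with the figures in the description. Quadrupling the interval to 256 KB cuts both counts to about a quarter, which is the effect the comment's suggestion of bigger intervals for cold data relies on.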



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
