cassandra-commits mailing list archives

From "Michael Kjellman (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions
Date Wed, 24 Aug 2016 05:57:20 GMT


Michael Kjellman commented on CASSANDRA-9754:

So, I'm mostly done with a trunk version of the patch, but I'm currently focusing on
finishing and polishing the 2.1-based version. Although the abstraction of the index is almost
a total rewrite between 2.1 and trunk, the tree itself and the Birch implementation should
remain the same, so this certainly isn't wasted time for anyone. :) I've cleaned up the implementation
a bunch, taken care of a bunch of TODOs and low-hanging fruit, added more documentation, and
pushed it to GitHub to make it a bit easier to make sure the changes apply cleanly.

The following 4 unit tests (out of 1184) are still failing (so close!):
 * org.apache.cassandra.cql3.KeyCacheCqlTest (2 of 2). Need to talk to [~aweisberg] to understand
exactly what these unit tests are testing.
 * org.apache.cassandra.db.ColumnFamilyStoreTest (2 of 38, both related to secondary indexes)

Tomorrow, I hope to push a patch addressing the feedback from [~barnie] (see the comment
above), along with any changes that come out of the code review currently underway by [~jasobrown]
and [~kohlisankalp]. I also need/want to do some work to feel more comfortable with the upgrade/backwards
compatibility story and to make sure there is good unit test coverage around it.

[~jjirsa] if you get a chance to take a look, please let me know if you have any initial feedback.
That would be awesome!

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>                 Key: CASSANDRA-9754
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>             Fix For: 4.x
>         Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects are IndexInfo
> and its ByteBuffers. This is especially bad in endpoints with large CQL partitions. If a CQL
> partition is, say, 6.4GB, it will have 100K IndexInfo objects and 200K ByteBuffers. This will
> create a lot of churn for GC. Can this be improved by not creating so many objects?
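
For reference, the 100K / 200K figures in the description can be reproduced with some back-of-the-envelope arithmetic. The sketch below is only an illustration, not code from the patch: it assumes one IndexInfo entry per 64 KB of partition data (treated as 64,000 bytes so the round numbers match) and two ByteBuffers per entry (first and last name in the indexed block). The class, constants, and method names are hypothetical.

```java
// Rough estimate of index entries for a large partition.
// Not Cassandra code; every name here is made up for illustration.
final class IndexEntryEstimate {
    // Assumption: one IndexInfo entry per 64 KB of partition data
    // (decimal KB, to match the round numbers in the issue description).
    static final long ASSUMED_INDEX_INTERVAL_BYTES = 64_000L;

    // Assumption: each IndexInfo holds two ByteBuffers
    // (first and last name covered by the indexed block).
    static final int ASSUMED_BUFFERS_PER_ENTRY = 2;

    // Number of index entries needed to cover the partition, rounded up.
    static long indexEntries(long partitionBytes, long intervalBytes) {
        return (partitionBytes + intervalBytes - 1) / intervalBytes;
    }

    // Number of ByteBuffers referenced by that many entries.
    static long byteBuffers(long entries) {
        return entries * ASSUMED_BUFFERS_PER_ENTRY;
    }

    public static void main(String[] args) {
        long partitionBytes = 6_400_000_000L; // 6.4GB, decimal
        long entries = indexEntries(partitionBytes, ASSUMED_INDEX_INTERVAL_BYTES);
        // prints "100000 IndexInfo entries, 200000 ByteBuffers"
        System.out.println(entries + " IndexInfo entries, "
                + byteBuffers(entries) + " ByteBuffers");
    }
}
```

Under these assumptions the object count grows linearly with partition size, which is why the heap pressure shows up on endpoints with very large partitions.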

This message was sent by Atlassian JIRA
