cassandra-commits mailing list archives

From "Jack Krupansky (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions
Date Mon, 11 Apr 2016 14:15:25 GMT


Jack Krupansky commented on CASSANDRA-9754:

Any idea how a new wide partition will perform relative to the same amount of data and same
number of clustering rows divided into bucketed partitions? For example, a single 1 GB wide
partition vs. ten 100 MB partitions (same partition key plus a 0-9 bucket number) vs. a hundred
10 MB partitions (0-99 bucket number), for two access patterns: 1) random access to a row or
short slice, and 2) a full bulk read of the 1 GB of data, one moderate slice at a time.

Or maybe the question is equivalent to asking what the cost is to access the last row of the
1 GB partition vs. the last row of the tenth or hundredth bucket of the bucketed equivalent.

No precision required. Just inquiring whether we can get rid of bucketing as a preferred data
modeling strategy, at least for the common use cases where the sum of the buckets is roughly
2 GB or less.

The bucketing approach does have the side effect of distributing the buckets around the cluster,
which could be a good thing, or maybe not.
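
For reference, the bucketing scheme being compared can be sketched as below. This is only
an illustration: the helper names (bucket_for, bucketed_key, all_bucket_keys) are hypothetical,
and assigning rows to buckets by a CRC32 hash of the clustering value is an assumption, since
the comment does not say how the 0-9 or 0-99 bucket number is chosen.

```python
import zlib


def bucket_for(clustering_value: str, num_buckets: int) -> int:
    """Map a clustering value to a bucket number in [0, num_buckets).

    Assumption: buckets are chosen by a stable hash; a range-based or
    time-based assignment would work equally well for this sketch.
    """
    return zlib.crc32(clustering_value.encode("utf-8")) % num_buckets


def bucketed_key(partition_key: str, clustering_value: str, num_buckets: int) -> tuple:
    """Composite partition key: the original key plus the bucket number."""
    return (partition_key, bucket_for(clustering_value, num_buckets))


def all_bucket_keys(partition_key: str, num_buckets: int) -> list:
    """Every composite key a full bulk read would have to visit."""
    return [(partition_key, b) for b in range(num_buckets)]


if __name__ == "__main__":
    # Random access touches one bucket; a bulk read touches all of them.
    print(bucketed_key("sensor-42", "row-000123", 10))
    print(len(all_bucket_keys("sensor-42", 100)))
```

With a single wide partition, both access patterns address one partition key; with buckets,
a random read needs the bucket number first and a bulk read fans out over every bucket.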

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>                 Key: CASSANDRA-9754
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects are IndexInfo
and its ByteBuffers. This is especially bad in endpoints with large CQL partitions. If a CQL
partition is, say, 6.4 GB, it will have 100K IndexInfo objects and 200K ByteBuffers. This will
create a lot of churn for GC. Can this be improved by not creating so many objects?
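
The figures in the description can be reproduced with a short calculation. This is a rough
sketch under stated assumptions, not taken from the ticket: one IndexInfo entry per 64 KB of
partition data (matching Cassandra's default column index interval, read here in decimal
units), and two ByteBuffers per entry. The function name index_object_counts is hypothetical.

```python
def index_object_counts(partition_bytes: int,
                        index_interval_bytes: int = 64_000,
                        buffers_per_entry: int = 2) -> tuple:
    """Estimate (IndexInfo entries, ByteBuffers) for one partition.

    Assumptions: one index entry per index interval of data, and a fixed
    number of ByteBuffers held by each entry.
    """
    # Ceiling division: a partial trailing block still gets an index entry.
    entries = -(-partition_bytes // index_interval_bytes)
    return entries, entries * buffers_per_entry


if __name__ == "__main__":
    entries, buffers = index_object_counts(6_400_000_000)  # 6.4 GB partition
    print(entries, buffers)
```

Under these assumptions a 6.4 GB partition yields 100,000 entries and 200,000 buffers, in
line with the 100K and 200K quoted above; the object count grows linearly with partition size.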

