cassandra-commits mailing list archives

From "Michael Kjellman (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions
Date Wed, 31 Aug 2016 04:23:20 GMT


Michael Kjellman commented on CASSANDRA-9754:

I pushed a rebased commit that addresses many additional review comments from [~jasobrown],
adds more unit tests, and further improves the documentation. This is still
2.1 based; however, the review and improvements made in the org.apache.cassandra.db.index.birch
package are agnostic to whether the patch is based on trunk or 2.1.

Some Highlights:
 * Fix a bug in KeyIterator identified by [~jjirsa] that would cause the iterator to return
nothing when the backing SegmentedFile contains exactly 1 key/segment.
 * Add unit tests for KeyIterator
 * Add SSTable version ka support to LegacySSTableTest. Actually test something in LegacySSTableTest.
 * Add additional unit tests around PageAlignedReader, PageAlignedWriter, BirchWriter, and
 * Remove word lists and refactor all unit tests to use TimeUUIDTreeSerializableIterator instead
 * Improve documentation and fix it where required so that it parses and formats properly during
javadoc creation
 * Remove reset() functionality from BirchReader.BirchIterator
 * Fix many other nits

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>                 Key: CASSANDRA-9754
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>             Fix For: 4.x
>         Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>  Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects are IndexInfo
and its ByteBuffers. This is especially bad in endpoints with large CQL partitions. If a CQL
partition is, say, 6.4 GB, it will have 100K IndexInfo objects and 200K ByteBuffers. This will
create a lot of churn for GC. Can this be improved by not creating so many objects?
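
The figures in the description above (roughly 100K IndexInfo objects and 200K ByteBuffers for a partition of about 6.4 GB) can be reproduced with a back-of-envelope calculation. The sketch below is illustrative only and is not part of the patch. It assumes one IndexInfo entry per 64 KB of partition data (the default column_index_size_in_kb) and two ByteBuffers per entry; the class and method names are hypothetical.

```java
// Rough estimate of index objects held on heap for one large partition.
// Hypothetical helper for illustration; not code from CASSANDRA-9754.
public final class IndexInfoEstimate {
    // Assumption: one IndexInfo per 64 KB of partition data
    // (the default column_index_size_in_kb setting).
    public static final long COLUMN_INDEX_SIZE_BYTES = 64L * 1024;

    // Assumption: each IndexInfo references two ByteBuffers
    // (the first and last name covered by the index block).
    public static final int BUFFERS_PER_ENTRY = 2;

    // Number of index entries, rounding up for a partial trailing block.
    public static long indexEntries(long partitionBytes) {
        return (partitionBytes + COLUMN_INDEX_SIZE_BYTES - 1) / COLUMN_INDEX_SIZE_BYTES;
    }

    // Number of ByteBuffers referenced by those entries.
    public static long byteBuffers(long partitionBytes) {
        return indexEntries(partitionBytes) * BUFFERS_PER_ENTRY;
    }

    public static void main(String[] args) {
        long partitionBytes = 6_400_000_000L; // 6.4 GB, decimal
        System.out.println("IndexInfo entries: " + indexEntries(partitionBytes));
        System.out.println("ByteBuffers:       " + byteBuffers(partitionBytes));
    }
}
```

Under these assumptions a 6.4 GB partition yields just under 100K entries and just under 200K buffers, which matches the order of magnitude reported in the issue.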

This message was sent by Atlassian JIRA
