cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6271) Replace SnapTree in AtomicSortedColumns
Date Tue, 31 Dec 2013 17:24:51 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859581#comment-13859581
] 

Benedict commented on CASSANDRA-6271:
-------------------------------------

Ah, and I just realised I made a major mistake in my parent-pointer approach, which further
damages the idea, probably a show stopper:

We can't re-use *any* sub-trees if we use parent pointers when modifying a btree, as the old
btree needs to retain its prior parent pointers. So a single column modification will need
to rewrite the entire btree. We could reduce the impact of this by allocating wrapper objects
that just contain the array and the parent pointer, but then we're getting into larger memory
costs and more dereference costs when traversing. Also, further polluting the code that navigates
with dealing with this extra level of indirection. Now that I think about it, I think this
is why I ruled them out in the first place.

> Replace SnapTree in AtomicSortedColumns
> ---------------------------------------
>
>                 Key: CASSANDRA-6271
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6271
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>         Attachments: oprate.svg
>
>
> On the write path a huge percentage of time is spent in GC (>50% in my tests, if accounting
for slow down due to parallel marking). SnapTrees are both GC unfriendly due to their structure
and also very expensive to keep around - each column name in AtomicSortedColumns uses >
100 bytes on average (excluding the actual ByteBuffer).
> I suggest using a sorted array; changes are supplied at-once, as opposed to one at a
time, and if < 10% of the keys in the array change (and data equal to < 10% of the size
of the key array) we simply overlay a new array of changes only over the top. Otherwise we
rewrite the array. This method should ensure much less GC overhead, and also save approximately
80% of the current memory overhead.
> TreeMap is similarly difficult object for the GC, and a related task might be to remove
it where not strictly necessary, even though we don't keep them hanging around for long. TreeMapBackedSortedColumns,
for instance, seems to be used in a lot of places where we could simply sort the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message