cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8546) RangeTombstoneList becoming bottleneck on tombstone heavy tasks
Date Mon, 05 Jan 2015 13:08:36 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264565#comment-14264565
] 

Benedict commented on CASSANDRA-8546:
-------------------------------------

If CASSANDRA-8099 fixes it somehow, that's sufficient in my book. 

A BTree, for the record, can happily achieve amortized constant time insertion at the end,
if so desired. So long as your depth does not exceed the fan_factor (which is pretty much
impossible here). 

The correct data structure here is most likely a BTree, a Trie, or a hybrid such as a String
B-Tree. Not specifically for optimising this case, but in general. For 3.1 or 3.2, assuming
no other impediments appear, I expect this all to be replaced along with (or shortly after)
the updated storage format.

> RangeTombstoneList becoming bottleneck on tombstone heavy tasks
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-8546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8546
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: 2.0.11 / 2.1
>            Reporter: Dominic Letz
>            Assignee: Benedict
>             Fix For: 2.1.3
>
>         Attachments: cassandra-2.0.11-8546.txt, cassandra-2.1-8546.txt, tombstone_test.tgz
>
>
> I would like to propose a change of the data structure used in the RangeTombstoneList
to store and insert tombstone ranges to something with at least O(log N) insert in the middle
and at near O(1) and start AND end. Here is why:
> When having tombstone heavy work-loads the current implementation of RangeTombstoneList
becomes a bottleneck with slice queries.
> Scanning the number of tombstones up to the default maximum (100k) can take up to 3 minutes
of how addInternal() scales on insertion of middle and start elements.
> The attached test shows that with 50k deletes from both sides of a range.
> INSERT 1...110000
> flush()
> DELETE 1...50000
> DELETE 110000...60000
> While one direction performs ok (~400ms on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1
> {code}
> The other direction underperforms (~7seconds on my notebook)
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message