cassandra-commits mailing list archives

From "Dominic Letz (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8546) RangeTombstoneList becoming bottleneck on tombstone heavy tasks
Date Mon, 05 Jan 2015 11:58:35 GMT


Dominic Letz commented on CASSANDRA-8546:

Sorry for the assignment confusion: Tyler mentioned that you would look into it, but the issue was assigned
back to me. Since he didn't mention anything for me to do, I assumed that was an error and that he meant
to assign it to you.

Good call on BTree vs. GapList. The BTree does guarantee O(log N), and you get that bound in
all cases. GapList, on the other hand, gives O(1) insert at the front and back, and O(1) amortized insert
in the middle when successive inserts share locality - and its insertFrom() does exactly that on overlapping ranges.
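
As an illustration of the trade-off (not Cassandra code - just stdlib collections standing in for the two shapes of data structure): an array-backed list pays O(size) per front/middle insert because it shifts elements, while a balanced tree pays O(log N) per insert no matter where the key lands.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class InsertCost {
    // Array-backed list: add(0, e) shifts every existing element,
    // so N front inserts cost O(N^2) element moves in total.
    static List<Integer> buildByFrontInsert(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = n; i >= 1; i--)
            list.add(0, i);          // each call is O(size)
        return list;
    }

    // Balanced tree: every insert is O(log N) regardless of where
    // the key lands, so the same build is O(N log N) in total.
    static TreeMap<Integer, Integer> buildByTreeInsert(int n) {
        TreeMap<Integer, Integer> tree = new TreeMap<>();
        for (int i = n; i >= 1; i--)
            tree.put(i, i);          // each call is O(log size)
        return tree;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        buildByFrontInsert(100_000);
        System.out.printf("front inserts into ArrayList: %d ms%n",
                          (System.nanoTime() - t0) / 1_000_000);

        t0 = System.nanoTime();
        buildByTreeInsert(100_000);
        System.out.printf("inserts into TreeMap:         %d ms%n",
                          (System.nanoTime() - t0) / 1_000_000);
    }
}
```

GapList's gap buffer avoids the shift when consecutive inserts are near each other, which is why it helped here, but only the tree keeps the bound when the insert positions are adversarial.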

The real reason I went for GapList, though, was that it let me keep most of the code
unchanged, since it also supports index-based access. By now I think I know the code well
enough to replace it with a TreeMap or similar.

Let me know whether such a changed patch would be helpful.
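
To make the TreeMap idea concrete, here is a minimal sketch of the shape such a replacement could take - hypothetical names, not the actual RangeTombstoneList API, and the merge logic is deliberately simplified (a real version has to handle deletion-time precedence and partial overlaps on both sides). The point is that keying ranges by their start makes both insertion and the overlap lookup O(log N) via floorEntry, with no index-based shifting.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: tombstone ranges keyed by start bound.
public class RangeTombstones {
    static final class Range {
        final int end;
        final long markedAt;
        Range(int end, long markedAt) { this.end = end; this.markedAt = markedAt; }
    }

    private final TreeMap<Integer, Range> ranges = new TreeMap<>();

    // O(log N) insert at any position; overlap with the predecessor
    // range is found via floorEntry instead of an index scan.
    public void add(int start, int end, long markedAt) {
        Map.Entry<Integer, Range> prev = ranges.floorEntry(start);
        if (prev != null && prev.getValue().end >= start) {
            // Simplified merge: extend the overlapping predecessor and
            // keep the newer deletion time. Real code must split ranges
            // so each sub-range keeps its own markedAt.
            int mergedEnd = Math.max(prev.getValue().end, end);
            long newer = Math.max(prev.getValue().markedAt, markedAt);
            ranges.put(prev.getKey(), new Range(mergedEnd, newer));
        } else {
            ranges.put(start, new Range(end, markedAt));
        }
    }

    // True if the point falls inside some tombstone range: O(log N).
    public boolean isDeleted(int point) {
        Map.Entry<Integer, Range> prev = ranges.floorEntry(point);
        return prev != null && prev.getValue().end >= point;
    }

    public int size() { return ranges.size(); }
}
```

The same floor/ceiling navigation also gives the reverse-order slice scan for free, which is where the current index-based structure hurts most.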

> RangeTombstoneList becoming bottleneck on tombstone heavy tasks
> ---------------------------------------------------------------
>                 Key: CASSANDRA-8546
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: 2.0.11 / 2.1
>            Reporter: Dominic Letz
>            Assignee: Benedict
>             Fix For: 2.1.3
>         Attachments: cassandra-2.0.11-8546.txt, cassandra-2.1-8546.txt, tombstone_test.tgz
> I would like to propose changing the data structure used in RangeTombstoneList
to store and insert tombstone ranges to something with at least O(log N) insert in the middle
and near O(1) insert at start AND end. Here is why:
> Under tombstone-heavy workloads the current implementation of RangeTombstoneList
becomes a bottleneck for slice queries.
> Scanning through tombstones up to the default maximum (100k) can take up to 3 minutes
because of how addInternal() scales on insertion of middle and start elements.
> The attached test demonstrates this with 50k deletes from each side of a range:
> INSERT 1...110000
> flush()
> DELETE 1...50000
> DELETE 110000...60000
> One direction performs OK (~400 ms on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1
> {code}
> The other direction underperforms (~7 seconds on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1
> {code}

This message was sent by Atlassian JIRA
