cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Fabien Rousseau (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-5677) Performance improvements of RangeTombstones/IntervalTree
Date Thu, 04 Jul 2013 14:33:49 GMT


Fabien Rousseau commented on CASSANDRA-5677:

I had a quick look at the patch (I'll take more time to review it next week), and being faster
is really great. Keeping only the latest range tombstone (In fact, in our use case, we often
overwrite range tombstones) was something I also add in mind but kept it for later optimization
: I wrongly assumed that they were kept for a real reason (like repair for example).

I definitely think your approach is better and performance numbers confirms it.
I really think that this patch should be available for 1.2 (either as a patch to apply, or
directly in 1.2.X).
If in 1.2.X, an option could be added in cassandra.yaml file to switch of implementations
(just rapidly checked and seen that ondisk format seems compatible...)
In 1.2.X : make the current implementation the default to avoid introducing too many changes
(but users having performance trouble could still switch after doing some tests. Also note
that it should be possible to log something if more than X range tombstones are read and that
it is advised to switch of implementation...)
In 2.0 : make the RangeTombstoneList the default
(It is just raw ideas...)

I can try to rebase your patch for 1.2 next week if you're interested...
> Performance improvements of RangeTombstones/IntervalTree
> --------------------------------------------------------
>                 Key: CASSANDRA-5677
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Fabien Rousseau
>            Priority: Minor
>         Attachments: 5677-new-IntervalTree-implementation.patch
> Using massively range tombstones leads to bad response time (ie 100-500 ranges tombstones
per row).
> After investigation, it seems that the culprit is how the DeletionInfo are merged. Each
time a RangeTombstone is added into the DeletionInfo, the whole IntervalTree is rebuilt (thus,
if you have 100 tombstones in one row, then 100 instances of IntervalTree are created, the
first one having one interval, the second one 2 intervals,... the 100th one : 100 intervals...)
> It seems that once the IntervalTree is built, it is not possible to add a new Interval.
Idea is to change the implementation of the IntervalTree by another one which support "insert
> Attached is a proposed patch which :
>  - renames the IntervalTree implementation to IntervalTreeCentered (the renaming is inspired
from :
>  - adds a new implementation IntervalTreeAvl (which is described here :
and here : )
>  - adds a new interface IIntervalTree to abstract the implementation
>  - adds a new configuration option (interval_tree_provider) which allows to choose between
the two implementations (defaults to previous IntervalTreeCentered)
>  - updates IntervalTreeTest unit tests to test both implementations
>  - creates a mini benchmark between the two implementations (tree creation, point lookup,
interval lookup)
>  - creates a mini benchmark between the two implementations when merging DeletionInfo
(which shows a big performance improvement when using 500 tombstones for a row)
> This patch applies for 1.2 branch...

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message