cassandra-user mailing list archives

From Peter Schuller
Subject Re: Mass deletion -- slowing down
Date Mon, 14 Nov 2011 00:22:13 GMT
Deletions in Cassandra imply the use of tombstones, and under some
circumstances reads can become O(n) with respect to the number of
columns deleted. It sounds like this is what you're seeing.

For example, suppose you're inserting a range of columns into a row,
deleting it, and inserting another non-overlapping subsequent range.
Repeat that a bunch of times. In terms of what's stored in Cassandra
for the row you now have:

   [tombstone] [tombstone] ... [tombstone] actual data

If you then do something like a slice on that row with the end-points
being such that they include all the tombstones, Cassandra essentially
has to read through and process all those tombstones. (For the
PostgreSQL-aware: this is similar to the effect you can get when
implementing e.g. a FIFO queue, where MIN(pos) turns O(n) with respect
to the number of deleted entries since the last vacuum; this has been
improved in modern versions.)
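
To make the effect concrete, here is a rough toy model in Python. It is
not Cassandra's API or storage format; ToyRow and run_queue_pattern are
made-up names for illustration, and the only assumption baked in is that
a delete leaves a marker behind that a slice has to step over.

```python
import bisect


class ToyRow:
    """Toy model of one wide row: sorted column names, each live or tombstoned.

    Hypothetical illustration only; this is not how Cassandra stores data.
    """

    def __init__(self):
        self.names = []   # column names, kept sorted
        self.live = {}    # name -> True if live, False if tombstoned

    def insert(self, name):
        if name not in self.live:
            bisect.insort(self.names, name)
        self.live[name] = True

    def delete(self, name):
        # A delete does not remove the entry; it leaves a tombstone marker.
        if name not in self.live:
            bisect.insort(self.names, name)
        self.live[name] = False

    def slice(self, start, end, limit):
        """Return (live columns in [start, end], entries scanned), up to `limit` columns."""
        i = bisect.bisect_left(self.names, start)
        result, scanned = [], 0
        while i < len(self.names) and self.names[i] <= end and len(result) < limit:
            scanned += 1
            if self.live[self.names[i]]:
                result.append(self.names[i])
            i += 1
        return result, scanned


def run_queue_pattern(batches, batch_size):
    """Insert a range, delete it, move on to the next range; leave one final live range."""
    row = ToyRow()
    next_col = 0
    for _ in range(batches):
        cols = range(next_col, next_col + batch_size)
        for c in cols:
            row.insert(c)
        for c in cols:
            row.delete(c)
        next_col += batch_size
    for c in range(next_col, next_col + batch_size):
        row.insert(c)
    return row, next_col  # next_col is the first live column


if __name__ == "__main__":
    row, first_live = run_queue_pattern(batches=10, batch_size=100)
    print(row.slice(0, 10**9, limit=1))           # slice starting before the tombstones
    print(row.slice(first_live, 10**9, limit=1))  # slice starting at the live data
```

With 10 batches of 100 columns, the slice that starts at the beginning
of the row steps over 1000 tombstones before it finds the first live
column, while the slice that starts at the first live column touches a
single entry. In this toy model, as in the scenario described above,
keeping track of where the live data starts and using that as the slice
start is what avoids the scan.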

/ Peter Schuller (@scode)
