incubator-cassandra-user mailing list archives

From Mike Smith <>
Subject Re: Does a scrub remove deleted/expired columns?
Date Fri, 14 Dec 2012 03:45:00 GMT
Thanks for the great explanation.

I'd just like some clarification on the last point. Is it the case that if
I constantly add new columns to a row, while periodically trimming the row
by deleting the oldest columns, the deleted columns won't get cleaned up
until all fragments of the row exist in a single sstable and that sstable
undergoes a compaction?

If my understanding is correct, do you know if 1.2 will enable cleanup of
columns in rows that have scattered fragments? Or, should I take a
different approach?
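
To make the rule in question concrete, here is a minimal toy model of it in Python. This is only a sketch of the behaviour described in the quoted explanation below, not Cassandra's actual implementation: the `compact` function, the `TOMBSTONE` marker, and the dict-based "sstables" are all hypothetical names made up for illustration. It also ignores timestamps, gc_grace_seconds, and everything else a real compaction considers.

```python
# Toy model only: an "sstable" is a dict of row_key -> {column: value}.
# TOMBSTONE is a hypothetical marker standing in for a deleted column.
TOMBSTONE = None


def compact(sstables, chosen):
    """Merge the sstables named in `chosen` into one new table.

    Tombstones are dropped for a row only if no sstable outside
    `chosen` holds a fragment of that row.
    """
    merged = {}
    # Assumption: names sort oldest-first, so later names overwrite earlier ones.
    for name in sorted(chosen):
        for row_key, columns in sstables[name].items():
            merged.setdefault(row_key, {}).update(columns)

    for row_key, columns in merged.items():
        has_outside_fragment = any(
            row_key in table
            for name, table in sstables.items()
            if name not in chosen
        )
        if not has_outside_fragment:
            # Every fragment of the row took part, so tombstones can be purged.
            for column in [c for c, v in columns.items() if v is TOMBSTONE]:
                del columns[column]
    return merged


sstables = {
    "sstable-1": {"row": {"col-a": "1", "col-b": "2"}},
    "sstable-2": {"row": {"col-a": TOMBSTONE, "col-c": "3"}},
    "sstable-3": {"row": {"col-d": "4"}},
}

# A fragment of "row" still lives in sstable-3, so the tombstone survives.
partial = compact(sstables, {"sstable-1", "sstable-2"})

# All fragments of "row" take part, so the tombstone is purged.
full = compact(sstables, {"sstable-1", "sstable-2", "sstable-3"})
```

In this model the deleted column is still carried as a tombstone after the partial compaction and only disappears once every file holding part of the row is compacted together, which is the situation I'm worried about for rows that keep receiving new columns.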

On Thu, Dec 13, 2012 at 5:52 PM, aaron morton <> wrote:

>  Is it possible to use scrub to accelerate the clean up of expired/deleted
> data?
> No.
> Scrub, and upgradesstables, are used to re-write each file on disk. Scrub
> may remove some rows from a file because of corruption, however
> upgradesstables will not.
> If you have long-lived rows and a mixed workload of writes and deletes
> there are a couple of options.
> You can try levelled compaction.
> You can tune the default size-tiered compaction by increasing the
> min_compaction_threshold. This will increase the number of files that must
> exist in each size tier before it will be compacted. As a result the speed
> at which rows move into the higher tiers will slow down.
> Note that having lots of files may have a negative impact on read
> performance. You can measure this by looking at the SSTables per read
> metric in the cfhistograms.
> Lastly you can run a user defined or major compaction. User defined
> compaction is available via JMX and allows you to compact any file you
> want. Manual / major compaction is available via nodetool. We usually
> discourage its use as it will create one big file that will not get
> compacted for a while.
> For background, the tombstones / expired columns for a row are only purged
> from the database when all fragments of the row are in the files being
> compacted. So if you have an old row that is spread out over many files it
> may not get purged.
> Hope that helps.
>    -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> @aaronmorton
> On 14/12/2012, at 3:01 AM, Mike Smith <> wrote:
> I'm using 1.0.12 and I find that large sstables tend to get compacted
> infrequently. I've got data that gets deleted or expired frequently. Is it
> possible to use scrub to accelerate the clean up of expired/deleted data?
> --
> Mike Smith
> Director Development, MailChannels

Mike Smith
Director Development, MailChannels
