In my experience, leveled compaction makes space reclamation after deletes even less predictable than size-tiered compaction.
On 2012-11-08, at 1:12 PM, B. Todd Burruss <email@example.com> wrote:
> we are having the problem where we have huge SSTABLEs with tombstoned data in them that is not being compacted soon enough (because size tiered compaction requires, by default, 4 like sized SSTABLEs). this is using more disk space than we anticipated.
> we are very write heavy compared to reads, and we delete the data after N number of days (depends on the column family, but N is around 7 days)
> my question is would leveled compaction help to get rid of the tombstoned data faster than size tiered, and therefore reduce the disk space usage
The reason is that deletes, like all mutations, are just recorded into sstables. They enter level0, and get slowly, over time, promoted upwards to levelN.
Depending on your *total* mutation volume versus your data set size, this may be quite a slow process. It gets even worse when the data you're deleting is large (say, an entire row worth several hundred kilobytes) and the delete itself is just a small row-level tombstone. If the row is sitting in level 4, the tombstone won't affect it until enough new data has pushed the tombstone up through all the existing data in level0, level1, level2 and level3.
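To get a feel for how slow that can be, here is a very rough back-of-envelope sketch in Python. It is not how Cassandra schedules compactions. It assumes each level N >= 1 nominally holds fanout**N sstables with a fanout of 10, and the sstable size and function names are purely illustrative. It only adds up how much data nominally sits in the levels a tombstone has to climb through before it meets a row in the target level.

```python
MB = 1024 ** 2


def level_capacity_bytes(level, sstable_bytes, fanout=10):
    """Assumed nominal capacity of a level >= 1: fanout**level sstables."""
    return (fanout ** level) * sstable_bytes


def bytes_below(target_level, sstable_bytes, fanout=10):
    """Nominal data held in levels 1 .. target_level - 1.

    Treat this as an order-of-magnitude figure for how much data has to
    turn over before a new tombstone can meet a row in target_level.
    """
    return sum(
        level_capacity_bytes(level, sstable_bytes, fanout)
        for level in range(1, target_level)
    )


if __name__ == "__main__":
    # Illustrative sstable size only; use whatever your column family is set to.
    sstable_bytes = 5 * MB
    print(bytes_below(4, sstable_bytes) / MB, "MB sit below level 4")
```

With a small sstable size the number looks harmless, but it scales linearly with sstable size and by roughly the fanout for every extra level, so a deep tree on a node with modest write volume can take a long time to turn over.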
Finally, to guard against the tombstone missing any data, the tombstone itself is not a candidate for removal (I believe even after gc_grace has passed) unless it has reached the highest populated level in leveled compaction. This means that if you have 4 levels and issue a ton of deletes (even deletes that will never affect existing data), these tombstones are dead weight that cannot be purged until they hit level4.
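Put as a small Python sketch, the rule as I understand it looks like the following. This is a toy model of the behaviour described above, not Cassandra's actual code, and the function and parameter names are made up for illustration.

```python
def tombstone_purgeable(deleted_at, now, gc_grace_seconds,
                        tombstone_level, highest_populated_level):
    """Toy model: a tombstone can be dropped only if gc_grace has elapsed
    AND it already sits in the highest populated level."""
    grace_passed = (now - deleted_at) >= gc_grace_seconds
    at_top_level = tombstone_level >= highest_populated_level
    return grace_passed and at_top_level


if __name__ == "__main__":
    ten_days = 864000  # default gc_grace in seconds
    # Well past gc_grace, but the tombstone is still in level 1 of 4: stuck.
    print(tombstone_purgeable(0, 2 * ten_days, ten_days, 1, 4))
    # Same age, now in level 4: eligible to be purged.
    print(tombstone_purgeable(0, 2 * ten_days, ten_days, 4, 4))
```

The point is that lowering gc_grace alone does not help under this model, because the level condition is usually the one holding the tombstone back.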
For a write-heavy workload, I recommend you stick with size-tiered compaction. You have several options at your disposal (compaction min/max thresholds, gc_grace) to move things along. If that doesn't help, I've heard of some fairly reputable people doing some fairly blasphemous things (major compactions every night).