cassandra-commits mailing list archives

From "Yuki Morishita (Updated) (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-3442) TTL histogram for sstable metadata
Date Tue, 06 Mar 2012 19:09:58 GMT


Yuki Morishita updated CASSANDRA-3442:

    Attachment: 3442-track-tombstones.txt

Patch attached to track tombstones by building a histogram of their drop times.
The size-tiered compaction strategy uses this histogram to estimate the fraction of droppable
tombstones in an sstable at compaction time, and performs a single-sstable compaction if that
fraction exceeds a threshold.
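
To illustrate the droppable-fraction check described above, here is a minimal sketch. It is not
the code from the attached patch: the class name TombstoneDropTimes, its methods, and the use of
an exact TreeMap in place of a real histogram are all assumptions made for illustration, and the
threshold is whatever the caller passes in.

```java
import java.util.TreeMap;

/** Hypothetical sketch, not the attached patch. */
public class TombstoneDropTimes {
    // Drop time (seconds since epoch) -> number of tombstones droppable at that time.
    // An exact map stands in for the histogram kept in the sstable metadata.
    private final TreeMap<Integer, Long> dropTimes = new TreeMap<>();

    /** Record one tombstone with the given local deletion time. */
    public void add(int localDeletionTime) {
        dropTimes.merge(localDeletionTime, 1L, Long::sum);
    }

    /** Fraction of columnCount columns that are tombstones droppable strictly before gcBefore. */
    public double droppableRatio(int gcBefore, long columnCount) {
        if (columnCount <= 0) {
            return 0.0;
        }
        long droppable = 0;
        for (long n : dropTimes.headMap(gcBefore, false).values()) {
            droppable += n;
        }
        return (double) droppable / columnCount;
    }

    /** True if the sstable is a candidate for a single-sstable compaction. */
    public boolean exceedsThreshold(int gcBefore, long columnCount, double threshold) {
        return droppableRatio(gcBefore, columnCount) > threshold;
    }
}
```

With drop times 100, 200 and 300 recorded against 10 columns, gcBefore = 250 gives a ratio of
2/10, which would not exceed a threshold of 0.2 but would exceed 0.1.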

Note that the original patch overcounted ExpiringColumns inside a SuperColumn. The overall column
count is taken at the SuperColumn level, so the tombstone count should be taken at the same level.
The newer patch counts tombstones by simply checking whether local deletion time < Integer.MAX_VALUE.
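
The counting rule can be sketched as follows. This is an illustration only, not the patch
itself: TombstoneCounter and countTombstones are hypothetical names, and the input is assumed to
be one local deletion time per counted column, taken at the same level as the overall column count.

```java
/** Hypothetical sketch, not the attached patch. */
public class TombstoneCounter {
    /**
     * Counts entries whose local deletion time is set, i.e. below Integer.MAX_VALUE.
     * Assumption: Integer.MAX_VALUE marks a column that is never going to be dropped.
     */
    public static long countTombstones(int[] localDeletionTimes) {
        long count = 0;
        for (int t : localDeletionTimes) {
            if (t < Integer.MAX_VALUE) {
                count++;
            }
        }
        return count;
    }
}
```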

I also rewrote the unit test to simply create one sstable with tombstones and let it get compacted.

> TTL histogram for sstable metadata
> ----------------------------------
>                 Key: CASSANDRA-3442
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Yuki Morishita
>            Priority: Minor
>              Labels: compaction
>             Fix For: 1.2
>         Attachments: 3442-track-tombstones.txt, 3442-v3.txt, cassandra-1.1-3442.txt
>
> Under size-tiered compaction, you can generate large sstables that compact infrequently.
> With expiring columns mixed in, we could waste a lot of space in this situation.
> If we kept a TTL EstimatedHistogram in the sstable metadata, we could do a single-sstable
> compaction against sstables with over 20% (?) expired data.
