cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3855) RemoveDeleted dominates compaction time for large sstable counts
Date Fri, 20 Jul 2012 08:31:35 GMT


Sylvain Lebresne commented on CASSANDRA-3855:

Agreed that it is wrong, but I think more than just the first line is wrong. I think
the method should be:
public boolean hasIrrelevantData(int gcBefore)
{
    if (deletionInfo().isLive())
        return false;

    // Do we have gcable deletion infos?
    if (!deletionInfo().purge(gcBefore).equals(deletionInfo()))
        return true;

    // Do we have columns that are either deleted by the container or gcable tombstones?
    for (IColumn column : columns)
        if (deletionInfo().isDeleted(column) || column.hasIrrelevantData(gcBefore))
            return true;

    return false;
}
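
For illustration only, here is a self-contained sketch of the same three checks that can be run outside Cassandra. The `Deletion` and `Column` classes are hypothetical, simplified stand-ins (not the real `DeletionInfo` or `IColumn`), and the rule "gcable means localDeletionTime < gcBefore" is an assumption made for this sketch; the `purge(...).equals(...)` comparison from the snippet is approximated by `isGcable`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model of the proposed check; not Cassandra's actual classes.
class IrrelevantDataSketch {

    /** Stand-in for a container-level deletion (a single top-level tombstone). */
    static final class Deletion {
        static final Deletion LIVE = new Deletion(Long.MIN_VALUE, Integer.MAX_VALUE);

        final long markedForDeleteAt;  // write timestamp of the deletion
        final int localDeletionTime;   // seconds, compared against gcBefore

        Deletion(long markedForDeleteAt, int localDeletionTime) {
            this.markedForDeleteAt = markedForDeleteAt;
            this.localDeletionTime = localDeletionTime;
        }

        boolean isLive() { return markedForDeleteAt == Long.MIN_VALUE; }

        // Assumption: a deletion is gcable once its local deletion time is before gcBefore.
        boolean isGcable(int gcBefore) { return !isLive() && localDeletionTime < gcBefore; }

        // A column is shadowed if it was written at or before the deletion timestamp.
        boolean isDeleted(Column column) {
            return !isLive() && column.timestamp <= markedForDeleteAt;
        }
    }

    /** Stand-in for a column; a tombstone column carries its own local deletion time. */
    static final class Column {
        final long timestamp;
        final boolean tombstone;
        final int localDeletionTime;

        Column(long timestamp, boolean tombstone, int localDeletionTime) {
            this.timestamp = timestamp;
            this.tombstone = tombstone;
            this.localDeletionTime = localDeletionTime;
        }

        boolean hasIrrelevantData(int gcBefore) {
            return tombstone && localDeletionTime < gcBefore;
        }
    }

    /** Mirrors the order of checks in the proposed method, including the early return. */
    static boolean hasIrrelevantData(Deletion deletion, List<Column> columns, int gcBefore) {
        if (deletion.isLive())
            return false;

        // Is the container-level deletion itself gcable?
        if (deletion.isGcable(gcBefore))
            return true;

        // Is any column shadowed by the container, or a gcable tombstone?
        for (Column column : columns)
            if (deletion.isDeleted(column) || column.hasIrrelevantData(gcBefore))
                return true;

        return false;
    }

    public static void main(String[] args) {
        Deletion recent = new Deletion(10L, 300);  // not gcable when gcBefore = 200
        List<Column> newer = Arrays.asList(new Column(20L, false, Integer.MAX_VALUE));
        List<Column> shadowed = Arrays.asList(new Column(5L, false, Integer.MAX_VALUE));

        System.out.println(hasIrrelevantData(recent, newer, 200));     // false
        System.out.println(hasIrrelevantData(recent, shadowed, 200));  // true
        System.out.println(hasIrrelevantData(Deletion.LIVE, Collections.<Column>emptyList(), 200)); // false
    }
}
```

In this sketch, as in the snippet it mirrors, a live container returns false before any column is inspected.
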
> RemoveDeleted dominates compaction time for large sstable counts
> ----------------------------------------------------------------
>                 Key: CASSANDRA-3855
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.0
>            Reporter: Stu Hood
>            Assignee: Yuki Morishita
>              Labels: compaction, deletes, leveled
>         Attachments: with-cleaning-java.hprof.txt
> With very large numbers of sstables (2000+ generated by a `bin/stress -n 100,000,000`
> run with LeveledCompactionStrategy), PrecompactedRow.removeDeletedAndOldShards dominates compaction
> runtime, such that commenting it out takes compaction throughput from 200KB/s to 12MB/s.
> Stack attached.

This message is automatically generated by JIRA.