incubator-cassandra-user mailing list archives

From Yuki Morishita <mor.y...@gmail.com>
Subject Re: Automatic tombstone removal issue (STCS)
Date Wed, 07 May 2014 00:32:17 GMT
Hi Paulo,

We check for overlap so that we do not resurrect deleted data by dropping
a tombstone marker from only a single SSTable.
We also don't want to check row by row to determine whether an SSTable is
droppable, since that takes time, so we use token ranges to determine
whether it MAY have droppable columns.
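
A minimal sketch of the token-range shortcut described above, in Java. This is not Cassandra's actual code: the class, the `SSTableInfo` type, the `long` tokens, and the `worthCompactingAlone` method are hypothetical names made up for illustration, and the logic is simplified so that any overlap at all rules out the single-SSTable compaction. It only shows why SSTables that each span nearly the whole node token range almost never pass a check of this kind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model for illustration; not Cassandra's real classes.
final class TombstoneOverlapSketch {

    /** Stand-in for an SSTable: its token bounds and droppable tombstone ratio. */
    static final class SSTableInfo {
        final String name;
        final long firstToken;       // smallest token stored in the SSTable
        final long lastToken;        // largest token stored in the SSTable
        final double droppableRatio; // estimated fraction of droppable tombstones

        SSTableInfo(String name, long firstToken, long lastToken, double droppableRatio) {
            this.name = name;
            this.firstToken = firstToken;
            this.lastToken = lastToken;
            this.droppableRatio = droppableRatio;
        }
    }

    /** Two closed token ranges overlap if each one starts before the other ends. */
    static boolean rangesOverlap(SSTableInfo a, SSTableInfo b) {
        return a.firstToken <= b.lastToken && b.firstToken <= a.lastToken;
    }

    /** Every other SSTable whose token range overlaps the candidate's range. */
    static List<SSTableInfo> overlapping(SSTableInfo candidate, List<SSTableInfo> all) {
        List<SSTableInfo> result = new ArrayList<>();
        for (SSTableInfo other : all) {
            if (other != candidate && rangesOverlap(candidate, other)) {
                result.add(other);
            }
        }
        return result;
    }

    /**
     * Cheap range-level check, no row-by-row scan. Simplification: any overlap
     * means another SSTable MAY still hold data shadowed by the candidate's
     * tombstones, so the candidate is not compacted on its own.
     */
    static boolean worthCompactingAlone(SSTableInfo candidate, List<SSTableInfo> all,
                                        double threshold) {
        if (candidate.droppableRatio <= threshold) {
            return false; // not enough droppable tombstones to bother
        }
        return overlapping(candidate, all).isEmpty();
    }

    public static void main(String[] args) {
        SSTableInfo big = new SSTableInfo("big", 0L, 1000L, 0.7);
        SSTableInfo small = new SSTableInfo("small", 10L, 990L, 0.1);
        SSTableInfo disjoint = new SSTableInfo("disjoint", 2000L, 3000L, 0.1);

        // Typical STCS case: both SSTables span most of the node's range.
        System.out.println(worthCompactingAlone(big, Arrays.asList(big, small), 0.2));    // false
        // Only when no other SSTable shares the range does the check pass.
        System.out.println(worthCompactingAlone(big, Arrays.asList(big, disjoint), 0.2)); // true
    }
}
```

The threshold of 0.2 and the token values are illustrative only. The check is deliberately conservative: a range overlap does not prove that shadowed data exists elsewhere, only that it may, which is the trade-off against scanning rows.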

On Tue, May 6, 2014 at 7:14 PM, Paulo Ricardo Motta Gomes
<paulo.motta@chaordicsystems.com> wrote:
> Hello,
>
> Sorry for being persistent, but I'd love to clear my understanding on this.
> Has anyone seen single sstable compaction being triggered for STCS sstables
> with high tombstone ratio?
>
> Because if the above understanding is correct, the current implementation
> almost never triggers this kind of compaction, since the token ranges of a
> node's sstables almost always overlap. Could this be a bug or is it expected
> behavior?
>
> Thank you,
>
>
>
> On Mon, May 5, 2014 at 8:59 AM, Paulo Ricardo Motta Gomes
> <paulo.motta@chaordicsystems.com> wrote:
>>
>> Hello,
>>
>> After noticing that automatic tombstone removal (CASSANDRA-3442) was not
>> working in an append-only STCS CF with a 40% droppable tombstone ratio, I
>> investigated why compaction was not being triggered on the largest
>> SSTable, which is 16GB with about a 70% droppable tombstone ratio.
>>
>> When the code checks whether the SSTable is a candidate for compaction
>> (AbstractCompactionStrategy.worthDroppingTombstones), it verifies whether
>> the other SSTables overlap with the current SSTable by comparing their start
>> and end tokens. The problem is that every SSTable covers pretty much the
>> whole node token range, so they overlap nearly all the time and the
>> automatic tombstone removal never happens. Is there any case in STCS
>> where the SSTables' token ranges DO NOT overlap?
>>
>> I understand during the tombstone removal process it's necessary to verify
>> if the compacted row exists in any other SSTable, but I don't understand why
>> it's necessary to verify if the token ranges overlap to decide if a
>> tombstone compaction must be executed on a single SSTable with high
>> droppable tombstone ratio.
>>
>> Any clarification would be kindly appreciated.
>>
>> PS: Cassandra version: 1.2.16
>>
>> --
>> Paulo Motta
>>
>> Chaordic | Platform
>> www.chaordic.com.br
>> +55 48 3232.3200
>
>
>
>
> --
> Paulo Motta
>
> Chaordic | Platform
> www.chaordic.com.br
> +55 48 3232.3200



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)
