incubator-cassandra-user mailing list archives

From Yuki Morishita <mor.y...@gmail.com>
Subject Re: Cassandra 1.2 TTL histogram problem
Date Tue, 21 May 2013 21:13:50 GMT
> Why does Cassandra single table compaction skip the keys that are in the other sstables?

Because we don't want to resurrect deleted columns. Say sstable A has
a column with timestamp 1, and sstable B has a tombstone for the same
column, deleted at timestamp 2. If we purged that tombstone from sstable
B alone, we would see the column with timestamp 1 from sstable A again.
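
To make the resurrection case concrete, here is a minimal toy sketch in Java. It is not Cassandra code: the Cell class, read() and purgeTombstones() are hypothetical names invented for illustration, and the read path is reduced to "highest timestamp wins".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Toy model only; none of these names exist in Cassandra.
final class TombstoneResurrectionSketch {
    // One version of a column. A null value marks a tombstone (a delete marker).
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    // Simplified read path: across all sstables, the cell with the highest timestamp wins.
    static Optional<String> read(List<List<Cell>> sstables) {
        Cell newest = null;
        for (List<Cell> sstable : sstables)
            for (Cell cell : sstable)
                if (newest == null || cell.timestamp > newest.timestamp)
                    newest = cell;
        if (newest == null || newest.isTombstone())
            return Optional.empty();
        return Optional.of(newest.value);
    }

    // Naive single-sstable compaction that drops tombstones without looking at other sstables.
    static List<Cell> purgeTombstones(List<Cell> sstable) {
        List<Cell> kept = new ArrayList<>();
        for (Cell cell : sstable)
            if (!cell.isTombstone())
                kept.add(cell);
        return kept;
    }

    public static void main(String[] args) {
        List<Cell> a = Collections.singletonList(new Cell("v1", 1)); // live column, timestamp 1
        List<Cell> b = Collections.singletonList(new Cell(null, 2)); // tombstone, timestamp 2

        // With the tombstone in place, the delete shadows the older value.
        System.out.println(read(Arrays.asList(a, b)));
        // After purging the tombstone from B alone, the old value from A is visible again.
        System.out.println(read(Arrays.asList(a, purgeTombstones(b))));
    }
}
```

The first read returns an empty result and the second returns "v1", which is exactly the resurrection that skipping overlapping keys is meant to prevent.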

> I also don't understand why we have this line in worthDroppingTombstones method

The method tries to "guess" how many columns are in the rows that don't
overlap with other sstables, without actually going through every row in
the sstable. We keep statistics such as the column count histogram and
the min and max row tokens for every sstable, and the method uses those
to estimate how many columns fall in the overlap between sstables.
You may get a remainingColumnsRatio of 0 when the two sstables overlap
almost entirely.
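
As a rough illustration of the estimate, here is a sketch in Java under stated assumptions. Only the ratio expression mirrors the line quoted below; the helper names, the way the non-overlapping row count is obtained, the final threshold comparison and the 0.2 threshold value are all assumptions made for this example, not the actual method.

```java
// Illustrative sketch only; parameter and method names are hypothetical.
final class RemainingColumnsSketch {
    // Estimated fraction of the sstable's columns that sit in rows no other sstable overlaps.
    // rowCount and meanColumnsPerRow stand in for the column count histogram's count() and mean().
    static double remainingColumnsRatio(long rowCount, long meanColumnsPerRow, long overlappingRows) {
        long nonOverlappingRows = Math.max(0, rowCount - overlappingRows);
        long columns = meanColumnsPerRow * nonOverlappingRows;
        long totalColumns = rowCount * meanColumnsPerRow;
        if (totalColumns == 0)
            return 0.0; // empty sstable: nothing left to compact
        return ((double) columns) / totalColumns;
    }

    // Assumed decision rule: compact only if enough droppable tombstones are
    // expected outside the overlap.
    static boolean worthDropping(double remainingColumnsRatio, double droppableRatio, double threshold) {
        return remainingColumnsRatio * droppableRatio > threshold;
    }

    public static void main(String[] args) {
        double droppableRatio = 0.9; // value reported in the question
        double threshold = 0.2;      // illustrative threshold, an assumption

        // Full overlap: every row may also exist in another sstable.
        double full = remainingColumnsRatio(1000, 10, 1000);
        System.out.println(full + " " + worthDropping(full, droppableRatio, threshold));

        // No overlap: all columns are safe to consider.
        double none = remainingColumnsRatio(1000, 10, 0);
        System.out.println(none + " " + worthDropping(none, droppableRatio, threshold));
    }
}
```

With full overlap the ratio is 0, so under this assumed rule the product with droppableRatio is also 0 and the sstable is skipped no matter how many tombstones it holds, which matches the behaviour described in the question.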


On Tue, May 21, 2013 at 3:43 PM, cem <cayiroglu@gmail.com> wrote:
> Hi all,
>
> I have a question about ticket
> https://issues.apache.org/jira/browse/CASSANDRA-3442
>
> Why does Cassandra single table compaction skip the keys that are in the
> other sstables? Please correct me if I am wrong.
>
> I also don't understand why we have this line in worthDroppingTombstones
> method:
>
> double remainingColumnsRatio = ((double) columns) /
> (sstable.getEstimatedColumnCount().count() *
> sstable.getEstimatedColumnCount().mean());
>
> remainingColumnsRatio is always 0 in my case and the droppableRatio is
> 0.9. Cassandra skips all sstables which are already expired.
>
> This line was introduced by
> https://issues.apache.org/jira/browse/CASSANDRA-4022.
>
> Best Regards,
> Cem



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)
