> Can the method count non-overlapping keys as overlapping?
Yes.
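Randomized keys don't matter here, since sstables are sorted by the
"token" your partitioner calculates from each key, and the method uses
only each sstable's min/max token to estimate overlap. Two sstables whose
token ranges intersect are treated as overlapping even if they share no
keys.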
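
To make that concrete, here is a minimal sketch of the idea. It is not
Cassandra's actual code: the class and method names are made up for
illustration, and tokens are assumed to be plain longs. It only shows how a
check based on min/max tokens can report overlap for two sstables that have
no token in common.

```java
import java.util.Arrays;

public class TokenRangeOverlap {
    // Range-based check, in the spirit of a min/max token estimate
    // (hypothetical helper): true when the two [min, max] intervals intersect.
    public static boolean rangesOverlap(long minA, long maxA, long minB, long maxB) {
        return minA <= maxB && minB <= maxA;
    }

    // Exact check, for comparison only: true if some token is in both sstables.
    public static boolean shareAnyToken(long[] tokensA, long[] tokensB) {
        return Arrays.stream(tokensA)
                     .anyMatch(t -> Arrays.stream(tokensB).anyMatch(u -> u == t));
    }

    public static void main(String[] args) {
        long[] a = {10, 400, 900};   // sstable A: min token 10, max token 900
        long[] b = {20, 500, 950};   // sstable B: min token 20, max token 950
        System.out.println(rangesOverlap(10, 900, 20, 950)); // true
        System.out.println(shareAnyToken(a, b));             // false
    }
}
```

With random keys, almost every sstable spans most of the token ring, so the
range check will nearly always say "overlapping".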
On Tue, May 21, 2013 at 4:43 PM, cem <cayiroglu@gmail.com> wrote:
> Thank you very much for the swift answer.
>
> I have one more question about the second part. Can the method count
> non-overlapping keys as overlapping? I mean, it uses the max and min tokens
> and the column count. They can be very close to each other if random keys
> are used.
>
> In my use case I generate a GUID for each key and send a single write
> request.
>
> Cem
>
> On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita <mor.yuki@gmail.com> wrote:
>>
>> > Why does Cassandra single table compaction skip the keys that are in
>> > the other sstables?
>>
>> Because we don't want to resurrect deleted columns. Say sstable A has
>> a column with timestamp 1, and sstable B has the same column, deleted
>> at timestamp 2. If we purged that column's tombstone only from sstable
>> B, we would see the column with timestamp 1 again.
>>
>> > I also don't understand why we have this line in worthDroppingTombstones
>> > method
>>
>> What the method is trying to do is "guess" how many columns are in
>> rows that don't overlap with other sstables, without actually going
>> through every row in the sstable. We have statistics like the column
>> count histogram and the min and max row token for every sstable, and we
>> use those in the method to estimate how many columns of the two
>> sstables overlap.
>> You may have remainingColumnsRatio of 0 when the two sstables overlap
>> almost entirely.
>>
>>
>> On Tue, May 21, 2013 at 3:43 PM, cem <cayiroglu@gmail.com> wrote:
>> > Hi all,
>> >
>> > I have a question about ticket
>> > https://issues.apache.org/jira/browse/CASSANDRA-3442
>> >
>> > Why does Cassandra single table compaction skip the keys that are in
>> > the other sstables? Please correct me if I am wrong.
>> >
>> > I also don't understand why we have this line in worthDroppingTombstones
>> > method:
>> >
>> > double remainingColumnsRatio = ((double) columns) /
>> > (sstable.getEstimatedColumnCount().count() *
>> > sstable.getEstimatedColumnCount().mean());
>> >
>> > remainingColumnsRatio is always 0 in my case and the droppableRatio is
>> > 0.9. Cassandra skips all sstables which are already expired.
>> >
>> > This line was introduced by
>> > https://issues.apache.org/jira/browse/CASSANDRA-4022.
>> >
>> > Best Regards,
>> > Cem
>>
>>
>>
>> 
>> Yuki Morishita
>> t:yukim (http://twitter.com/yukim)
>
>

Yuki Morishita
t:yukim (http://twitter.com/yukim)
