incubator-cassandra-user mailing list archives

From Yuki Morishita <mor.y...@gmail.com>
Subject Re: Cassandra 1.2 TTL histogram problem
Date Wed, 22 May 2013 15:58:33 GMT
> Can the method count non-overlapping keys as overlapping?

Yes.
Randomized keys don't matter here, because sstables are sorted by the
"token" that your partitioner calculates from the key, and the method
uses each sstable's min/max token to estimate overlap.
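
To make the point concrete, here is a rough sketch of an endpoint-only
overlap estimate. It is a simplified model and not Cassandra's actual code:
the class name, the method names, the even/odd token trick and the
"remaining = 1 - overlap" shortcut are all assumptions made for
illustration.

```java
import java.util.Random;

// Illustrative sketch only: a simplified model, not Cassandra's actual code.
class TokenRangeOverlapSketch {

    // Fraction of the token range [minA, maxA] that is also covered by
    // [minB, maxB]. Only the range endpoints are used, never the keys.
    static double overlapFraction(long minA, long maxA, long minB, long maxB) {
        if (maxA <= minA) {
            return 0.0; // empty or single-token range: nothing to measure
        }
        long lo = Math.max(minA, minB);
        long hi = Math.min(maxA, maxB);
        if (hi <= lo) {
            return 0.0; // ranges are disjoint
        }
        // Convert to double before subtracting to avoid long overflow.
        return ((double) hi - (double) lo) / ((double) maxA - (double) minA);
    }

    // Builds two hypothetical "sstables" of n random tokens each. A holds
    // only even tokens and B only odd ones, so they never share a token.
    static double simulate(long seed, int n) {
        Random random = new Random(seed);
        long minA = Long.MAX_VALUE, maxA = Long.MIN_VALUE;
        long minB = Long.MAX_VALUE, maxB = Long.MIN_VALUE;
        for (int i = 0; i < n; i++) {
            long a = 2L * random.nextInt(1_000_000_000);
            long b = 2L * random.nextInt(1_000_000_000) + 1;
            minA = Math.min(minA, a);
            maxA = Math.max(maxA, a);
            minB = Math.min(minB, b);
            maxB = Math.max(maxB, b);
        }
        return overlapFraction(minA, maxA, minB, maxB);
    }

    public static void main(String[] args) {
        double overlap = simulate(42L, 1000);
        System.out.println("estimated overlap:   " + overlap);
        // Assumed simplification: whatever is not overlapping "remains".
        System.out.println("estimated remaining: " + (1.0 - overlap));
    }
}
```

With tokens spread at random, both ranges cover almost the whole token
space, so an estimate based only on min/max treats the two sstables as
overlapping almost entirely even though they share no key. That would be
consistent with seeing a remaining ratio close to 0.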

On Tue, May 21, 2013 at 4:43 PM, cem <cayiroglu@gmail.com> wrote:
> Thank you very much for the swift answer.
>
> I have one more question about the second part. Can the method count
> non-overlapping keys as overlapping? I mean, it uses the max and min tokens
> and the column count. Those can be very close to each other if random keys
> are used.
>
> In my use case I generate a GUID for each key and send a single write
> request.
>
> Cem
>
> On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita <mor.yuki@gmail.com> wrote:
>>
>> > Why does Cassandra single-sstable compaction skip the keys that are in
>> > the other sstables?
>>
>> Because we don't want to resurrect deleted columns. Say sstable A has
>> a column with timestamp 1, and sstable B has the same column, which was
>> deleted at timestamp 2. If we purged that deleted column only from
>> sstable B, we would see the column with timestamp 1 again.
>>
>> > I also don't understand why we have this line in the
>> > worthDroppingTombstones method
>>
>> What the method is trying to do is "guess" how many columns are in
>> the rows that don't overlap with other sstables, without actually going
>> through every row in the sstable. We have statistics such as the column
>> count histogram and the min and max row token for every sstable, and the
>> method uses those to estimate how much the sstables overlap.
>> You may get a remainingColumnsRatio of 0 when the sstables overlap
>> almost entirely.
>>
>>
>> On Tue, May 21, 2013 at 3:43 PM, cem <cayiroglu@gmail.com> wrote:
>> > Hi all,
>> >
>> > I have a question about ticket
>> > https://issues.apache.org/jira/browse/CASSANDRA-3442
>> >
>> > Why does Cassandra single-sstable compaction skip the keys that are in
>> > the other sstables? Please correct me if I am wrong.
>> >
>> > I also don't understand why we have this line in the
>> > worthDroppingTombstones method:
>> >
>> > double remainingColumnsRatio = ((double) columns) /
>> > (sstable.getEstimatedColumnCount().count() *
>> > sstable.getEstimatedColumnCount().mean());
>> >
>> > remainingColumnsRatio is always 0 in my case and the droppableRatio is
>> > 0.9. Cassandra skips all sstables which are already expired.
>> >
>> > This line was introduced by
>> > https://issues.apache.org/jira/browse/CASSANDRA-4022.
>> >
>> > Best Regards,
>> > Cem
>>
>>
>>
>> --
>> Yuki Morishita
>>  t:yukim (http://twitter.com/yukim)
>
>



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)
