cassandra-user mailing list archives

From Hannu Kröger <hkro...@gmail.com>
Subject TWCS on partitions spanning multiple time windows
Date Thu, 14 Dec 2017 16:37:46 GMT
Hi,

I have been reading a bit about TWCS to understand how it functions.

Current assumption: TWCS uses the same tombstone checks as any other
compaction strategy, so it doesn’t remove tombstones unless it is safe
to do so.

Scenario 1:

So let’s assume that I have a table like this:

CREATE TABLE twcs.twcs (
    user_id int,
    id int,
    value int,
    text_value text,
    PRIMARY KEY (user_id, id, value)
)

I insert data for multiple users, and multiple events per user, with a TTL
of 10 days, and the time window is set to 1 day.
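For reference, the setup I have in mind for Scenario 1 would look roughly like the sketch below. Setting the TTL as a table default is just an assumption for illustration (it could equally be set per write with USING TTL), and the inserted values are made up.

```cql
-- Scenario 1 sketch: 1-day TWCS windows, 10-day TTL (864000 s = 10 * 86400 s).
ALTER TABLE twcs.twcs
WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
}
AND default_time_to_live = 864000;

-- Hypothetical write: user 1 keeps receiving events day after day,
-- so the partition user_id = 1 ends up in sstables of many windows.
INSERT INTO twcs.twcs (user_id, id, value, text_value)
VALUES (1, 1, 42, 'event') USING TTL 864000;
```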

Basically, when the 10 days are up, the first sstable contains only TTL'd
data. However, TWCS cannot simply drop that sstable, because the same
partitions also exist in sstables of other windows. Dropping it wouldn’t be
safe, right?

Scenario 2:

The table is as follows:

CREATE TABLE twcs.twcs2 (
    user_id int,
    day int,
    id int,
    value int,
    text_value text,
    PRIMARY KEY ((user_id, day), id, value)
)

I insert data with a TTL of 10 days and have the time window set to 2 days.
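Again as a rough sketch of what I mean for Scenario 2: the encoding of the day column as "days since the Unix epoch" is my own assumption here, and the inserted values are made up.

```cql
-- Scenario 2 sketch: 2-day TWCS windows, 10-day TTL.
ALTER TABLE twcs.twcs2
WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '2'
}
AND default_time_to_live = 864000;

-- Hypothetical write: day = 17514 stands for 2017-12-14 (days since epoch),
-- so each (user_id, day) partition only receives writes during one day.
INSERT INTO twcs.twcs2 (user_id, day, id, value, text_value)
VALUES (1, 17514, 1, 42, 'event') USING TTL 864000;
```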

Basically, when the 10 days are up, the first sstable contains only TTL'd
data. In this case TWCS can drop the whole sstable, because each partition
is entirely within one time window and one sstable. Correct?

If for some reason the time window and the partition time buckets are
misaligned, e.g. the time window is 25 hours and the time bucket is 24
hours, then we end up in a situation where we never actually get rid of all
tombstones, because data for the same partition is spread across multiple
time windows which won’t be compacted together. So we would be in trouble,
right?
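To make the misaligned case concrete, here is a sketch of the configuration I am worried about, with the arithmetic in comments. It assumes that both the 24-hour buckets and the 25-hour windows are counted from the same origin (hour 0), which is a simplification on my part.

```cql
-- Misaligned sketch: 25-hour TWCS windows against 24-hour partition buckets.
ALTER TABLE twcs.twcs2
WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'HOURS',
    'compaction_window_size': '25'
};

-- Assuming a shared origin at hour 0:
--   window 0 covers hours [0, 25),  window 1 covers hours [25, 50)
--   bucket 0 covers hours [0, 24),  bucket 1 covers hours [24, 48)
-- Bucket 1 starts in window 0 (hour 24) and continues into window 1
-- (hours 25..47), so its partitions are written into two windows.
```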

Did I get it right?

Hannu
