cassandra-user mailing list archives

From horschi <>
Subject Re: Possible optimization: avoid creating tombstones for TTLed columns if updates to TTLs are disallowed
Date Tue, 28 Jan 2014 17:04:28 GMT
Hi Donald,

I reported the ticket you mentioned, so I kind of feel like I should
answer this :-)

> I presume the point is that GCable tombstones can still do work
> (preventing spurious writing from nodes that were down) but only until the
> data is flushed to disk.
I am not sure I understand this correctly. Could you rephrase that sentence?

> If the effective TTL exceeds gc_grace_seconds then the tombstone will be
> deleted anyway.
It's not even written (since CASSANDRA-4917). There is no delete on the
tombstone in that case.

>  It occurred to me that if you never update the TTL of a column, then
> there should be no need for tombstones at all:  any replicas will have the
> same TTL.  So there'd be no risk of missed deletes.  You wouldn't even need
> GCable tombstones
I think so too. There should be no need for a tombstone at all if the
following conditions hold:
- the column was not deleted manually, but timed out by itself
- the column was not updated within the last gc_grace period
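
As a rough sketch only, the two conditions could be written as a single
check. The Column class, its fields and can_skip_tombstone are hypothetical
names made up for illustration; this is not Cassandra's actual code:

```python
from dataclasses import dataclass


@dataclass
class Column:
    # Hypothetical model for illustration, not Cassandra's column class.
    write_time: int                 # seconds since epoch of the last write
    ttl: int                        # time to live, in seconds
    deleted_manually: bool = False  # True if an explicit delete was issued


def can_skip_tombstone(col: Column, now: int, gc_grace: int) -> bool:
    """Return True if, under the two conditions above, no tombstone is needed."""
    # The column must have expired through its TTL, not through a delete.
    timed_out = now >= col.write_time + col.ttl and not col.deleted_manually
    # The last write must be at least gc_grace seconds in the past.
    not_recently_updated = now - col.write_time >= gc_grace
    return timed_out and not_recently_updated
```

For example, with gc_grace = 500 and now = 1000, a column written at 0 with
ttl = 100 passes the check, while one written at 800 does not.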

If I am not mistaken, the second point would even be necessary for
CASSANDRA-4917 to be able to handle changing TTLs correctly: I think the
current implementation might break if a column gets updated with a smaller
TTL, or to be more precise, when (old.creationdate + old.ttl) <
(new.creationdate + new.ttl) && new.ttl < gc_grace

Imho, for any further tombstone-optimization to work, compaction would have
to be smarter:
 I think it should be able to track max(old.creationdate + old.ttl ,
new.creationdate + new.ttl) when merging columns. I have no idea if that is
possible though.
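
To illustrate what I mean by tracking the maximum, here is a minimal sketch.
The Cell class, its max_expiry field and merge() are hypothetical and only
model the idea; they do not reflect how compaction is actually implemented:

```python
from dataclasses import dataclass


@dataclass
class Cell:
    # Hypothetical model of one column version seen during a merge.
    creation: int        # creation timestamp, in seconds
    ttl: int             # time to live, in seconds
    max_expiry: int = 0  # assumed extra field: latest expiry of any merged version

    def __post_init__(self) -> None:
        # A cell's own expiry is always a lower bound for max_expiry.
        self.max_expiry = max(self.max_expiry, self.creation + self.ttl)


def merge(old: Cell, new: Cell) -> Cell:
    """Keep the newest version, but remember the latest expiry seen so far."""
    winner = new if new.creation >= old.creation else old
    return Cell(winner.creation, winner.ttl,
                max(old.max_expiry, new.max_expiry))
```

Merging Cell(0, 1000) with Cell(100, 50) keeps the newer version (ttl 50),
but max_expiry stays at 1000, so a later step could still tell that an older
version may be live elsewhere until then.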

> So, if - and it's a big if - a table disallowed updates to TTL, then you
> could really optimize deletion of TTLed columns: you could do away with
> tombstones entirely.   If a table allows updates to TTL then it's possible
> a different node will have the row without the TTL and the tombstone would
> be needed.
I am not sure I understand this. My "thrift" understanding of Cassandra is
that you cannot update the TTL on its own; you can only overwrite an entire
column. Also, each column has its own TTL. There is no TTL on the row.

