cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush
Date Wed, 12 Sep 2012 08:50:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453838#comment-13453838 ]

Sylvain Lebresne commented on CASSANDRA-4565:
---------------------------------------------

bq. Do you think expired ttl columns should be replaced with tombstones at memtable flush?

No, and I'm fairly sure it would be a bad idea. Currently the code iterates over a row twice
to flush it: first it computes the row's serialized size (which is written at the beginning
of the row), then it actually writes the row. We should *not* transform expired columns into
tombstones during the second iteration, because that would invalidate the serialized size
computed in the first. The first iteration is ill suited too, because doing that transformation
in the serializedSize() method would be a big hack. So we would need a third iteration just for
that purpose, and given that having expired columns at flush time is a corner case, it would
cost more than it would gain us.
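
To illustrate the size problem, here is a minimal Python sketch of a two-pass, size-prefixed
row write. It is not Cassandra's code: the byte layout, `Column`, `to_tombstone`, `flush_row`
and `header_matches_body` are hypothetical names and simplifications chosen for the example.
It only shows that converting columns during the second pass makes the size written in the
header disagree with the bytes that follow.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: bytes
    value: bytes
    expires_at: Optional[int] = None  # absolute expiry time in seconds; None means no TTL


def encode_column(col: Column) -> bytes:
    # Hypothetical layout: 2-byte name length, name, 4-byte value length, value.
    return (struct.pack(">H", len(col.name)) + col.name
            + struct.pack(">I", len(col.value)) + col.value)


def to_tombstone(col: Column) -> Column:
    # Simplified tombstone: same name, empty value.
    return Column(col.name, b"")


def serialized_size(row) -> int:
    # Pass 1: compute the size of the row as it currently stands.
    return sum(len(encode_column(c)) for c in row)


def flush_row(row, now: int, convert_expired: bool = False) -> bytes:
    out = bytearray(struct.pack(">Q", serialized_size(row)))  # size header first
    for col in row:  # Pass 2: write the columns
        expired = col.expires_at is not None and col.expires_at <= now
        if convert_expired and expired:
            col = to_tombstone(col)  # changes the encoded length after the fact
        out += encode_column(col)
    return bytes(out)


def header_matches_body(blob: bytes) -> bool:
    (declared,) = struct.unpack(">Q", blob[:8])
    return declared == len(blob) - 8
```

With one live and one expired column, `header_matches_body(flush_row(row, now))` holds, while
the same call with `convert_expired=True` produces a header that no longer matches the body.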

If we remove the row serialized size (and column count) from the sstable format (which we may
do at some point), we can revisit this, as it will then be trivial.
                
> TTL columns with older then gcgrace do not need to flush
> --------------------------------------------------------
>
>                 Key: CASSANDRA-4565
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Edward Capriolo
>            Assignee: Aleksey Yeschenko
>             Fix For: 1.3
>
>         Attachments: cassandra-4565.patch.1.txt
>
>
> With memcache, many people are willing to sacrifice durability for performance. Cassandra
> has a TimeToLive (TTL) feature that can be used in caching scenarios with low values for
> gc_grace_seconds. However, from a code dive it seems that Cassandra will always write TTL
> columns to disk, even those that expired more than gc_grace_seconds ago. If a use case has
> very large memtables, a small TTL, and a small gc_grace, flushing these columns to disk
> could be skipped entirely in some scenarios.
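
The check proposed in the description could look roughly like the Python sketch below. This
is an illustration of the idea only, not Cassandra's implementation or the attached patch:
`Column`, `needs_flush` and `columns_to_flush` are hypothetical names, and the sketch assumes
a column may be dropped purely on the basis of its own expiry time, ignoring whether it
shadows older data already on disk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Column:
    name: str
    value: bytes
    expires_at: Optional[int] = None  # absolute expiry time in seconds; None means no TTL


def needs_flush(col: Column, now: int, gc_grace_seconds: int) -> bool:
    # Columns without a TTL are always written.
    if col.expires_at is None:
        return True
    # A TTL column is written unless it expired more than gc_grace_seconds ago.
    return now < col.expires_at + gc_grace_seconds


def columns_to_flush(memtable: List[Column], now: int, gc_grace_seconds: int) -> List[Column]:
    # Filter the memtable contents before they are written to disk.
    return [c for c in memtable if needs_flush(c, now, gc_grace_seconds)]
```

For example, with `now=1000` and `gc_grace_seconds=60`, a column that expired at 990 is still
written, while one that expired at 900 is skipped.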

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
