cassandra-user mailing list archives

From Carlos Rolo <r...@pythian.com>
Subject Re: Global TTL vs Insert TTL
Date Wed, 01 Feb 2017 20:46:43 GMT
Awesome to know this!

Thanks Jon and DuyHai!

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com

On Wed, Feb 1, 2017 at 6:57 PM, Jonathan Haddad <jon@jonhaddad.com> wrote:

> The optimization is there.  The entire sstable can be dropped, but not
> because of the default TTL.  The default TTL only applies if a TTL isn't
> specified explicitly.  The default TTL can't be used to drop an sstable
> automatically since it can be overridden at insert time.  Check out this
> example.  The first insert uses the default TTL.  The second insert
> overrides the default.  Using the default TTL to drop the sstable would be
> pretty terrible in this case:
>
> CREATE TABLE test.b (
>     k int PRIMARY KEY,
>     v int
> ) WITH default_time_to_live = 10000;
>
> cqlsh:test> insert into b (k, v) values (1, 1);
> cqlsh:test> select k, v, TTL(v) from b  where k = 1;
>
>  k | v | ttl(v)
> ---+---+--------
>  1 | 1 |   9943
>
> (1 rows)
>
> cqlsh:test> insert into b (k, v) values (2, 1) USING TTL 99999999;
> cqlsh:test> select k, v, TTL(v) from b  where k = 2;
>
>  k | v | ttl(v)
> ---+---+----------
>  2 | 1 | 99999995
>
> (1 rows)
>
> TL;DR: The default TTL is there as a convenience so you don't have to keep
> the TTL in your code.  From a performance perspective it does not matter.
>
> Jon
>
>
> On Wed, Feb 1, 2017 at 10:39 AM DuyHai Doan <doanduyhai@gmail.com> wrote:
>
>> I was referring to this JIRA,
>> https://issues.apache.org/jira/browse/CASSANDRA-3974, when talking about
>> dropping an entire SSTable at compaction time.
>>
>> But the JIRA is pretty old and it is very possible that the optimization
>> is no longer there
>>
>>
>>
>> On Wed, Feb 1, 2017 at 6:53 PM, Jonathan Haddad <jon@jonhaddad.com>
>> wrote:
>>
>> This is incorrect: no optimization references the table-level TTL
>> setting.  The max local deletion time is stored in the sstable
>> metadata.  See org.apache.cassandra.io.sstable.metadata.StatsMetadata#maxLocalDeletionTime
>> in the Cassandra 3.0 branch.  The default TTL is stored
>> here: org.apache.cassandra.schema.TableParams#defaultTimeToLive and is
>> never referenced during compaction.
>>
>> Here's an example from a table I created without a default TTL; you can
>> use the sstablemetadata tool to see it:
>>
>> jhaddad@rustyrazorblade ~/dev/cassandra/data/data/test$
>> ../../../tools/bin/sstablemetadata a-7bca6b50e8a511e6869a5596edf4dd35/mc-1-big-Data.db
>> .....
>> SSTable max local deletion time: 1485980862
>>
>> On Wed, Feb 1, 2017 at 6:59 AM DuyHai Doan <doanduyhai@gmail.com> wrote:
>>
>> Global TTL is better than dynamic runtime TTL
>>
>> Why?
>>
>>  Because Global TTL is a table property and Cassandra can perform
>> optimizations when compacting.
>>
>> For example, if it can see that the maxTimestamp of an SSTable is older
>> than the table's Global TTL, the SSTable can be dropped entirely during
>> compaction.
>>
>> With a dynamic TTL at runtime, since Cassandra doesn't know and cannot
>> track each individual TTL value, the previous optimization is not possible
>> (even if you always use the SAME TTL for all queries, Cassandra is not
>> supposed to know that).
>>
>>
>>
>> On Wed, Feb 1, 2017 at 3:01 PM, Cogumelos Maravilha <
>> cogumelosmaravilha@sapo.pt> wrote:
>>
>> Thank you all, for your answers.
>>
>> On 02/01/2017 01:06 PM, Carlos Rolo wrote:
>>
>> To reinforce Alain's statement:
>>
>> "I would say that the unsafe part is more about using C* 3.9": this is
>> key. You would be better off on 3.0.x unless you need features from the
>> 3.x series.
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | LinkedIn: linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 918 918 100
>> www.pythian.com
>>
>> On Wed, Feb 1, 2017 at 8:32 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>> wrote:
>>
>> Is it safe to use TWCS in C* 3.9?
>>
>>
>> I would say that the unsafe part is more about using C* 3.9 than using
>> TWCS in C* 3.9 :-). I see no reason to say TWCS would be specifically unsafe
>> in C* 3.9, but I might be missing something.
>>
>> Going from STCS to TWCS is often smooth. Coming from LCS, you might expect
>> extra load from compacting a lot (all?) of the SSTables, from what we saw in
>> the field. In this case, be sure that your compaction options are safe enough
>> to handle this.
>>
>> TWCS is even easier to use on C* 3.0.8+ and C* 3.8+ since it ships with
>> Cassandra there (superseding DTCS), so no extra jar is needed; you can
>> enable TWCS like any other built-in compaction strategy.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2017-01-31 23:29 GMT+01:00 Cogumelos Maravilha <
>> cogumelosmaravilha@sapo.pt>:
>>
>> Hi Alain,
>>
>> Thanks for your response and the links.
>>
>> I've also checked "Time series data model and tombstones".
>>
>> Is it safe to use TWCS in C* 3.9?
>>
>> Thanks in advance.
>>
>> On 31-01-2017 11:27, Alain RODRIGUEZ wrote:
>>
>> Is there an overhead using line by line option or wasted disk space?
>>
>>  There is a very recent topic about that on the mailing list, look for "Time
>> series data model and tombstones". I believe DuyHai answers your question
>> there in more detail :).
>>
>> *tl;dr:*
>>
>> Yes, if you know the TTL in advance, and it is fixed, you might want to
>> go with the table option instead of adding the TTL in each insert. Also, you
>> might want to consider using the TWCS compaction strategy.
>>
>> Here are some blog posts my coworkers recently wrote about TWCS; they might
>> be useful:
>>
>> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
>> http://thelastpickle.com/blog/2017/01/10/twcs-part2.html
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>>
>>
>> 2017-01-31 10:43 GMT+01:00 Cogumelos Maravilha <
>> cogumelosmaravilha@sapo.pt>:
>>
>> Hi, I'm just wondering which option is fastest:
>>
>> Global: create table xxx (..... AND default_time_to_live = XXX;
>> and UPDATE xxx USING TTL XXX;
>>
>> Line by line:
>> INSERT INTO xxx (... USING TTL xxx;
>>
>> Is there an overhead using the line-by-line option, or wasted disk space?
>>
>> Thanks in advance.
>>

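For readers who want to see why the default TTL alone cannot decide whether an sstable is droppable, here is a rough toy model in Python of the behaviour Jon describes in the thread. It is not Cassandra code: the names Cell, effective_ttl, fully_expired and naive_default_ttl_check are made up for illustration, the write timestamp is arbitrary, and the model ignores gc_grace_seconds, overlapping sstables and the special meaning of TTL 0. It only assumes that the per-sstable statistic is the latest expiry time of any cell, as the "SSTable max local deletion time" line of sstablemetadata suggests.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One written value: when it was written and the TTL it ended up with."""
    write_time: int  # seconds since the epoch
    ttl: int         # effective TTL in seconds (positive in this toy model)

    @property
    def local_deletion_time(self) -> int:
        # Moment at which this cell expires.
        return self.write_time + self.ttl


def effective_ttl(default_ttl: int, explicit_ttl: Optional[int] = None) -> int:
    """The table default applies only when the insert carries no USING TTL."""
    return default_ttl if explicit_ttl is None else explicit_ttl


def max_local_deletion_time(cells: List[Cell]) -> int:
    """Per-sstable statistic: the latest expiry among all cells in the file."""
    return max(cell.local_deletion_time for cell in cells)


def fully_expired(cells: List[Cell], now: int) -> bool:
    """Safe check: every cell in the sstable has already expired."""
    return max_local_deletion_time(cells) < now


def naive_default_ttl_check(cells: List[Cell], default_ttl: int, now: int) -> bool:
    """Unsafe check: pretends every cell was written with the table default."""
    newest_write = max(cell.write_time for cell in cells)
    return newest_write + default_ttl < now


DEFAULT_TTL = 10000      # default_time_to_live from the example table
WRITE_TIME = 1485975000  # arbitrary illustrative timestamp

sstable = [
    Cell(WRITE_TIME, effective_ttl(DEFAULT_TTL)),            # k = 1, default TTL
    Cell(WRITE_TIME, effective_ttl(DEFAULT_TTL, 99999999)),  # k = 2, USING TTL
]

# Just after the default TTL has elapsed.
later = WRITE_TIME + DEFAULT_TTL + 1

if __name__ == "__main__":
    print("naive check says droppable:", naive_default_ttl_check(sstable, DEFAULT_TTL, later))
    print("max-deletion-time check says droppable:", fully_expired(sstable, later))
```

With the two inserts from the example, the naive check would discard the sstable while row k = 2 is still live, whereas the check based on the maximum deletion time keeps it until the longest TTL has passed.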