cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Some questions to updating and tombstone
Date Wed, 16 Nov 2016 15:26:11 GMT
Hi Boying,

> Old value is not a tombstone, but remains until compaction


Be careful: the above is generally true, but not always.

Tombstones can actually be generated by an UPDATE in some corner cases,
for example when using collections or prepared statements.
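
To make those corner cases concrete, here is a minimal sketch: a toy Python
model, not Cassandra's actual write path. It assumes two behaviours. First, an
UPDATE that sets a column to null (which is what a prepared statement does when
a bound value is null) writes a tombstone for that column. Second, an UPDATE
that replaces a whole non-frozen collection writes a tombstone covering the
previous contents before writing the new elements. The names `Write`,
`writes_for_update` and `count_tombstones` are hypothetical, made up for this
illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass(frozen=True)
class Write:
    """One thing a statement puts on disk in this toy model."""
    column: str
    kind: str          # "cell", "cell_tombstone" or "collection_tombstone"
    value: Any = None


def writes_for_update(assignments: Dict[str, Any],
                      collection_columns: Set[str]) -> List[Write]:
    """List what `UPDATE ... SET col = value` would write (toy model).

    `collection_columns` names the non-frozen collection columns (assumption:
    the caller knows the table schema).
    """
    writes: List[Write] = []
    for column, value in assignments.items():
        if value is None:
            # Setting a column to null is a delete of that column.
            writes.append(Write(column, "cell_tombstone"))
        elif column in collection_columns:
            # Replacing the whole collection first deletes its old contents.
            writes.append(Write(column, "collection_tombstone"))
            writes.append(Write(column, "cell", value))
        else:
            # Plain overwrite: a new cell with a newer timestamp, no tombstone.
            writes.append(Write(column, "cell", value))
    return writes


def count_tombstones(writes: List[Write]) -> int:
    """Count the tombstones among the writes produced by one statement."""
    return sum(1 for w in writes if w.kind.endswith("tombstone"))


plain = writes_for_update({"name": "bob"}, set())
null_bound = writes_for_update({"name": "bob", "email": None}, set())
overwrite = writes_for_update({"tags": {"a", "b"}}, {"tags"})
print(count_tombstones(plain), count_tombstones(null_bound),
      count_tombstones(overwrite))  # prints: 0 1 1
```

In this model only the plain overwrite is tombstone-free, which is the
"generally true" case quoted above.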

I wrote a detailed blog post about deletes and tombstones in Cassandra
precisely to avoid answering this kind of question again and again on the
mailing list, as explaining correctly is a bit hard and I am a lazy guy. I
also talked about it at the last Cassandra summit. If you are going to use
Cassandra (and deletes) I think one of these might be of interest to you:

http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
https://www.youtube.com/watch?v=lReTEcnzl7Y

If you still have questions after reading it, I would be very pleased to
help you further, but I believe this should be helpful.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2016-11-16 10:15 GMT+01:00 Shalom Sagges <shaloms@liveperson.com>:

> Hi Fabrice,
>
> Just a small (off-topic) question I couldn't find an answer to.
> What is a slice in Cassandra? (e.g. Maximum tombstones per slice)
>
> Thanks!
>
>
> Shalom Sagges
> DBA
>
> On Tue, Nov 15, 2016 at 6:38 PM, Fabrice Facorat <
> fabrice.facorat@gmail.com> wrote:
>
>> If you don't want tombstones, don't generate them ;)
>>
>> More seriously, tombstones are generated when:
>> - doing a DELETE
>> - TTL expiration
>> - setting a column to NULL
>>
>> However, tombstones are an issue only if you have many tombstones for the
>> same value (i.e. you keep overwriting the same values with data and
>> tombstones). Having 1 tombstone for 1 value is not an issue; having 1000
>> tombstones for 1 value is a problem. Does your use case really overwrite
>> data with DELETE or NULL?
>>
>> So what you may want to know is how many tombstones you have on average
>> when reading a value. This is available in:
>> - nodetool cfstats ks.cf : Average tombstones per slice/Maximum
>> tombstones per slice
>> - JMX : org.apache.cassandra.metrics:keyspace=<ks>,name=TombstoneScannedHistogram,scope=<cf>,type=ColumnFamily
>> Max/Count/99thPercentile/Mean
>>
>>
>> 2016-11-15 10:05 GMT+01:00 Lu, Boying <Boying.Lu@dell.com>:
>>
>>> Thanks a lot for your help.
>>>
>>>
>>>
>>> We are using the STCS strategy and are not using TTL.
>>>
>>>
>>>
>>> Is there any API that we can use to query the current number of
>>> tombstones in a CF?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From:* Anuj Wadehra [mailto:anujw_2003@yahoo.co.in]
>>> *Sent:* 14 November 2016 22:20
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: Some questions to updating and tombstone
>>>
>>>
>>>
>>> Hi Boying,
>>>
>>>
>>>
>>> I agree with Vladimir. If compaction does not merge the two SSTables
>>> holding the updates soon, disk space will be wasted. For example, if the
>>> updates are not close in time, the first update might be in a big SSTable
>>> by the time the second update is written to a new small SSTable. STCS
>>> won't compact them together soon.
>>>
>>>
>>>
>>> Just adding column values with a new timestamp shouldn't create any
>>> tombstones. But if data is not merged for a long time, disk space issues
>>> may arise. If you are on STCS, just to get an idea of the extent of the
>>> problem, you can run a major compaction and see how much disk space it
>>> frees (don't do this in production, as major compaction has its own side
>>> effects).
>>>
>>>
>>>
>>> Which compaction strategy are you using?
>>>
>>> Are these updates done with TTL?
>>>
>>>
>>>
>>> Thanks
>>> Anuj
>>>
>>>
>>>
>>> On Mon, 14 Nov, 2016 at 1:54 PM, Vladimir Yudovin
>>>
>>> <vladyu@winguzone.com> wrote:
>>>
>>> Hi Boying,
>>>
>>>
>>>
>>> UPDATE writes a new value with a new timestamp. The old value is not a
>>> tombstone, but remains until compaction. gc_grace_period is not related
>>> to this.
>>>
>>>
>>>
>>> Best regards, Vladimir Yudovin,
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> ---- On Mon, 14 Nov 2016 03:02:21 -0500 *Lu, Boying <Boying.Lu@dell.com>* wrote ----
>>>
>>>
>>>
>>> Hi, All,
>>>
>>>
>>>
>>> Will Cassandra generate a new tombstone when updating a column with a
>>> CQL UPDATE statement?
>>>
>>>
>>>
>>> And is there any way to get the number of tombstones in a column family?
>>> We want to avoid generating too many tombstones within gc_grace_period.
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Boying
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Close the World, Open the Net
>> http://www.linux-wizard.net
>>
>
>
