cassandra-user mailing list archives

From "Lu, Boying" <Boying...@dell.com>
Subject RE: Some questions to updating and tombstone
Date Thu, 17 Nov 2016 09:33:40 GMT
Many thanks to all of you; I'll study the blog.

From: Alain RODRIGUEZ [mailto:arodrime@gmail.com]
Sent: 16 November 2016 23:26
To: user@cassandra.apache.org
Cc: Fabrice Facorat
Subject: Re: Some questions to updating and tombstone

Hi Boying,

"Old value is not tombstone, but remains until compaction"

Be careful: the above is generally true, but not necessarily.

Tombstones can actually be generated by an UPDATE in some corner cases, for example when using
collections or prepared statements.

I wrote a detailed blog post about deletes and tombstones in Cassandra precisely to avoid
answering this kind of question again and again on the mailing list, as explaining correctly
is a bit hard and I am a lazy guy. I also talked about it at the last Cassandra summit. If
you are going to use Cassandra (and deletes) I think one of these might be of interest to
you:

http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
https://www.youtube.com/watch?v=lReTEcnzl7Y

If you still have questions after reading it, I would be very pleased to help you further,
but I believe this should be helpful.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2016-11-16 10:15 GMT+01:00 Shalom Sagges <shaloms@liveperson.com>:
Hi Fabrice,

Just a small (off-topic) question I couldn't find an answer to: what is a slice in
Cassandra? (e.g. Maximum tombstones per slice)

Thanks!


Shalom Sagges

DBA

T: +972-74-700-4035


On Tue, Nov 15, 2016 at 6:38 PM, Fabrice Facorat <fabrice.facorat@gmail.com> wrote:
If you don't want tombstones, don't generate them ;)
More seriously, tombstones are generated when:
- a DELETE is executed
- a TTL expires
- a column is set to NULL

However, tombstones are an issue only if you have many tombstones for the same value (i.e.
you keep overwriting the same values with data and tombstones). Having 1 tombstone for 1
value is not an issue; having 1000 tombstones for 1 value is a problem. Does your use
case really overwrite data with DELETE or NULL?
That is why what you may want to know is how many tombstones you read on average when reading
a value. This is available in:
- nodetool cfstats ks.cf: Average tombstones per slice / Maximum tombstones per slice
- JMX: org.apache.cassandra.metrics:keyspace=<ks>,name=TombstoneScannedHistogram,scope=<cf>,type=ColumnFamily
  Max/Count/99thPercentile/Mean
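
As a rough illustration of reading the two nodetool cfstats fields mentioned above, here is a minimal Python sketch that extracts them from captured output. The SAMPLE text and the function name tombstones_per_slice are made up for this example, and the exact wording of the output lines is an assumption that may differ between Cassandra versions, so check it against your own output.

```python
import re

# Illustrative excerpt only (assumption): real `nodetool cfstats` output has
# many more fields, and its exact wording varies between Cassandra versions.
SAMPLE = """\
Keyspace: ks
    Table: cf
        SSTable count: 3
        Average tombstones per slice (last five minutes): 1.5
        Maximum tombstones per slice (last five minutes): 12
"""

# Matches "<Average|Maximum> tombstones per slice ...: <number>".
_PATTERN = re.compile(
    r"^\s*(Average|Maximum) tombstones per slice[^:]*:\s*([0-9.eE+-]+|NaN)\s*$",
    re.MULTILINE,
)


def tombstones_per_slice(cfstats_text):
    """Return a dict with 'average' and/or 'maximum' for the fields found."""
    result = {}
    for kind, value in _PATTERN.findall(cfstats_text):
        result[kind.lower()] = float(value)
    return result


if __name__ == "__main__":
    print(tombstones_per_slice(SAMPLE))
```

In practice the text would come from running nodetool cfstats ks.cf on a node (for instance via subprocess.run with capture_output=True and text=True) rather than from a hard-coded sample.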

2016-11-15 10:05 GMT+01:00 Lu, Boying <Boying.Lu@dell.com>:
Thanks a lot for your help.

We are using the STCS compaction strategy and are not using TTLs.

Is there any API that we can use to query the current number of tombstones in a CF?



From: Anuj Wadehra [mailto:anujw_2003@yahoo.co.in]
Sent: 14 November 2016 22:20
To: user@cassandra.apache.org
Subject: Re: Some questions to updating and tombstone

Hi Boying,

I agree with Vladimir. If compaction does not compact the two sstables containing the updates
soon, disk space will be wasted. For example, if the updates are not close in time, the first
update might be in a big sstable by the time the second update is being written to a new small
sstable. STCS won't compact them together soon.
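
To make the point about STCS more concrete, here is a simplified Python sketch of size-tiered bucketing. It is not Cassandra's implementation: the function names are hypothetical, the min_sstable_size option is ignored, and only the documented defaults bucket_low=0.5, bucket_high=1.5 and min_threshold=4 are modelled.

```python
def stcs_buckets(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes so each bucket holds sstables of similar size.

    Simplified model: a size joins the first bucket whose current average
    satisfies avg * bucket_low <= size <= avg * bucket_high.
    """
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No bucket of similar size exists yet, so start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return only the buckets large enough to trigger a compaction."""
    return [b for b in stcs_buckets(sizes_mb) if len(b) >= min_threshold]


if __name__ == "__main__":
    # One big sstable holding the first update, one small one with the second:
    print(compaction_candidates([10000, 100]))
    # Four similar small sstables plus the big one:
    print(compaction_candidates([10000, 100, 110, 120, 90]))
```

In this model the 10000 MB sstable never shares a bucket with the small ones, so the old and new versions of a value stay in separate sstables until the small tier has grown to a comparable size.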

Just adding column values with a new timestamp shouldn't create any tombstones. But if data is
not merged for a long time, disk space issues may arise. If you are on STCS, just to get an idea
of the extent of the problem, you can run a major compaction and see the amount of disk space
freed by it (don't do this in production, as major compaction has its own side effects).

Which compaction strategy are you using?
Are these updates done with TTL?

Thanks
Anuj

On Mon, 14 Nov, 2016 at 1:54 PM, Vladimir Yudovin <vladyu@winguzone.com> wrote:
Hi Boying,

UPDATE writes a new value with a new timestamp. The old value is not a tombstone, but remains
until compaction. gc_grace_period is not related to this.
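
The behaviour described above can be sketched with a toy Python model. This is not Cassandra's storage engine: the dict-based "sstables" and the names read, compact and count_versions are invented for illustration, and gc_grace handling is deliberately left out.

```python
# Toy model, NOT Cassandra's storage engine: each "sstable" is a dict mapping
# (row_key, column) -> (write_timestamp, value).


def read(sstables, key):
    """Return the value with the newest timestamp, or None if key is absent."""
    versions = [table[key] for table in sstables if key in table]
    if not versions:
        return None
    _, value = max(versions, key=lambda version: version[0])
    return value


def compact(sstables):
    """Merge all sstables into one, keeping only the newest version per key."""
    merged = {}
    for table in sstables:
        for key, version in table.items():
            if key not in merged or version[0] > merged[key][0]:
                merged[key] = version
    return [merged]


def count_versions(sstables, key):
    """Count how many sstables still hold some version of key."""
    return sum(1 for table in sstables if key in table)


if __name__ == "__main__":
    key = ("user1", "email")
    old = {key: (1, "a@example.com")}  # first write, flushed to one sstable
    new = {key: (2, "b@example.com")}  # UPDATE, flushed to a second sstable
    tables = [old, new]

    print(read(tables, key))            # b@example.com
    print(count_versions(tables, key))  # 2: old value still on disk
    tables = compact(tables)
    print(count_versions(tables, key))  # 1: old value dropped by compaction
```

No tombstone appears anywhere in this model: the overwritten value is simply shadowed by the newer timestamp on read and discarded when the two sstables are merged.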

Best regards, Vladimir Yudovin,
Winguzone (https://winguzone.com?from=list) - Hosted Cloud Cassandra
Launch your cluster in minutes.


---- On Mon, 14 Nov 2016 03:02:21 -0500 Lu, Boying <Boying.Lu@dell.com> wrote ----

Hi, All,

Will Cassandra generate a new tombstone when updating a column with a CQL UPDATE statement?

And is there any way to get the number of tombstones in a column family? We want to avoid
generating too many tombstones within gc_grace_period.

Thanks

Boying



--
Close the World, Open the Net
http://www.linux-wizard.net

