cassandra-user mailing list archives

From Fabrice Facorat <fabrice.faco...@gmail.com>
Subject Re: Some questions to updating and tombstone
Date Tue, 15 Nov 2016 16:38:46 GMT
If you don't want tombstones, don't generate them ;)

More seriously, tombstones are generated when:
- running a DELETE
- a TTL expiring
- setting a column to NULL

However, tombstones are an issue only if you have many of them for the same
value (i.e. you keep overwriting the same values with data and tombstones).
Having 1 tombstone for 1 value is not an issue; having 1000 tombstones for
1 value is a problem. Does your use case really overwrite data with DELETE
or NULL?
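
To make the "1 vs 1000 tombstones" point concrete, here is a toy Python model
of a single cell that is repeatedly written and deleted. It is only a sketch:
the Cell class and the read_cell/compact helpers are hypothetical names, and
the model ignores sstables, gc_grace_seconds and everything else a real node
does. It only shows that a read has to walk over every uncompacted tombstone
for the value, and that compaction collapses the history.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Cell:
    """One version of a cell. value=None models a tombstone (assumption of this toy model)."""
    timestamp: int
    value: Optional[str]


def read_cell(versions: List[Cell]) -> Tuple[Optional[str], int]:
    """Return (value of the newest version, number of tombstones scanned)."""
    scanned = sum(1 for c in versions if c.value is None)
    newest = max(versions, key=lambda c: c.timestamp)
    return newest.value, scanned


def compact(versions: List[Cell]) -> List[Cell]:
    """Keep only the newest version (last write wins).

    Simplification: a real node also keeps a surviving tombstone
    until gc_grace_seconds has passed.
    """
    return [max(versions, key=lambda c: c.timestamp)]


# Overwrite the same cell 1000 times with data followed by a DELETE,
# then write a final live value.
history: List[Cell] = []
for i in range(1000):
    history.append(Cell(2 * i, f"v{i}"))
    history.append(Cell(2 * i + 1, None))
history.append(Cell(2000, "final"))

print(read_cell(history))           # newest value plus tombstones scanned
print(read_cell(compact(history)))  # same value, nothing left to scan
```

A cell that was deleted once would report a single scanned tombstone in this
model, which is the harmless case.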

So what you may want to know is how many tombstones are scanned on average
when reading a value. This is available in:
- nodetool cfstats ks.cf: "Average tombstones per slice" and "Maximum
tombstones per slice"
- JMX:
org.apache.cassandra.metrics:keyspace=<ks>,name=TombstoneScannedHistogram,scope=<cf>,type=ColumnFamily
(attributes Max, Count, 99thPercentile, Mean)
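
If you want to collect the cfstats numbers from a script, something like the
following Python sketch could work. The function name tombstones_per_slice and
the SAMPLE text are made up for illustration (SAMPLE is not real output), and
the exact wording of the cfstats lines may differ between Cassandra versions,
so check it against what your nodes actually print.

```python
import re
from typing import Dict

# Matches e.g. "Average tombstones per slice (last five minutes): 1.5".
# The "(last five minutes)" suffix is treated as optional free text.
TOMBSTONE_LINE = re.compile(
    r"^\s*(Average|Maximum) tombstones per slice[^:]*:\s*([0-9.eE+-]+|NaN)\s*$"
)


def tombstones_per_slice(cfstats_output: str) -> Dict[str, float]:
    """Extract the average/maximum tombstones per slice from the
    text output of `nodetool cfstats ks.cf` (a single table)."""
    result: Dict[str, float] = {}
    for line in cfstats_output.splitlines():
        match = TOMBSTONE_LINE.match(line)
        if match:
            result[match.group(1).lower()] = float(match.group(2))
    return result


# Illustrative input only, shaped like cfstats output for one table.
SAMPLE = """\
        Table: cf
        SSTable count: 3
        Average tombstones per slice (last five minutes): 1.5
        Maximum tombstones per slice (last five minutes): 12
"""

print(tombstones_per_slice(SAMPLE))
```

The text to parse can be captured with subprocess.run(["nodetool", "cfstats",
"ks.cf"], capture_output=True, text=True).stdout on a node where nodetool is
on the PATH.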


2016-11-15 10:05 GMT+01:00 Lu, Boying <Boying.Lu@dell.com>:

> Thanks a lot for your help.
>
>
>
> We are using the STCS strategy and are not using TTL.
>
>
>
> Is there any API that we can use to query the current number of tombstones
> in a CF?
>
>
>
>
>
>
>
> *From:* Anuj Wadehra [mailto:anujw_2003@yahoo.co.in]
> *Sent:* 14 November 2016 22:20
> *To:* user@cassandra.apache.org
> *Subject:* Re: Some questions to updating and tombstone
>
>
>
> Hi Boying,
>
>
>
> I agree with Vladimir. If compaction does not compact the two sstables
> containing the updates soon, disk space will be wasted. For example, if the
> updates are not close in time, the first update might be in a big sstable by
> the time the second update is written to a new small sstable. STCS won't
> compact them together soon.
>
>
>
> Just adding column values with a new timestamp shouldn't create any
> tombstones. But if data is not merged for a long time, disk space issues may
> arise. If you are using STCS, just to get an idea of the extent of the
> problem, you can run a major compaction and see the amount of disk space
> freed by it (don't do this in production, as major compaction has its own
> side effects).
>
>
>
> Which compaction strategy are you using?
>
> Are these updates done with TTL?
>
>
>
> Thanks
> Anuj
>
>
>
> On Mon, 14 Nov, 2016 at 1:54 PM, Vladimir Yudovin <vladyu@winguzone.com> wrote:
>
> Hi Boying,
>
>
>
> UPDATE writes a new value with a new timestamp. The old value is not a
> tombstone, but remains until compaction. gc_grace_period is not related to this.
>
>
>
> Best regards, Vladimir Yudovin,
>
>
> *Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra
> Launch your cluster in minutes.*
>
>
>
>
>
> ---- On Mon, 14 Nov 2016 03:02:21 -0500 *Lu, Boying <Boying.Lu@dell.com>* wrote ----
>
>
>
> Hi, All,
>
>
>
> Will Cassandra generate a new tombstone when a column is updated with a
> CQL UPDATE statement?
>
>
>
> And is there any way to get the number of tombstones in a column family,
> since we want to avoid generating too many tombstones within gc_grace_period?
>
>
>
> Thanks
>
>
>
> Boying
>
>
>
>


-- 
Close the World, Open the Net
http://www.linux-wizard.net
