cassandra-user mailing list archives

From <SEAN_R_DUR...@homedepot.com>
Subject RE: Insert Vs Updates - Both create tombstones
Date Thu, 14 May 2015 13:17:34 GMT
I think you have over-simplified it just a bit here, though I may be wrong.

In order to get a tombstone on a TTL row or column, some kind of read has to occur. The tombstones
don’t magically appear (remember, a tombstone is a special kind of insert). So, I think
it takes at least two compactions to actually get rid of the data. One compaction to create
tombstones, and a second one (after gc_grace_seconds) to rewrite the good data and leave out
expired tombstones. And it has to happen on all replica nodes.
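The timing described above can be sketched roughly as follows. This is only a model of the sequence in this message (live, then expired and kept as a tombstone, then eligible for removal once gc_grace_seconds has passed); it is not taken from the Cassandra source, and the names `Cell` and `cell_state` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    write_time: int  # second at which the cell was written
    ttl: int         # seconds the cell stays live


def cell_state(cell, now, gc_grace_seconds):
    """Classify a TTL'd cell at time `now` (all values in seconds)."""
    expires_at = cell.write_time + cell.ttl
    if now < expires_at:
        return "live"
    if now < expires_at + gc_grace_seconds:
        # Expired, but still kept as a tombstone until gc_grace_seconds passes.
        return "tombstone"
    # Old enough that a later compaction may leave it out entirely.
    return "purgeable"


cell = Cell(write_time=0, ttl=10)
states = [cell_state(cell, t, gc_grace_seconds=20) for t in (5, 10, 29, 30)]
print(states)  # ['live', 'tombstone', 'tombstone', 'purgeable']
```

The sketch says nothing about which operation (read or compaction) actually writes the tombstone; that is the open question in this message.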

(I suppose it is possible that other kinds of reads might create TTL-based tombstones as an
optimization, but I don’t know that this exists.)

Someone closer to the source code may correct me, but this is my understanding of how it works.

Sean Durity – Cassandra Admin, Big Data Team
To engage the team, create a request<https://portal.homedepot.com/sites/bigdata/SitePages/Big%20Data%20Engagement%20Request.aspx>

From: Walsh, Stephen [mailto:Stephen.Walsh@Aspect.com]
Sent: Thursday, May 14, 2015 6:37 AM
To: user@cassandra.apache.org
Subject: RE: Insert Vs Updates - Both create tombstones

Thanks ☺

I think you might have got your T’s and V’s mixed up?

So we insert V2 @ T2, then insert V1 @ T1, where T1 is earlier than T2, and the result is V2.

Should it not be the other way around?

So we insert V1 @ T1, then insert V2 @ T2, where T1 is earlier than T2, and the result is V2.


So, in terms of tombstones, with one insert per second for 5 seconds, it looks like this:

Second 1
Insert <V1, T1> with TTL = 5

Second 2
<V1, T1> (TTL 4)
Insert <V1, T2> with TTL = 5

Second 3
<V1, T1> (TTL 3)
<V1, T2> (TTL 4)
Insert <V1, T3> with TTL = 5

Second 4
<V1, T1> (TTL 2)
<V1, T2> (TTL 3)
<V1, T3> (TTL 4)
Insert <V1, T4> with TTL = 5

Second 5
<V1, T1> (TTL 1)
<V1, T2> (TTL 2)
<V1, T3> (TTL 3)
<V1, T4> (TTL 4)
Insert <V1, T5> with TTL = 5

Second 6
<V1, T1> (Tombstoned)
<V1, T2> (TTL 1)
<V1, T3> (TTL 2)
<V1, T4> (TTL 3)
<V1, T5> (TTL 4)

Second 7
<V1, T1> (Tombstoned)
<V1, T2> (Tombstoned)
<V1, T3> (TTL 1)
<V1, T4> (TTL 2)
<V1, T5> (TTL 3)

Second 8
<V1, T1> (Tombstoned)
<V1, T2> (Tombstoned)
<V1, T3> (Tombstoned)
<V1, T4> (TTL 1)
<V1, T5> (TTL 2)

Second 9
<V1, T1> (Tombstoned)
<V1, T2> (Tombstoned)
<V1, T3> (Tombstoned)
<V1, T4> (Tombstoned)
<V1, T5> (TTL 1)

Second 10
<V1, T1> (Tombstoned)
<V1, T2> (Tombstoned)
<V1, T3> (Tombstoned)
<V1, T4> (Tombstoned)
<V1, T5> (Tombstoned)

Second 11
(Minor compaction run to clean up tombstones)


And if I did an “update”, the result would be the same.
And like you mentioned, if I did a query at “second 5”, there would be 5
versions of V1 to query against, and the one with the highest T value would be returned.
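The remaining-TTL arithmetic in the timeline above can be checked with a small sketch. It counts the first insert as second 1 and adds one insert per second with a TTL of 5. The helper `versions_at` is hypothetical, and the sketch only tracks when each version expires; whether an expired version counts as a tombstone is the question being discussed in this thread.

```python
TTL = 5  # seconds, as in the example above


def versions_at(second, insert_seconds, ttl=TTL):
    """Return (insert_second, remaining_ttl) for each version written so far.

    A remaining_ttl of 0 or less means the version has expired.
    """
    return [(s, ttl - (second - s)) for s in insert_seconds if s <= second]


inserts = [1, 2, 3, 4, 5]  # one insert per second: T1..T5

for second in range(1, 11):
    versions = versions_at(second, inserts)
    live = [s for s, left in versions if left > 0]
    expired = [s for s, left in versions if left <= 0]
    print(second, "live:", live, "expired:", expired)
```

With these assumptions, all five versions are live right after the fifth insert, the first one expires one second later, and the last one expires five seconds after it was written.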




From: Peer, Oded [mailto:Oded.Peer@rsa.com]
Sent: 14 May 2015 11:12
To: user@cassandra.apache.org
Subject: RE: Insert Vs Updates - Both create tombstones

If this is how you update, then you are not creating tombstones.

If you used UPDATE it’s the same behavior. You are simply inserting a new value for the
cell which does not create a tombstone.
When you modify data by using either the INSERT or the UPDATE command, the value is stored
along with a timestamp for that value.
Assume timestamp T1 is before T2 (T1 < T2) and you stored value V2 with timestamp T2. Then
you store V1 with timestamp T1.
Now you have two values of V in the DB: <V2,T2>, <V1,T1>
When you read the value of V from the DB, you read both <V2,T2> and <V1,T1>, which
may be in different sstables. Cassandra resolves the conflict by comparing the timestamps and
returns V2.
Compaction will later take care of removing <V1,T1> from the DB.
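As a minimal sketch of the timestamp comparison described above: the value with the highest timestamp wins, regardless of the order in which the writes arrived. The function `reconcile`, the two lists standing in for sstables, and the timestamp values are all hypothetical, and the case of equal timestamps is not handled.

```python
def reconcile(versions):
    """Pick the winning (value, timestamp) pair: the highest timestamp wins."""
    return max(versions, key=lambda v: v[1])


T1, T2 = 100, 200          # arbitrary timestamps with T1 < T2
sstable_a = [("V2", T2)]   # stored first, but with the later timestamp
sstable_b = [("V1", T1)]   # stored second, with the earlier timestamp

winner = reconcile(sstable_a + sstable_b)
print(winner[0])  # V2
```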


From: Walsh, Stephen [mailto:Stephen.Walsh@Aspect.com]
Sent: Thursday, May 14, 2015 11:39 AM
To: user@cassandra.apache.org
Subject: RE: Insert Vs Updates - Both create tombstones

Thank you,

We are updating the entire row (all columns) each second via the “insert” command.
So if we did updates, no tombstones would be created?
But because we are doing inserts, we are creating tombstones for each column on each insert?


From: Ali Akhtar [mailto:ali.rac200@gmail.com]
Sent: 13 May 2015 12:10
To: user@cassandra.apache.org
Subject: Re: Insert Vs Updates - Both create tombstones

Sorry, wrong thread. Disregard the above

On Wed, May 13, 2015 at 4:08 PM, Ali Akhtar <ali.rac200@gmail.com> wrote:
If specifying 'using' timestamp, the docs say to provide microseconds, but where are these
microseconds obtained from? I have regular java.util.Date objects, and I can get the time in milliseconds
(i.e., the Unix timestamp). How would I convert that to microseconds?
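For illustration, the conversion asked about here is a multiplication by 1000. With a java.util.Date, `getTime()` returns milliseconds since the epoch, so `date.getTime() * 1000` gives microseconds (without adding any sub-millisecond precision). The same idea as a Python sketch, where the helper names and the example date are hypothetical:

```python
from datetime import datetime, timezone


def millis_to_micros(epoch_millis):
    """Convert a Unix timestamp in milliseconds to microseconds."""
    return epoch_millis * 1000


def datetime_to_micros(dt):
    """Convert a timezone-aware datetime to integer microseconds since the epoch."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = dt - epoch
    # Integer arithmetic avoids the rounding a float timestamp could introduce.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


example = datetime(2015, 5, 13, 11, 8, 0, tzinfo=timezone.utc)  # arbitrary example date
micros = datetime_to_micros(example)
print(micros)
```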

On Wed, May 13, 2015 at 3:45 PM, Peer, Oded <Oded.Peer@rsa.com> wrote:
Under the assumption that when you update the columns you also update the TTL for the columns
then a tombstone won’t be created for those columns.
Remember that TTL is set on columns (or “cells”), not on rows, so your description of
updating a row is slightly misleading. If every query updates different columns then different
columns might expire at different times.
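A small sketch of why per-column TTLs can leave one row with columns expiring at different times. The function `expiry_by_column`, the column names, and the write times are hypothetical, and the writes are assumed to be listed in timestamp order.

```python
def expiry_by_column(writes):
    """Map each column to its expiry time, given (time, ttl, columns) writes.

    A later write to a column replaces that column's expiry; columns the
    write does not touch keep their old one.
    """
    expiry = {}
    for write_time, ttl, columns in writes:
        for column in columns:
            expiry[column] = write_time + ttl
    return expiry


writes = [
    (0, 10, ["a", "b", "c"]),  # initial write sets all three columns
    (5, 10, ["a"]),            # later update touches only column "a"
]
print(expiry_by_column(writes))  # {'a': 15, 'b': 10, 'c': 10}
```

Here columns "b" and "c" expire at second 10 while "a" lives until second 15, even though all three belong to the same row.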

From: Walsh, Stephen [mailto:Stephen.Walsh@Aspect.com]
Sent: Wednesday, May 13, 2015 1:35 PM
To: user@cassandra.apache.org
Subject: Insert Vs Updates - Both create tombstones

Quick Question,

Our team is debating whether an update on a row with a TTL
will create a tombstone.

E.g.

We have one row with a TTL. If we keep “updating” that row before the TTL is hit, will
a tombstone be created?
I believe it will, but want to confirm.

So if that’s true,
and if our TTL is 10 seconds and we “update” the row every second, will 10 tombstones
be created after 10 seconds? Or just 1?
(And does the same apply for “insert”?)

Regards
Stephen Walsh


This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain
information that is confidential. If you have received this message in error, please do not
read, copy or forward this message. Please notify the sender immediately, delete it from your
system and destroy any copies. You may not further disclose or distribute this email or its
attachments.



________________________________

The information in this Internet Email is confidential and may be legally privileged. It is
intended solely for the addressee. Access to this Email by anyone else is unauthorized. If
you are not the intended recipient, any disclosure, copying, distribution or any action taken
or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed
to our clients any opinions or advice contained in this Email are subject to the terms and
conditions expressed in any applicable governing The Home Depot terms of business or client
engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy
and content of this attachment and for any damages or losses arising from any inaccuracies,
errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature,
which may be contained in this attachment and shall not be liable for direct, indirect, consequential
or special damages in connection with this e-mail message or its attachment.