cassandra-user mailing list archives

From Germán Kondolf <german.kond...@gmail.com>
Subject Re: Tombstone lifespan after multiple deletions
Date Wed, 19 Jan 2011 01:53:20 GMT
Maybe this could be taken into account when compaction runs: if a row
has only a run of consecutive, uninterrupted tombstones, compaction could
consider just the first one. That sounds like the way it should be, maybe
as part of the "row-reduce" process.

Is it feasible? Judging by CASSANDRA-1074, it sounds like it should be.

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/

On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne <sylvain@riptano.com> wrote:
> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <david@lookin2.com> wrote:
>> Thanks, Aaron, but I'm not 100% clear.
>>
>> My situation is this: My use case spins off rows (not columns) that I no
>> longer need and want to delete. It is possible that these rows were never
>> created in the first place, or were already deleted. This is a very large
>> cleanup task that normally deletes a lot of rows, and the last thing that I
>> want to do is create tombstones for rows that didn't exist in the first
>> place, or lengthen the life on disk of tombstones of rows that are already
>> deleted.
>>
>> So the question is: before I delete, do I have to retrieve the row to see if
>> it exists in the first place?
>
> Yes, in your situation you do.
>
>>
>>
>>
>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aaron@thelastpickle.com>
>> wrote:
>>>
>>> AFAIK that's not necessary; there is no need to worry about previous
>>> deletes. You can delete stuff that does not even exist; neither batch_mutate
>>> nor remove is going to throw an error.
>>> All the columns that were (roughly speaking) present at your first
>>> deletion will be available for GC at the end of the first tombstone's life.
>>> Same for the second.
>>> Say you were to write a col between the two deletes with the same name as
>>> one present at the start. The first version of the col is available for GC
>>> after tombstone 1, and the second after tombstone 2.
>>> Hope that helps
>>> Aaron
>>> On 18/01/2011, at 9:37 PM, David Boxenhorn <david@lookin2.com> wrote:
>>>
>>> Thanks. In other words, before I delete something, I should check to see
>>> whether it exists as a live row in the first place.
>>>
>>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ryan@twitter.com> wrote:
>>>>
>>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <david@lookin2.com>
>>>> wrote:
>>>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>>>> > has elapsed, does the tombstone live longer?
>>>>
>>>> Each delete is a new tombstone, which should answer your question.
>>>>
>>>> -ryan
>>>>
>>>> > In other words, if I have the following scenario:
>>>> >
>>>> > GCGraceSeconds = 10 days
>>>> > On day 1 I delete a row
>>>> > On day 5 I delete the row again
>>>> >
>>>> > Will the tombstone be removed on day 10 or day 15?
>>>> >
>>>
>>
>>
>
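
To make the day-1/day-5 scenario from the original question concrete, here is a minimal sketch of the arithmetic implied by "each delete is a new tombstone". It is a simplified model, not Cassandra code: the function names are hypothetical, time is counted in whole days, and it assumes a tombstone becomes eligible for GC exactly gc_grace days after the delete that created it. The actual removal only happens at a later compaction.

```python
# Simplified model of tombstone lifespans (hypothetical helper names,
# not part of any Cassandra API). Time is measured in whole days.

def gc_eligible_days(delete_days, gc_grace_days):
    """For each delete, return the day its tombstone becomes eligible for GC.

    Assumption: every delete writes its own tombstone, and each tombstone
    becomes eligible gc_grace_days after the delete that created it.
    """
    return [day + gc_grace_days for day in delete_days]


def last_tombstone_eligible_day(delete_days, gc_grace_days):
    """Day on which the most recent tombstone for the row becomes eligible."""
    return max(gc_eligible_days(delete_days, gc_grace_days))


if __name__ == "__main__":
    deletes = [1, 5]   # row deleted on day 1 and again on day 5
    gc_grace = 10      # GCGraceSeconds expressed in days

    print(gc_eligible_days(deletes, gc_grace))
    print(last_tombstone_eligible_day(deletes, gc_grace))
```

Under these assumptions the second delete, not the first, decides how long a tombstone for the row stays on disk, which is why re-deleting an already deleted row lengthens that time.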
