cassandra-user mailing list archives

From Zhu Han <schumi....@gmail.com>
Subject Re: Tombstone lifespan after multiple deletions
Date Wed, 19 Jan 2011 03:59:10 GMT
On Wed, Jan 19, 2011 at 11:35 AM, Germán Kondolf
<german.kondolf@gmail.com> wrote:

> Yes, that's what I meant, but correct me if I'm wrong: when a deletion
> comes after another deletion for the same row or column, the gc-before
> will count against the last one, won't it?
>

IIRC, after compaction, even if the row key is not wiped, all the CF are
replaced by the youngest tombstone.  I do not clearly understand the
benefit of wiping out the whole row as early as possible.
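
To make the timing concrete, here is a minimal sketch of how I read the
behaviour discussed in this thread. It is a toy model, not Cassandra code:
the function names (merge_tombstones, purgeable_at) and the reference date
are assumptions made up for illustration, and it assumes that the merged
tombstone keeps the youngest deletion time and becomes purgeable once
GCGraceSeconds has passed since that time.

```python
from datetime import datetime, timedelta

# GCGraceSeconds from David's scenario (10 days).
GC_GRACE = timedelta(days=10)


def merge_tombstones(deletion_times):
    """Toy model of compaction merging tombstones: the youngest one wins."""
    return max(deletion_times)


def purgeable_at(deletion_times, gc_grace=GC_GRACE):
    """Earliest time at which a compaction could drop the merged tombstone."""
    return merge_tombstones(deletion_times) + gc_grace


# Arbitrary reference date standing in for "day 0" (hypothetical).
day0 = datetime(2011, 1, 1)
first_delete = day0 + timedelta(days=1)   # "On day 1 I delete a row"
second_delete = day0 + timedelta(days=5)  # "On day 5 I delete the row again"

single = purgeable_at([first_delete])
repeated = purgeable_at([first_delete, second_delete])

print("one delete: purgeable from day", (single - day0).days)
print("two deletes: purgeable from day", (repeated - day0).days)
```

Under these assumptions the second delete moves the purge point from day 11
to day 15, which is why checking for a live row before deleting matters in
David's cleanup job.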


>
> Maybe, knowing that all the subsequent versions of a deletion are deletions
> too, it could take the first timestamp against the gc-grace-seconds when it
> is reducing and compacting.
>
> // Germán Kondolf
> http://twitter.com/germanklf
> http://code.google.com/p/seide/
> // @i4
>
> On 19/01/2011, at 00:16, Jonathan Ellis <jbellis@gmail.com> wrote:
>
> > If you mean that multiple tombstones for the same row or column should
> > be merged into a single one at compaction time, then yes, that is what
> > happens.
> >
> > On Tue, Jan 18, 2011 at 7:53 PM, Germán Kondolf
> > <german.kondolf@gmail.com> wrote:
> >> Maybe it could be taken into account when the compaction is executed,
> >> if I only have a consecutive list of uninterrupted tombstones it could
> >> only care about the first. It sounds like the-way-it-should-be, maybe
> >> as a part of the "row-reduce" process.
> >>
> >> Is it feasible? Looking at CASSANDRA-1074, it sounds like it should be.
> >>
> >> //GK
> >> http://twitter.com/germanklf
> >> http://code.google.com/p/seide/
> >>
> >> On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne <sylvain@riptano.com> wrote:
> >>> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn <david@lookin2.com> wrote:
> >>>> Thanks, Aaron, but I'm not 100% clear.
> >>>>
> >>>> My situation is this: My use case spins off rows (not columns) that I no
> >>>> longer need and want to delete. It is possible that these rows were never
> >>>> created in the first place, or were already deleted. This is a very large
> >>>> cleanup task that normally deletes a lot of rows, and the last thing that I
> >>>> want to do is create tombstones for rows that didn't exist in the first
> >>>> place, or lengthen the life on disk of tombstones of rows that are already
> >>>> deleted.
> >>>>
> >>>> So the question is: before I delete, do I have to retrieve the row to see if
> >>>> it exists in the first place?
> >>>
> >>> Yes, in your situation you do.
> >>>
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
> >>>>>
> >>>>> AFAIK that's not necessary, there is no need to worry about previous
> >>>>> deletes. You can delete stuff that does not even exist; neither batch_mutate
> >>>>> nor remove is going to throw an error.
> >>>>> All the columns that were (roughly speaking) present at your first
> >>>>> deletion will be available for GC at the end of the first tombstone's life.
> >>>>> Same for the second.
> >>>>> Say you were to write a col between the two deletes with the same name as
> >>>>> one present at the start. The first version of the col is available for GC
> >>>>> after tombstone 1, and the second after tombstone 2.
> >>>>> Hope that helps
> >>>>> Aaron
> >>>>> On 18/01/2011, at 9:37 PM, David Boxenhorn <david@lookin2.com> wrote:
> >>>>>
> >>>>> Thanks. In other words, before I delete something, I should check to see
> >>>>> whether it exists as a live row in the first place.
> >>>>>
> >>>>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King <ryan@twitter.com> wrote:
> >>>>>>
> >>>>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn <david@lookin2.com>
> >>>>>> wrote:
> >>>>>>> If I delete a row, and later on delete it again, before GCGraceSeconds
> >>>>>>> has elapsed, does the tombstone live longer?
> >>>>>>
> >>>>>> Each delete is a new tombstone, which should answer your question.
> >>>>>>
> >>>>>> -ryan
> >>>>>>
> >>>>>>> In other words, if I have the following scenario:
> >>>>>>>
> >>>>>>> GCGraceSeconds = 10 days
> >>>>>>> On day 1 I delete a row
> >>>>>>> On day 5 I delete the row again
> >>>>>>>
> >>>>>>> Will the tombstone be removed on day 10 or day 15?
> >>>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder of Riptano, the source for professional Cassandra support
> > http://riptano.com
>
>
