Date: Tue, 18 Jan 2011 22:53:20 -0300
Subject: Re: Tombstone lifespan after multiple deletions
From: Germán Kondolf
To: user@cassandra.apache.org
References: <85DBB985-911B-4A1E-A912-7096B44F8366@thelastpickle.com>

Maybe this could be taken into account when compaction runs: if all I
have is a consecutive, uninterrupted run of tombstones, it could care
only about the first one.
It sounds like the way it should be, maybe as part of the "row-reduce"
process.

Is it feasible? Looking at CASSANDRA-1074, it sounds like it should be.

//GK
http://twitter.com/germanklf
http://code.google.com/p/seide/

On Tue, Jan 18, 2011 at 10:55 AM, Sylvain Lebresne wrote:
> On Tue, Jan 18, 2011 at 2:41 PM, David Boxenhorn wrote:
>> Thanks, Aaron, but I'm not 100% clear.
>>
>> My situation is this: My use case spins off rows (not columns) that I no
>> longer need and want to delete. It is possible that these rows were never
>> created in the first place, or were already deleted. This is a very large
>> cleanup task that normally deletes a lot of rows, and the last thing that I
>> want to do is create tombstones for rows that didn't exist in the first
>> place, or lengthen the life on disk of tombstones of rows that are already
>> deleted.
>>
>> So the question is: before I delete, do I have to retrieve the row to see if
>> it exists in the first place?
>
> Yes, in your situation you do.
>
>>
>> On Tue, Jan 18, 2011 at 11:38 AM, Aaron Morton wrote:
>>>
>>> AFAIK that's not necessary, there is no need to worry about previous
>>> deletes. You can delete stuff that does not even exist, neither batch_mutate
>>> or remove are going to throw an error.
>>> All the columns that were (roughly speaking) present at your first
>>> deletion will be available for GC at the end of the first tombstones life.
>>> Same for the second.
>>> Say you were to write a col between the two deletes with the same name as
>>> one present at the start. The first version of the col is avail for GC after
>>> tombstone 1, and the second after tombstone 2.
>>> Hope that helps
>>> Aaron
>>> On 18/01/2011, at 9:37 PM, David Boxenhorn wrote:
>>>
>>> Thanks. In other words, before I delete something, I should check to see
>>> whether it exists as a live row in the first place.
>>>
>>> On Tue, Jan 18, 2011 at 9:24 AM, Ryan King wrote:
>>>>
>>>> On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn wrote:
>>>> > If I delete a row, and later on delete it again, before GCGraceSeconds
>>>> > has elapsed, does the tombstone live longer?
>>>>
>>>> Each delete is a new tombstone, which should answer your question.
>>>>
>>>> -ryan
>>>>
>>>> > In other words, if I have the following scenario:
>>>> >
>>>> > GCGraceSeconds = 10 days
>>>> > On day 1 I delete a row
>>>> > On day 5 I delete the row again
>>>> >
>>>> > Will the tombstone be removed on day 10 or day 15?
>>>> >
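
For reference, the day 1 / day 5 scenario quoted above can be worked through as a small sketch. This is only a simplified model of the behaviour described in the thread (each delete writes a new tombstone, so the newest one determines when the row's tombstones can go away); it is not Cassandra code. The function name earliest_purge_day and the use of whole days are assumptions made for illustration, and actual removal only happens when a compaction runs after the grace period has passed.

```python
def earliest_purge_day(delete_days, gc_grace_days):
    """Return the day on which the newest tombstone's grace period ends.

    Simplified model (an assumption, not Cassandra's implementation):
    every delete writes a fresh tombstone, and a tombstone becomes
    eligible for removal gc_grace_days after it was written.
    """
    if not delete_days:
        raise ValueError("at least one deletion is required")
    # The most recent delete carries the newest tombstone, so it decides
    # when the last tombstone for the row becomes eligible for removal.
    return max(delete_days) + gc_grace_days


# Scenario from the thread: GCGraceSeconds = 10 days, deletes on day 1 and day 5.
single_delete = earliest_purge_day([1], 10)
repeated_delete = earliest_purge_day([1, 5], 10)
print(single_delete, repeated_delete)
```

In this model the second delete moves eligibility from day 11 to day 15, which is why checking whether the row is still live before deleting it avoids lengthening the tombstone's life on disk.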