From: Erik Forkalsud
Date: Thu, 01 Aug 2013 14:26:57 -0700
To: user@cassandra.apache.org
Subject: Re: How often to run `nodetool repair`

On 08/01/2013 01:16 PM, Andrey Ilinykh wrote:

> > TTL is effectively DELETE; you need to run a repair once every gc_grace_seconds. If you don't, data might un-delete itself.
>
> How is it possible? Every replica has the TTL, so when it expires every replica has a tombstone. I don't see how you can get data with no tombstone. What am I missing?


The only way I can think of is this scenario:

   - value "A" for some key is written with ttl=30days to all replicas (i.e., a long ttl or no ttl at all)
   - value "B" for the same key is written with ttl=1day, but doesn't reach all replicas
   - one day passes and the ttl=1day values turn into deletes
   - gc_grace passes and the tombstones are purged

At this point, the replica that didn't get the ttl=1day value will think the older value "A" is live, and with no tombstone left on the other replicas to override it, a read can return "A" again.
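
The scenario above can be sketched as a toy simulation. This is only a rough model under stated assumptions, not how Cassandra is implemented: the `Cell` and `Replica` classes, the `read` and `run_scenario` functions, the three-replica layout, the 60-second gap between the two writes, and the 10-day gc_grace are all made up for illustration. Each replica keeps only its newest cell (last write wins), an expired cell behaves like a tombstone, and the tombstone is dropped once gc_grace has also passed. No repair, read repair, or hinted handoff is modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

DAY = 86_400  # seconds


@dataclass
class Cell:
    value: str
    write_time: int  # seconds since the start of the simulation
    ttl: int         # seconds

    def expired(self, now: int) -> bool:
        return now >= self.write_time + self.ttl


class Replica:
    """Hypothetical replica holding one key; keeps only its newest cell."""

    def __init__(self, gc_grace: int) -> None:
        self.gc_grace = gc_grace
        self.cell: Optional[Cell] = None

    def write(self, cell: Cell) -> None:
        # Last write wins: a newer timestamp replaces the older cell.
        if self.cell is None or cell.write_time > self.cell.write_time:
            self.cell = cell

    def compact(self, now: int) -> None:
        # An expired cell acts as a tombstone; purge it once gc_grace has passed too.
        c = self.cell
        if c is not None and now >= c.write_time + c.ttl + self.gc_grace:
            self.cell = None


def read(replicas: List[Replica], now: int) -> Optional[str]:
    """Read from all replicas; the newest cell wins. None means deleted or absent."""
    cells = [r.cell for r in replicas if r.cell is not None]
    if not cells:
        return None
    newest = max(cells, key=lambda c: c.write_time)
    return None if newest.expired(now) else newest.value


def run_scenario() -> List[Optional[str]]:
    gc_grace = 10 * DAY  # assumed value for this sketch
    r1, r2, r3 = Replica(gc_grace), Replica(gc_grace), Replica(gc_grace)
    replicas = [r1, r2, r3]

    a = Cell("A", write_time=0, ttl=30 * DAY)
    for r in replicas:          # "A" reaches every replica
        r.write(a)

    b = Cell("B", write_time=60, ttl=1 * DAY)
    for r in (r1, r2):          # "B" never reaches r3
        r.write(b)

    results = []
    for now in (DAY // 2, 5 * DAY, 12 * DAY):
        for r in replicas:
            r.compact(now)
        results.append(read(replicas, now))
    return results


if __name__ == "__main__":
    print(run_scenario())
```

In this model the three reads come back as "B" (half a day in), then nothing (day 5, the expired "B" still shadows "A"), then "A" again (day 12, after the tombstones on r1 and r2 are purged). Running a repair before gc_grace elapsed would have copied "B" to r3 and avoided the last result.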

I'm no expert on this, so I may be mistaken, but in any case it's a corner case, since overwriting columns with shorter ttls would be unusual.


- Erik -
