Subject: Re: pagination through slices with deleted keys
From: Mike Malone
To: user@cassandra.apache.org
Date: Thu, 6 May 2010 14:56:41 -0700

Our solution at SimpleGeo has been to hack Cassandra to (optionally, at least) be sensible and drop Rows that don't have any Columns.
The claim from the FAQ that "Cassandra would have to check if there are any other columns in the row" is inaccurate. The common case for us at least is that we're only interested in Rows that have Columns matching our predicate. So if there aren't any, we just don't return that row. No need to check if the entire row is deleted.

Mike

On Thu, May 6, 2010 at 9:17 AM, Ian Kallen <spidaman.list@gmail.com> wrote:
> I read the DistributedDeletes and the range_ghosts FAQ entry on the wiki
> which do a good job describing how difficult deletion is in an eventually
> consistent system. But practical application strategies for dealing with it
> aren't there (that I saw). I'm wondering how folks implement pagination in
> their applications; if you want to render N results in an application, is
> the only solution to over-fetch and filter out the tombstones? Or is there
> something simpler that I overlooked? I'd like to be able to count (even if
> the counts are approximate) and fetch rows with the deleted ones filtered
> out (without waiting for the GCGraceSeconds interval + compaction) but from
> what I see so far, the burden is on the app to deal with the tombstones.
> -Ian
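
For readers who cannot patch the server, the over-fetch-and-filter approach Ian asks about might look roughly like the sketch below. It is only an illustration under stated assumptions: `fetch_range(start_key, count)` is a hypothetical stand-in for whatever range query your client exposes (it is not a real Cassandra client call), and it is assumed to return rows in key order with an inclusive start key, where a deleted row comes back with an empty column dict. The `if columns` test is the client-side equivalent of the rule described above: a row with no columns matching the predicate is simply not returned.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A row is (key, columns); an empty columns dict models a "range ghost".
Row = Tuple[str, Dict[str, bytes]]
# Hypothetical range query: rows with key >= start_key, in key order, at most count.
FetchRange = Callable[[str, int], List[Row]]


def fetch_page(
    fetch_range: FetchRange,
    start_key: str,
    page_size: int,
    overfetch: int = 2,
) -> Tuple[List[Row], Optional[str]]:
    """Return up to page_size live rows plus the key where the next page starts.

    The second element is None once the range is exhausted.
    """
    live: List[Row] = []
    cursor = start_key
    inclusive = True  # the caller's start_key itself belongs to this page
    batch_size = page_size * overfetch + 1

    while True:
        batch = fetch_range(cursor, batch_size)
        for key, columns in batch:
            # On follow-up fetches the cursor row was already examined.
            if not inclusive and key == cursor:
                continue
            if not columns:
                continue  # ghost row: no matching columns, so skip it
            if len(live) == page_size:
                # First live row beyond this page: the next page starts here.
                return live, key
            live.append((key, columns))
        if len(batch) < batch_size:
            return live, None  # fewer rows than requested: range exhausted
        cursor = batch[-1][0]
        inclusive = False
```

The `overfetch` factor is a tuning knob, not a recommendation: a larger value means fewer round trips when many rows are ghosts, at the cost of reading more rows than the page needs. Counts derived this way stay approximate, since rows can be deleted or resurrected between fetches.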