From: Sylvain Lebresne
Date: Thu, 28 Jul 2011 17:23:41 +0200
Subject: Re: NotFoundException thrown for get(), but not get_slice() with a column_names predicate
To: user@cassandra.apache.org

To be honest, collecting the names that were missing in the first name
query and doing a new name query for those (if there are any) is so simple
that I think it is a bit dishonest to say that "it pushes work to the
clients". It seems simple enough, at least, that complicating the API does
not sound like a good idea to me.

I'll also argue that throwing an exception is always inferior to the current
behavior, because if you re-query the missing names and get nothing back
again, you at least know which names are not missing. If you throw an
exception and you simply get the exception twice, you only know that
"some" column(s) is (are) missing.

--
Sylvain

On Thu, Jul 28, 2011 at 4:53 PM, David Allsopp wrote:
> I understand and agree for the case where the slice predicate is a range,
> but I'd expect the semantics to be different where the predicate is a list
> of column names (even if it's implemented using a range operation under the
> hood?)
>
> If I ask for columns "foo" and "bar", then usually I'm not trying to find
> out what's in a particular range - I actually want columns "foo" AND "bar",
> i.e. the semantics are basically those of a set of individual column get()
> calls.
>
> I could do these as individual get() calls, but want to minimise
> round-trips.
>
> I can of course check what columns were returned and try again or give up,
> but this pushes work to the clients; in the worst case this could transfer
> large amounts of unusable data back to the client, which then has to discard
> it all (and perhaps retry and discard all over again) due to the absence of
> one small column. It would save a lot of bandwidth to abandon the operation
> immediately at the server if a 'missing' column is detected there.
>
> Of course, in some use cases one might want to get whichever of the column
> names happen to exist ("foo" AND/OR "bar"), hence my suggestion that it
> should be possible to choose between these two semantics when using a
> column_names predicate (clearly, this doesn't make sense for a slice_range
> predicate).
>
> On 28 July 2011 13:45, Jonathan Ellis wrote:
>>
>> No, the slice semantics are "give me whatever happens to exist between
>> start and end." It's valid for the answer to be "nothing."
>>
>> On Thu, Jul 28, 2011 at 6:55 AM, David Allsopp wrote:
>> > If I try to retrieve a column that is not present, using get(), then
>> > I'll get a NotFoundException.
>> >
>> > If (for efficiency's sake) I try to retrieve several named columns using
>> > get_slice, with a column_names predicate (i.e. a list of columns) then I
>> > won't get the exception if one of those columns is missing, I think?
>> >
>> > This seems inconsistent - would it make sense for get_slice to throw the
>> > exception too, or perhaps have an option to require all columns to be
>> > present?
>> >
>> >
>> > The reason this came up is that I write and read with CL.ONE, and retry
>> > at the client side in case of (very occasional) failures, with the aim of
>> > improving availability and performance by avoiding CL.QUORUM etc.
>> > This is easy in the get() case - I can just retry a few times if I get a
>> > NotFoundException. I normally only need to retry once, in less than 0.1%
>> > of cases.
>> >
>> > For the get_slice case I'd need to retrieve all the columns again (might
>> > be wasteful) or check which ones were returned and form a new request
>> > (seems overly complex) or give up using get_slice and just use individual
>> > get() calls (seems inefficient).
>> >
>> > See also https://issues.apache.org/jira/browse/CASSANDRA-518
>> >
>> > Thanks,
>> >
>> > David.
>> >
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
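
For readers of the archive, the client-side approach discussed in this thread (collect the names missing from the first name query, then re-query only those) might look roughly like the sketch below. It is not taken from any Cassandra client library: `fetch` is a hypothetical callable standing in for a get_slice call with a column_names predicate, assumed to return a dict containing only the columns that exist, and `get_named_columns` and `max_attempts` are illustrative names.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical signature: fetch(key, names) returns {name: value} for the
# requested columns that exist, and silently omits the ones that do not.
Fetch = Callable[[str, Set[str]], Dict[str, bytes]]


def get_named_columns(fetch: Fetch, key: str, names: Iterable[str],
                      max_attempts: int = 3) -> Tuple[Dict[str, bytes], Set[str]]:
    """Fetch the named columns, re-querying only the ones still missing.

    Returns (found, missing). An empty `missing` set means every requested
    column was eventually returned.
    """
    found: Dict[str, bytes] = {}
    missing: Set[str] = set(names)
    for _ in range(max_attempts):
        if not missing:
            break
        # Ask only for the names that have not been seen yet.
        found.update(fetch(key, set(missing)))
        missing.difference_update(found)
    return found, missing
```

Because the caller gets back the set of names that are still missing, it can decide per column whether to retry again, fall back to a stronger consistency level, or give up, which is the extra information an exception alone would not carry.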