cassandra-user mailing list archives

From Sylvain Lebresne <>
Subject Re: NotFoundException thrown for get(), but not get_slice() with a column_names predicate
Date Thu, 28 Jul 2011 15:23:41 GMT
To be honest, collecting the names that were missing in the first name
query and doing a new name query for those (if there are any) is so simple
that I think it is a bit dishonest to say that "it pushes work to the clients".

It seems simple enough at least that it does not sound like a good idea to
me to complicate the API.

I'll also argue that throwing an exception is always inferior to the current
behavior, because if you re-query the missing names and get nothing
back again, you at least know which names are present and which are missing.
If you throw an exception and you simply get the exception twice, you
only know that "some" column(s) is(are) missing.
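
A minimal sketch of that re-query loop, for illustration only. It assumes a
hypothetical `fetch` callable that stands in for a get_slice call with a
column_names predicate: it takes a list of names and returns a dict holding
only the columns that exist. Neither `fetch` nor `get_named_columns` is part
of any Cassandra client API.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical stand-in for get_slice with a column_names predicate:
# given the requested names, it returns only the columns that exist.
FetchFn = Callable[[Iterable[str]], Dict[str, bytes]]


def get_named_columns(
    fetch: FetchFn, names: Iterable[str], max_attempts: int = 3
) -> Tuple[Dict[str, bytes], Set[str]]:
    """Return (found, missing), re-querying only the names still missing."""
    found: Dict[str, bytes] = {}
    missing = set(names)
    for _ in range(max_attempts):
        if not missing:
            break
        result = fetch(sorted(missing))
        found.update(result)
        # Drop every name that came back; what remains is still missing.
        missing.difference_update(result)
    return found, missing
```

After the last attempt the caller holds both the columns that were returned
and the exact set of names that never showed up, which is the information an
exception alone would not carry.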


On Thu, Jul 28, 2011 at 4:53 PM, David Allsopp <> wrote:
> I understand and agree for the case where the slice predicate is a range,
> but I'd expect the semantics to be different where the predicate is a list
> of column names (even if it's implemented using a range operation under the
> hood?)
> If I ask for columns "foo" and "bar", then usually I'm not trying to find
> out what's in a particular range - I actually want columns "foo" AND "bar",
> i.e. the semantics are basically those of a set of individual column get()
> calls.
> I could do these as individual get() calls, but want to minimise
> round-trips.
> I can of course check which columns were returned and try again or give up,
> but this pushes work to the clients; in the worst case this could transfer
> large amounts of unusable data back to the client, which then has to discard
> it all (and perhaps retry and discard all over again) due to the absence of
> one small column. It would save a lot of bandwidth to abandon the operation
> immediately at the server if a 'missing' column is detected there.
> Of course, in some use cases one might want to get whichever of the column
> names happen to exist ("foo" AND/OR "bar"), hence my suggestion that it
> should be possible to choose between these two semantics when using a
> column_names predicate (clearly, this doesn't make sense for a slice_range
> predicate).
> On 28 July 2011 13:45, Jonathan Ellis <> wrote:
>> No, the slice semantics are "give me whatever happens to exist between
>> start and end."  It's valid for the answer to be "nothing."
>> On Thu, Jul 28, 2011 at 6:55 AM, David Allsopp <> wrote:
>> > If I try to retrieve a column that is not present, using get(), then I'll
>> > get a NotFoundException.
>> >
>> > If (for efficiency's sake) I try to retrieve several named columns using
>> > get_slice, with a column_names predicate (i.e. a list of columns) then I
>> > won't get the exception if one of those columns is missing, I think?
>> >
>> > This seems inconsistent - would it make sense for get_slice to throw the
>> > exception too, or perhaps have an option to require all columns to be
>> > present?
>> >
>> >
>> > The reason this came up is that I write and read with CL.ONE, and retry at
>> > the client side in case of (very occasional) failures, with the aim of
>> > improving availability and performance by avoiding CL.QUORUM etc.
>> > This is easy in the get() case - I can just retry a few times if I get a
>> > NotFoundException. I normally only need to retry once, in less than 0.1% of
>> > cases.
>> >
>> > For the get_slice case I'd need to retrieve all the columns again (might be
>> > wasteful) or check which ones were returned and form a new request (seems
>> > overly complex) or give up using get_slice and just use individual get()
>> > calls (seems inefficient).
>> >
>> > See also
>> >
>> > Thanks,
>> >
>> > David.
>> >
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
