cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6870) Transform operation
Date Fri, 21 Mar 2014 09:42:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942917#comment-13942917
] 

Sylvain Lebresne commented on CASSANDRA-6870:
---------------------------------------------

bq. I was thinking that most applications that have a read-before-write use pattern could
be abstracted into a single RPC slice+function regardless of atomicity.

I'm not sure I understand the point. Are you arguing that we should support read-before-write
outside of Paxos? In that case, as I said, I'm not sure, and I think this deserves its own
conversation (I'd rather avoid mixing too many conversations on a single ticket). Or are
you suggesting that users would use functions-through-Paxos for their read-before-write needs
even when they don't really care about serializability? In that latter case, not only does
my first "I'm not sure" point still apply, but on top of that it would be rather inefficient
to use the Paxos path when it's not strictly needed.

bq. I believe CQL lists already support some read-before-write operations.

That's true, but frankly that was kind of a mistake in hindsight and we've talked about removing
those more than once. So far we haven't done it for backward compatibility's sake, but we've
been clear in the doc that sets should be preferred to lists as much as possible (and maybe
the doc isn't clear enough; I'm all for improving the warnings). But truly, having made the
mistake once is not a good argument for generalizing that mistake, quite the contrary in fact.
Again, I'm not entirely closed to having a discussion about "do we want to allow simple
read-before-write operations server side", but I think it's definitely something that requires
careful consideration on its own. My gut feeling is that it would be a mistake because
it would make it way too easy to do the wrong thing, and that letting users do their read-before-write
client side is much less surprising and makes people think harder about how to avoid it in
the first place, which is the way to do it when possible with Cassandra. To put it another
way, I really don't think we should mix a discussion of read-before-write in general with
Paxos, which does incur a read before write, but does so to offer much stronger guarantees
at a very high performance cost.
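
To make the "stronger guarantees" part concrete, here is a minimal sketch of the difference
between a plain read-before-write and a conditional (compare-and-set) write. It is a toy
in-memory model, not Cassandra code: ToyCell and its methods are hypothetical names, and the
interleaving of the two clients is written out by hand.

```python
# Toy in-memory model; ToyCell and its methods are hypothetical, not a Cassandra API.
class ToyCell:
    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def write(self, new_value):
        # Unconditional write: whoever writes last wins.
        self.value = new_value

    def compare_and_set(self, expected, new_value):
        # Conditional write: applied only if nobody changed the value since it was read.
        if self.value != expected:
            return False
        self.value = new_value
        return True


# Plain read-before-write from two clients, interleaved by hand.
cell = ToyCell(["a"])
seen_by_1 = cell.read()
seen_by_2 = cell.read()
cell.write(seen_by_1 + ["b"])
cell.write(seen_by_2 + ["c"])  # silently discards client 1's append
plain_result = cell.read()

# Same interleaving with conditional writes.
cell = ToyCell(["a"])
seen_by_1 = cell.read()
seen_by_2 = cell.read()
first_applied = cell.compare_and_set(seen_by_1, seen_by_1 + ["b"])
second_applied = cell.compare_and_set(seen_by_2, seen_by_2 + ["c"])  # rejected: client 2 must re-read
cas_result = cell.read()
```

In the plain case one update is lost without anyone noticing; in the conditional case the
second client is told its write was rejected. The sketch says nothing about cost, which in
the real system is where the Paxos path is expensive.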

bq. You could apply a idempotent function to a list or set like remove_all_odd_numbers()

Note that I didn't say idempotent functions don't exist. I'm saying 1) you don't want
to use the Paxos path unless you really care about its serializability guarantees, because
it's really very inefficient if you don't (more so than doing a read-before-write client side),
and 2) when you do use Paxos, you have the problem that you need to be able to handle timeouts,
which basically means you need to be able to validate manually whether your operation happened
or not. That, imo, limits quite a bit the interesting things you can do with functions in the
Paxos path. Increment doesn't work, and even for remove_all_odd_numbers, given the timeout problem,
I don't see cases where running it through Paxos would give you any benefit (over doing a
non-Paxos client-side read-before-write; again, let's agree that moving simple/non-Paxos read-before-write
server side is a different conversation).
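
The timeout problem can be sketched with a toy model. Everything below is hypothetical
(ToyStore, AckLost, retry_on_timeout are illustration names, not Cassandra APIs), and it
models only one of the two possible outcomes of a timeout: the operation was applied but the
client never heard back. In reality the client cannot tell which outcome happened.

```python
# Toy model of an ambiguous timeout; all names are hypothetical, not a Cassandra API.
class AckLost(Exception):
    """Stands in for a timeout as seen by the client."""


class ToyStore:
    def __init__(self, value):
        self.value = value

    def transform(self, fn, lose_ack=False):
        # The function is applied server side...
        self.value = fn(self.value)
        # ...but the client may never learn that it was.
        if lose_ack:
            raise AckLost()
        return self.value


def increment(n):
    return n + 1


def remove_all_odd_numbers(xs):
    return [x for x in xs if x % 2 == 0]


def retry_on_timeout(store, fn):
    # A client that blindly retries after a timeout.
    try:
        return store.transform(fn, lose_ack=True)
    except AckLost:
        return store.transform(fn)


counter = ToyStore(10)
retry_on_timeout(counter, increment)  # applied twice: counter is now off by one

numbers = ToyStore([1, 2, 3, 4])
retry_on_timeout(numbers, remove_all_odd_numbers)  # applied twice, same result as once
```

Retrying is harmless for the idempotent function but over-counts for increment, and with
concurrent writers a later read of the counter does not reveal whether the first attempt was
applied, so not retrying is no safer.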

Anyway, what I'm saying is that I'm not opposed to channeling function calls through Paxos in
principle, and it's definitely technically possible, but as with every feature, we should
avoid doing it just because we can, as this leads to feature creep. So I think it would be
beneficial to first come up with a few concrete examples of when it could be used, examples
that make it clear what we win by adding this. Incrementation is imo not one such example,
because you can't handle timeouts and thus can't have exact counters.

The other point is: once we do have those examples and are convinced this is a net win,
what do we do about the fact that users will obviously be tempted to do things like increment
(and many other things) that won't work as well as they first appear to? Do we consider that it's
just a documentation issue? That worries me a bit tbh; I'm afraid this would be misused more
often than not.



> Transform operation
> -------------------
>
>                 Key: CASSANDRA-6870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6870
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>            Priority: Minor
>
> Compare and swap uses Paxos to update a value only if some criteria are met. If I
understand correctly we should be able to use this feature to provide a wider variety of server
side operations.
> For example, inside a Paxos transaction, performing a slice and then using a function to
manipulate the slice. You could accomplish features like append and increment this way without
the user needing to know the current value.
> I took a stab at doing this. I **think** I did it correctly. Comments welcome.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
