incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: clarification of the consistency guarantees of Counters
Date Tue, 31 May 2011 08:21:47 GMT
Thanks Sylvain, I agree with the first few paragraphs of your reply;
Jeremy corrected me just now.

Regarding the last point, you are right to use the term "by operation",
but you should also note that the leader has "data ownership", meaning
that the leader has the authoritative say when it comes to reconciliation
of the bucket of the count that it owns. Yes, you've convinced me that we
DO need to use CL > ONE. For the sake of argument, though: if CL = ONE is
used and the leader loses its data, the other replicas are unable to
reconcile that bucket. That's what I meant.
Anyway, it's not relevant now, since CL can be > ONE.
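
The CL = ONE scenario above can be sketched with a toy model. This is only an illustration under stated assumptions, not Cassandra's actual code: the counter is modeled as a map from leader id to a (clock, count) pair, one bucket per leader, and reconciliation keeps the entry with the higher clock for each bucket. The names `increment`, `merge` and `value` are hypothetical.

```python
# Toy model (an assumption for illustration, not Cassandra's implementation):
# a counter state maps leader id -> (clock, count), one bucket per leader.

def increment(state, leader, delta):
    """Return a new state where `leader` has applied `delta` to its own bucket."""
    clock, count = state.get(leader, (0, 0))
    new_state = dict(state)
    new_state[leader] = (clock + 1, count + delta)
    return new_state

def merge(a, b):
    """Reconcile two replica states bucket by bucket: the higher clock wins."""
    merged = dict(a)
    for leader, (clock, count) in b.items():
        if leader not in merged or clock > merged[leader][0]:
            merged[leader] = (clock, count)
    return merged

def value(state):
    """Counter value: the sum of the counts over all buckets."""
    return sum(count for _clock, count in state.values())

# Leader A acknowledges +5 at CL = ONE, then loses its data before replicating.
leader_a = increment({}, "A", 5)
replica_b, replica_c = {}, {}
lost = value(merge(replica_b, replica_c))        # A's bucket is gone

# With CL > ONE, at least one other replica holds A's bucket before the ack.
replica_b_acked = dict(leader_a)
kept = value(merge(replica_b_acked, replica_c))  # A's bucket survives
```

In this model no amount of merging between the surviving replicas can rebuild a bucket that only its owner ever held, which is the data-ownership concern; once a second replica has the bucket, the same merge recovers it.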


But I'd really appreciate it if you could review my newer post on
FIFO; I think that could be an interesting approach.


yang


On Tue, May 31, 2011 at 12:59 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
>
> > apart from the questions, some thoughts on Counters:
> > the idea of distributed counters can be seen, in distributed algorithms
> > terms, as a state machine (see Fred Schneider '93), where ideally we send
> > the messages (delta increments) to each node, and the final state (sum of
> > deltas, or the counter value) is deduced independently at each node. in the
> > current implementation, it's really not a distributed state machine, since
> > state is deduced only at the leader, and what is replicated is just the
> > final state. in fact, the data from different leaders are orthogonal, and
> > within the data flow from one leader, it's really just a master-slave
> > system. then we realize that this system is prone to single master failure.
>
> Don't get fooled by the term 'leader': there is one leader *by
> operation*, not one global leader. Again, the leader of an operation
> is really just 'the first of the replicas we're replicating to'.
>
> It's no more a master-slave design than regular writes are just because
> they use a distinguished coordinator node for each operation. And it's
> not prone to single-node failure: if you do counter increments
> at CL.QUORUM against, say, a cluster with RF=3, then you will still be
> able to write and read even if one node is down, and which node exactly
> doesn't matter at all.
>
> --
> Sylvain
>
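
The RF=3 / CL.QUORUM argument above comes down to simple arithmetic, sketched here. Quorum is taken as floor(RF / 2) + 1; the node names and the helpers `quorum` and `can_serve` are hypothetical, and the model ignores everything except how many replicas are alive.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1

def can_serve(replicas, down, required):
    """True if enough replicas are still alive to satisfy `required`."""
    alive = [r for r in replicas if r not in down]
    return len(alive) >= required

replicas = ["n1", "n2", "n3"]      # RF = 3, hypothetical node names
needed = quorum(len(replicas))     # 2 of 3

# Any single node can be down, whichever one it is, and QUORUM still succeeds.
for failed in replicas:
    print(failed, "down ->", can_serve(replicas, {failed}, needed))
```

With two of the three replicas down, only one is alive and the same check fails, so the tolerance is exactly one node at RF=3.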
