cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yang <teddyyyy...@gmail.com>
Subject Re: clarification of the consistency guarantees of Counters
Date Tue, 31 May 2011 03:16:45 GMT
thanks, got it

I looked at the code more closely, the response handler between the
coordinator and itself as leader,  and between leader and replicas, are
shared,
so the coordinator can indeed wait for the count replication to finish for
ALL

yang
On Mon, May 30, 2011 at 6:51 PM, Jeremy Hanna <jeremy.hanna1234@gmail.com>wrote:

> Some more recent documentation can be found here:
> http://wiki.apache.org/cassandra/Counters but even that may be out of
> date.  One thing that has been added is multiple consistency levels are
> supported.  There are a lot of other tickets that have been completed post
> 1072.  Search for "cassandra counter" in Jira and you'll likely find a lot
> of them.
>
> Others may have more info than this, but just wanted to get you started.
>
> Jeremy
>
> On May 30, 2011, at 7:57 PM, Yang wrote:
>
> > I went through https://issues.apache.org/jira/browse/CASSANDRA-1072
> >
> > and realize that the consistency guarantees of Counters are a bit
> different from those of regular columns, so could you please confirm
> > that the following are true?
> >
> > 1) comment
> https://issues.apache.org/jira/browse/CASSANDRA-1072?focusedCommentId=12900659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12900659
> > still holds : "there is no way to create a write CL greater than ONE, and
> thus, no defense against permanent failures of single machines"
> > 2) due to the above, the best I can achieve to increase reliability is to
> enable REPLICATE_ON_WRITE, but this  would still expose the recent updates
> on the leader to being lost during a short interval
> > 3) without REPLICATE_ON_WRITE (or equivalently, read repair ) I would
> have to do CL=ALL on read. then in this case, if the leader fails, all
> future reads fail. so for counters I have to enable
> > REPLICATE_ON_WRITE or set read_repair chance to a reasonably high value,
> and do read CL!= ALL.
> >
> >
> >
> >
> >
> >
> > apart from the questions, some thoughts on Counters:
> > the idea of distributed counters can be seen, in distributed algorithms
> terms, as a state machine (see Fred Schneider 93'),  where ideally we send
> the messages (delta increments) to each node, and the final state (sum of
> deltas, or the counter value) is deduced independently at each node.  in the
> current implementation, it's really not a distributed state machine, since
> state is deduced only at the leader, and what is replicated is just the
> final state. in fact, the data from different leaders are orthogonal, and
> within the data flow from one leader, it's really just a master-slave
> system. then we realize that this system is prone to single master failure.
> >
> > if we want to build a truely distributed state machine, I am afraid there
> are no easier/faster solutions than existing ones (Paxos, etc). But I guess
> that a possible solution could lay in the fact that our goal allows for a
> relaxation than traditional state machine: Eventually consistent, and also
> that our operations are commutative ( re-ordering 2 adds yields the same
> state , when we apply the state changes ). how we take advantage of these
> facts could probably enable us to come to a truely distributed counters
> solution.
> >
> > the route of keeping all individual updates at each node has been
> mentioned in the JIRA, and later do reconciliation on the history. because
> messages losses are less common than success, maybe this is not as bad a
> route as we thought??
> >
> >
> > Thanks
> > Yang
> >
>
>

Mime
View raw message