cassandra-commits mailing list archives

From "Blake Eggleston (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6246) EPaxos
Date Sun, 28 Sep 2014 23:15:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151256#comment-14151256 ]

Blake Eggleston edited comment on CASSANDRA-6246 at 9/28/14 11:15 PM:
----------------------------------------------------------------------

bq. In the current implementation, we only keep the last commit per CQL partition. We can
do the same for this as well.

Yeah, I've been thinking about that some more. Just because we could keep a bunch of historical
data doesn't mean we should. There may be situations where we need to keep more than one instance
around though, specifically when the instance is part of a strongly connected component. Keeping
some historical data would be useful for helping nodes recover from short failures where they
miss several instances, but past a certain point, transmitting all the activity for the last hour
or two would just be nuts. The other issue with relying on historical data for failure recovery
is that you can't keep all of it, so the oldest instances you do keep would have dangling
pointers to dependencies that have already been discarded.
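
To make the strongly connected component case concrete, here is a minimal Python sketch (not
code from this ticket). It assumes the dependency bookkeeping can be viewed as a map from an
instance id to the ids it depends on; the names strongly_connected_components and
instances_to_retain are hypothetical. It uses Tarjan's algorithm to find the component that
contains the latest instance, which is the set that would have to be kept together.

```python
from typing import Dict, List, Set


def strongly_connected_components(deps: Dict[str, Set[str]]) -> List[Set[str]]:
    """Tarjan's algorithm over a hypothetical instance -> dependencies map."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    components: List[Set[str]] = []

    def visit(node: str) -> None:
        # Number nodes in discovery order.
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in deps.get(node, ()):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        # node is the root of a component: pop its members off the stack.
        if lowlink[node] == index[node]:
            component: Set[str] = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in sorted(deps):
        if node not in index:
            visit(node)
    return components


def instances_to_retain(deps: Dict[str, Set[str]], latest: str) -> Set[str]:
    """Assumed policy: keep the latest instance plus everything in its component."""
    for component in strongly_connected_components(deps):
        if latest in component:
            return component
    return {latest}
```

With deps = {"a": {"b"}, "b": {"c"}, "c": {"b"}, "d": {"c"}}, instances "b" and "c" depend on
each other, so retaining "c" as the latest instance means retaining "b" as well, while "d"
can be kept on its own.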


For longer network partitions, and for nodes joining the ring, if we transmitted our current
dependency bookkeeping for the token ranges they're replicating, the corresponding instances,
and the current values for those instances, that should be enough to get going.

bq. I am also reading about epaxos recently and want to know when do you do the condition
check in your implementation?

It would have to be when the instance is executed.
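
As a rough illustration of why the check belongs at execution time, here is a small Python
sketch. It is an assumption-laden toy, not the implementation discussed in this ticket:
CasInstance and execute_in_order are hypothetical names, and the list passed in is assumed
to already be in the agreed execution order. Each instance carries its condition with it,
and the condition is evaluated against the state as it stands when that instance runs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CasInstance:
    """Hypothetical committed instance: a compare-and-set on one key."""
    key: str
    expected: Optional[str]  # None means "key must not exist"
    new_value: str
    applied: Optional[bool] = None  # unknown until the instance is executed


def execute_in_order(instances: List[CasInstance], state: Dict[str, str]) -> None:
    """Run instances in the given (assumed already agreed) execution order."""
    for instance in instances:
        # The condition is checked here, against the state left by every
        # instance that executed before this one, not at proposal time.
        current = state.get(instance.key)
        instance.applied = current == instance.expected
        if instance.applied:
            state[instance.key] = instance.new_value
```

If two instances both propose "set k only if it does not exist", both conditions would look
true at proposal time. Checked at execution, the first one applies and the second one sees
the first one's write and does not.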


was (Author: bdeggleston):
bq. In the current implementation, we only keep the last commit per CQL partition. We can
do the same for this as well.

Yeah I've been thinking about that some more. Just because we could keep a bunch of historical
data doesn't mean we should. There may be situations where we need to keep more than one instance
around though, specifically when the instance is part of a strongly connected component. Keeping
some historical data would be useful for helping instances recover from short failures where
they miss several instances, but after a point, transmitting all the activity for the last
hour or two would just be nuts. The other issue with relying on historical data for failure
recovery is that you can't keep all of it, so you'd have dangling pointers on the older instances.


For longer partitions, and nodes joining the ring, if we transmitted our current dependency
bookkeeping for the token ranges they're replicating, the corresponding instances, and the
current values for those instances, that should be enough to get going.

bq. I am also reading about epaxos recently and want to know when do you do the condition
check in your implementation?

It would have to be when the instance is executed.

> EPaxos
> ------
>
>                 Key: CASSANDRA-6246
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Blake Eggleston
>            Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is that Multi-paxos
requires leader election and hence, a period of unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-paxos, (2) is particularly
useful across multiple datacenters, and (3) allows any node to act as coordinator: http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to implement
it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
