cassandra-commits mailing list archives

From "Sandeep Tata (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-225) Support mastered writes
Date Fri, 12 Jun 2009 21:35:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718984#action_12718984 ]

Sandeep Tata edited comment on CASSANDRA-225 at 6/12/09 2:33 PM:
-----------------------------------------------------------------

Okay, this is an ugly first cut, but I want to put it out there so you guys have a chance to
provide comments on the design as I hack up the rest of this feature.

Basic idea --- calls that go to the primary (currently defined as the first endpoint in the
list) are applied locally then asynchronously sent to the other replicas. If the node is not
the primary, it forwards the request to the primary and waits for a response before acking.
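The routing described above can be sketched roughly as follows. This is only an illustration of the idea, not code from 225.patch: the `Node` class, the `write` method, and the in-memory maps are hypothetical stand-ins, the "forward and wait" step is modeled as a direct blocking call, and returning the replication futures to the caller is a convenience for the demo only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class MasteredWriteSketch {
    /** Hypothetical stand-in for a storage node; not a Cassandra class. */
    static final class Node {
        final String name;
        final Map<String, String> store = new ConcurrentHashMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Handles a mastered write that arrived at {@code receiver}.
     * The first entry of {@code endpoints} is treated as the primary.
     * Returns the in-flight replication tasks so a demo or test can wait on them.
     */
    static List<CompletableFuture<Void>> write(
            Node receiver, List<Node> endpoints, String key, String value) {
        Node primary = endpoints.get(0);
        if (receiver != primary) {
            // Not the primary: forward the request and wait for the primary's
            // response before acking. A direct call models the blocking forward.
            return write(primary, endpoints, key, value);
        }
        // Primary: apply the write locally first...
        primary.store.put(key, value);
        // ...then send it to the other replicas asynchronously.
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (Node replica : endpoints.subList(1, endpoints.size())) {
            pending.add(CompletableFuture.runAsync(() -> replica.store.put(key, value)));
        }
        return pending;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        List<Node> endpoints = List.of(a, b, c); // a is the primary

        // The client happens to contact c, which is not the primary.
        List<CompletableFuture<Void>> pending = write(c, endpoints, "row1", "v1");
        System.out.println("primary has value at ack time: " + a.store.get("row1"));

        pending.forEach(CompletableFuture::join); // wait only so the demo can inspect replicas
        System.out.println("replica b: " + b.store.get("row1"));
        System.out.println("replica c: " + c.store.get("row1"));
    }
}
```

Note that the sketch shares the limitation called out in point 2 below: it assumes every node agrees on who the primary is and does nothing about failures.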

1. I didn't add a whole bunch of calls in the interface -- I stole the block=1 value to mean
mastered writes for now. This is not unreasonable since even non-blocking writes give you
read-your-writes semantics, so block=1 doesn't really mean much right now. Of course, this
is not clean, and I expect to change it once we're happy to expose this in the interface.
You need to turn on "MasteredUpdatesForBlockOne" in the conf file to use it.

2. This does not (yet) work in the presence of failures. It is possible that some failure
scenarios lead to a state where 2 nodes both think they're "masters". The easiest way to solve
this is with a safe leader-election algorithm built on ZooKeeper. That'll have to be in round
2 of the patch.

Of course, if you don't turn on MasteredUpdatesForBlockOne, you never touch this code path.



      was (Author: sandeep_tata):
    Okay, this is an ugly first cut, but I want to put it out there so you guys have a chance
to provide comments on the design as I hack up the rest of this feature.

Basic idea --- calls that go to the primary (currently defined as the first endpoint in the
list) are applied locally then asynchronously sent to the other replicas. If the node is not
the primary, it forwards the request to the primary and waits for a response before acking.

1. I didn't add a whole bunch of calls in the interface -- I stole the block=1 value to mean
mastered writes for now. This is not unreasonable since even non-blocking writes give you
read-your-writes semantics, so block=1 doesn't really mean much right now. Of course, this
is not clean, and I expect to change it once we're happy to expose this in the interface.
You need to turn on "UseMasteredWritesForBlockOne" in the conf file to use it.

2. This does not (yet) work in the presence of failures. It is possible that some failure
scenarios lead to a state where 2 nodes both think they're "masters". The easiest way to solve
this is with a safe leader-election algorithm built on ZooKeeper. That'll have to be in round
2 of the patch.

Of course, if you don't turn on UseMasteredWritesForBlockOne, you never touch this code path.


  
> Support mastered writes
> -----------------------
>
>                 Key: CASSANDRA-225
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-225
>             Project: Cassandra
>          Issue Type: New Feature
>    Affects Versions: 0.4
>         Environment: all
>            Reporter: Sandeep Tata
>            Assignee: Sandeep Tata
>             Fix For: 0.4
>
>         Attachments: 225.patch
>
>
> Writes to a row today can be run on any of the replicas that own the row. An additional
> set of APIs to perform "mastered" writes that funnel through a primary is important if
> applications have some operations that require higher consistency. Test-and-set is an
> example of one such operation that requires a higher consistency guarantee.
> To stay true to Cassandra's performance goals, this should be done in a way that does
> not compromise performance for apps that can deal with lower consistency and never use
> these APIs. That said, an app that mixes higher consistency calls with lower consistency
> calls should take care that the two kinds of calls don't operate on the same data.
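
To illustrate why test-and-set benefits from funneling through a single primary, here is a minimal sketch. It assumes the primary is the only node that evaluates the condition, so all conditional updates to a key are serialized in one place. The `TestAndSetSketch` class and its `testAndSet` method are hypothetical and are not part of the proposed API.

```java
import java.util.concurrent.ConcurrentHashMap;

public class TestAndSetSketch {
    // State held by the (single) primary for the keys it masters.
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    /**
     * Sets key to newValue only if its current value equals expected.
     * A null expected value means "only if the key is absent".
     * Returns true if the update was applied.
     */
    boolean testAndSet(String key, String expected, String newValue) {
        if (expected == null) {
            return rows.putIfAbsent(key, newValue) == null;
        }
        return rows.replace(key, expected, newValue);
    }

    String get(String key) {
        return rows.get(key);
    }

    public static void main(String[] args) {
        TestAndSetSketch primary = new TestAndSetSketch();
        // Two clients race to claim the same lock row; only one can win
        // because both requests are evaluated by the same primary.
        boolean first = primary.testAndSet("lock", null, "client-1");
        boolean second = primary.testAndSet("lock", null, "client-2");
        System.out.println("client-1 won: " + first + ", client-2 won: " + second);
        System.out.println("owner: " + primary.get("lock"));
    }
}
```

If the two requests were instead evaluated independently on different replicas, both could see the key as absent and both could succeed, which is the situation mastered writes are meant to avoid.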

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

