cassandra-commits mailing list archives

From "Peter Schuller (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3901) write endpoints are not treated correctly, breaking consistency guarantees
Date Sun, 26 Feb 2012 02:53:48 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216616#comment-13216616
] 

Peter Schuller commented on CASSANDRA-3901:
-------------------------------------------

When fixing this, we must consider saved endpoints. My current reading is that saved endpoints
only include fully joined nodes, so on startup we could be serving thrift
traffic before becoming aware of bootstrapping nodes that imply pending ranges. (This relates
to the general problem of holding off thrift traffic until we reach a point of believing
we understand the ring state, which we currently don't do, and which can result in e.g. massive numbers
of timeouts when going up on thrift before gossip has marked hosts up.) Even if we fix the
write path for this ticket, the fix won't be complete unless we also ensure the write
path sees a consistent ring state.
                
> write endpoints are not treated correctly, breaking consistency guarantees
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3901
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3901
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>            Priority: Critical
>
> I had a nagging feeling this was the case ever since I started wanting CASSANDRA-3833
and thinking about how to handle the association between nodes in the read set and nodes in
the write set.
> I may be wrong (please point me in the right direction if so), but I see no code anywhere
that tries to (1) apply consistency level to currently normal endpoints only, and (2) "connect"
a given read endpoint with a future write endpoint such that they are tied together for consistency
purposes (parts of these concerns are probably covered by CASSANDRA-2434, but that ticket is
more general).
> To be more clear about the problem: Suppose we have a ring of nodes, with a single node
bootstrapping. Now, for a given row key suppose reads are served by A, B and C while writes
are to go to A, B, C and D. In other words, D is the node bootstrapping. Suppose RF is 3 and
A,B,C,D is ring order. There are a few things required for correct behavior:
> * Writes acked by D must never be treated as sufficient to satisfy consistency level
since until it is part of the read set it does not count towards CL on reads.
> * Writes acked by B must *not* be treated as sufficient to satisfy consistency level
*unless* the same write is *also* acked by D, because once D enters the ring, B will no longer
be counting towards CL on reads. The only alternative is to make the read succeed and disallow
D from entering the ring.
> We don't seem to be handling this at all (and it becomes more complicated with arbitrary
transitions).
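
The two requirements in the quoted description can be sketched roughly as follows. This is only an
illustration of the counting rule for the A, B, C, D example, not Cassandra code: the function names
and the replaced_by mapping (a normal endpoint mapped to the pending endpoint that will displace it
from the read set) are hypothetical, and the sketch assumes a single bootstrapping node.

```python
def effective_acks(acked, read_set, replaced_by):
    """Count the acks that may be applied toward the consistency level.

    acked       -- set of endpoints that acknowledged the write
    read_set    -- set of currently normal endpoints serving reads
    replaced_by -- hypothetical mapping: normal endpoint -> pending endpoint
                   that will displace it from the read set once it joins
    """
    count = 0
    for node in read_set:
        if node not in acked:
            continue
        successor = replaced_by.get(node)
        # An ack from a node about to leave the read set only counts
        # if the node taking its place has acked the same write.
        if successor is not None and successor not in acked:
            continue
        count += 1
    # Acks from pending endpoints are never counted on their own,
    # because only members of read_set are iterated above.
    return count


def write_satisfies_cl(acked, read_set, replaced_by, required):
    """Return True if the acks received are enough for the required CL."""
    return effective_acks(acked, read_set, replaced_by) >= required


# The example from the description: RF 3, D bootstrapping, QUORUM = 2.
READ_SET = {"A", "B", "C"}
REPLACED_BY = {"B": "D"}
QUORUM = 2
```

With these assumed inputs, acks from A and D alone, or from A and B alone, would not satisfy QUORUM,
while acks from A and C, or from A, B and D, would.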

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
