incubator-cassandra-user mailing list archives

From Dan Hendry <dan.hendry.j...@gmail.com>
Subject Re: Propose new ConsistencyLevel.ALL_AVAIL for reads
Date Fri, 17 Jun 2011 03:36:49 GMT
"Help me out here.  I'm trying to visualize a situation where the clients
can access all the C* nodes but the nodes can't access each other.  I don't
see how that can happen on a regular ethernet subnet in one data center.
Well, I'm sure there is a case that you can point out.  Ok, I will concede
that this is an issue for some network configurations."

First rule of designing/developing/operating distributed systems: assume
anything and everything can and will happen, regardless of network
configuration or hardware.

This specific situation actually HAS happened to me. Our Cassandra nodes
accept client connections on one ethernet interface on one network (the
production network) yet communicate with each other on a separate ethernet
interface on a separate, Cassandra-specific network. This was done
mainly because inter-node Cassandra bandwidth requirements are relatively
large compared to client bandwidth requirements. At one point, the switch
for the Cassandra network went down, so clients could still connect yet
the Cassandra nodes could not talk to each other. (We write at ONE and
read at ALL, so everything behaved as expected.)
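
To make the failure mode concrete, here is a rough toy model of the fully
partitioned case from my earlier mail. It is only a sketch, not Cassandra
code: the Node class, the reachable set, and the "ALL_AVAIL" string are all
made up for illustration, and I am assuming the proposed level means "as many
replicas as the coordinator currently believes are up".

```python
class Unavailable(Exception):
    """Raised when too few replicas are reachable for the requested level."""


class Node:
    """Hypothetical replica: holds one value and its own view of its peers."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.reachable = set()  # peers this node can currently talk to


def visible_replicas(coordinator):
    # A coordinator always reaches itself, plus any peers it can still see.
    return [coordinator] + sorted(coordinator.reachable, key=lambda n: n.name)


def required(level, coordinator, rf):
    if level == "ONE":
        return 1
    if level == "ALL":
        return rf
    if level == "ALL_AVAIL":  # assumed semantics of the proposed level
        return len(visible_replicas(coordinator))
    raise ValueError(level)


def write(coordinator, value, level, rf=3):
    targets = visible_replicas(coordinator)
    if len(targets) < required(level, coordinator, rf):
        raise Unavailable(level)
    for node in targets:
        node.value = value


def read(coordinator, level, rf=3):
    targets = visible_replicas(coordinator)
    need = required(level, coordinator, rf)
    if len(targets) < need:
        raise Unavailable(level)
    return [node.value for node in targets[:need]]


def demo():
    # Full partition: every node is up, but no node can reach any other.
    a, b, c = Node("A"), Node("B"), Node("C")
    write(a, "from-client-1", "ALL_AVAIL")
    write(b, "from-client-2", "ALL_AVAIL")
    seen_via_a = read(a, "ALL_AVAIL")
    seen_via_b = read(b, "ALL_AVAIL")
    try:
        read(a, "ALL")
        read_all_refused = False
    except Unavailable:
        read_all_refused = True
    return seen_via_a, seen_via_b, read_all_refused
```

In this model both ALL_AVAIL writes succeed and each client reads back only
its own value, while a read at ALL is refused outright. That refusal is the
behaviour we saw when our switch died.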


On Thu, Jun 16, 2011 at 11:00 PM, AJ <aj@dude.podzone.net> wrote:

> On 6/16/2011 7:56 PM, Dan Hendry wrote:
>
>> How would your solution deal with complete network partitions? A node
>> being 'down' does not actually mean it is dead, just that it is unreachable
>> from whatever is making the decision to mark it 'down'.
>>
>> Following from Ryan's example, consider nodes A, B, and C but within a
>> fully partitioned network: all of the nodes are up but each thinks all the
>> others are down. Your ALL_AVAILABLE consistency level would boil down to
>> consistency level ONE for clients connecting to any of the nodes. If I
>> connect to A, it thinks it is the last one standing and translates
>> 'ALL_AVAILABLE' into 'ONE'. Based on your logic, two clients connecting to
>> two different nodes could each modify a value then read it, thinking that
>> it's 100% consistent yet it is actually *completely* inconsistent with the
>> value on other node(s).
>>
>
> Help me out here.  I'm trying to visualize a situation where the clients
> can access all the C* nodes but the nodes can't access each other.  I don't
> see how that can happen on a regular ethernet subnet in one data center.
> Well, I'm sure there is a case that you can point out.  Ok, I will concede
> that this is an issue for some network configurations.
>
>
>> I suggest you review the principles of the infamous CAP theorem. The
>> consistency levels, as they stand now, allow for an explicit trade off between
>> 'available and partition tolerant' (ONE read/write) OR 'consistent and
>> available' (QUORUM read/write). Your solution achieves only availability and
>> can guarantee neither consistency nor partition tolerance.
>>
>
> It looks like CAP may triumph again.  Thanks for the exercise Dan and Ryan.
>
