incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: Quorum: killing 1 out of 3 server kills the cluster (?)
Date Thu, 09 Dec 2010 16:55:20 GMT
> I naively assume that if I kill either node that holds N1 (i.e. node 1
> or 3), N1 will still remain on another node. Only if both fail, I
> actually lose data. But apparently this is not how it works...

Sure, the data that N1 holds is also on another node, and you won't
lose it by losing only N1.
But when you do a quorum query, you are telling Cassandra: "Please,
please fail my request if you can't get a response from 2 nodes."
So if only 1 node holding the data is up at the moment of the query,
then Cassandra, which is very polite software, does what you asked
and fails the request.
If you want Cassandra to send you an answer with only one node up, use
CL=ONE (as David said).
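
To make the arithmetic concrete, here is a minimal Python sketch of the rule discussed in this thread (quorum = RF/2 + 1, counted over replicas, not over nodes). The function names are hypothetical, and the placement model (each range lives on its owner plus the next RF-1 nodes of an evenly split ring) is a simplifying assumption for illustration, not Cassandra's actual code.

```python
def quorum(replication_factor):
    """Replicas that must answer a QUORUM request: RF // 2 + 1."""
    return replication_factor // 2 + 1


def required_replicas(consistency_level, replication_factor):
    """Number of live replicas a request needs at the given level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def replicas_for_range(owner, node_count, replication_factor):
    """Assumed placement: the owner plus the next RF-1 nodes on the ring."""
    return [(owner + i) % node_count for i in range(replication_factor)]


def unavailable_fraction(node_count, replication_factor,
                         consistency_level, down_nodes):
    """Fraction of (equal-sized, assumed) ranges that cannot be served."""
    needed = required_replicas(consistency_level, replication_factor)
    failing = 0
    for owner in range(node_count):
        replicas = replicas_for_range(owner, node_count, replication_factor)
        live = [node for node in replicas if node not in down_nodes]
        if len(live) < needed:
            failing += 1
    return failing / node_count


if __name__ == "__main__":
    # RF=2: quorum is 2, so QUORUM behaves like ALL.
    print(quorum(2), quorum(3))
    # 3 nodes, RF=2, one node down, QUORUM vs. ONE.
    print(unavailable_fraction(3, 2, "QUORUM", {0}))
    print(unavailable_fraction(3, 2, "ONE", {0}))
    # 200 nodes, RF=2, one node down: fewer failures, but not zero.
    print(unavailable_fraction(200, 2, "QUORUM", {0}))
```

Under these assumptions, losing one node at RF=2 makes every range replicated on that node unavailable at QUORUM, while the same outage at RF=3 leaves two live replicas per range, which still satisfies a quorum of 2.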

>
>> On Thu, Dec 9, 2010 at 6:05 PM, Sylvain Lebresne <sylvain@yakaz.com> wrote:
>> It's 2 out of the number of replicas, not the number of nodes. At RF=2, you have
>> 2 replicas. And since quorum is also 2 with that replication factor,
>> you cannot lose
>> a node, otherwise some queries will fail with an UnavailableException.
>>
>> Again, this is not related to the total number of nodes. Even with 200
>> nodes, if
>> you use RF=2, you will have some queries that fail (although far fewer than
>> you are probably seeing).
>>
>> On Thu, Dec 9, 2010 at 5:00 PM, Timo Nentwig <timo.nentwig@toptarif.de> wrote:
>> >
>> > On Dec 9, 2010, at 16:50, Daniel Lundin wrote:
>> >
>> >> Quorum is really only useful when RF > 2, since for a quorum to
>> >> succeed, RF/2+1 replicas must be available.
>> >
>> > 2/2+1==2 and I killed 1 of 3, so... don't get it.
>> >
>> >> This means for RF = 2, consistency levels QUORUM and ALL yield the same result.
>> >>
>> >> /d
>> >>
>> >> On Thu, Dec 9, 2010 at 4:40 PM, Timo Nentwig <timo.nentwig@toptarif.de> wrote:
>> >>> Hi!
>> >>>
>> >>> I've 3 servers running (0.7rc1) with a replication_factor of 2 and use
>> >>> quorum for writes. But when I shut down one of them, UnavailableExceptions
>> >>> are thrown. Why is that? Isn't that the sense of quorum and a fault-tolerant
>> >>> DB, that it continues with the remaining 2 nodes and redistributes the data
>> >>> to the broken one as soon as it's up again?
>> >>>
>> >>> What may I be doing wrong?
>> >>>
>> >>> thx
>> >>> tcn
>> >
>> >
>>
>
>
