cassandra-user mailing list archives

From Riyad Kalla <rka...@gmail.com>
Subject Re: increased RF and repair, not working?
Date Fri, 27 Jul 2012 20:29:16 GMT
Dave,

What I was suggesting for Yan was to:

WRITE: RF=2, CL=QUORUM
READ: CL=ONE
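
In pycassa that split would look something like this (a minimal sketch;
the column family name and server address are placeholders):

    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily
    from pycassa.cassandra.ttypes import ConsistencyLevel

    pool = ConnectionPool('comments', server_list=['192.168.1.50:9160'])

    # Writes need QUORUM acks (with RF=2 that means both replicas);
    # reads can be answered by any single live replica.
    cf = ColumnFamily(pool, 'comments',
                      write_consistency_level=ConsistencyLevel.QUORUM,
                      read_consistency_level=ConsistencyLevel.ONE)

    cf.insert('some-key', {'col': 'value'})
    print cf.get('some-key')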

But you have a good point... if he hits one of the replicas that didn't have
the data, that would be bad.

Thanks for clearing that up.

On Fri, Jul 27, 2012 at 11:43 AM, Dave Brosius <dbrosius@mebigfatguy.com> wrote:

> You have RF=2, CL=QUORUM, but 3 nodes.
>
> So each row is represented on 2 of the 3 nodes.
>
> If you take a node down, one of two things can happen when you attempt to
> read a row.
>
> The row lives on the two nodes that are still up. In this case you will
> successfully read the data.
>
> The row lives on one node that is up, and one node that is down. In this
> case the read will fail because you haven't fulfilled the quorum (2 nodes
> in agreement) requirement.
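>
> To make the odds concrete, a rough sketch (node names are made up):
>
>     from itertools import combinations
>
>     nodes  = ['A', 'B', 'C']    # the 3-node cluster
>     down   = set(['C'])         # one node taken down
>     quorum = 2 / 2 + 1          # (RF / 2) + 1 with RF=2 -> 2
>
>     # With SimpleStrategy and RF=2, each row's replica set is one of
>     # the 3 possible pairs, and each pair covers 1/3 of the ring.
>     for replicas in combinations(nodes, 2):
>         live = len(set(replicas) - down)
>         print replicas, 'ok' if live >= quorum else 'unavailable'
>
>     # ('A', 'B') ok
>     # ('A', 'C') unavailable
>     # ('B', 'C') unavailable
>
> So with one node of three down, roughly 2/3 of the rows can't satisfy
> QUORUM at RF=2.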
>
>
> ----- Original Message -----
> From: "Riyad Kalla" <rkalla@gmail.com>
> Sent: Fri, July 27, 2012 8:08
> Subject: Re: increased RF and repair, not working?
>
> Dave, per my understanding of Yan's description he has 3 nodes and took
> one down manually to test; that should have worked, no?
>
> On Thu, Jul 26, 2012 at 11:00 PM, Dave Brosius <dbrosius@mebigfatguy.com> wrote:
>
>> Quorum is defined as
>>
>> (replication_factor / 2) + 1
>> therefore quorum when RF=2 is 2! So in your case, both nodes must be up.
>> Really, using QUORUM only starts making sense as a 'quorum' when RF=3.
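>>
>> Worked out with integer division:
>>
>>   RF=1: 1/2 + 1 = 1  (the single replica must be up)
>>   RF=2: 2/2 + 1 = 2  (both replicas must be up)
>>   RF=3: 3/2 + 1 = 2  (one replica can be down)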
>>
>> On 07/26/2012 10:38 PM, Yan Chunlu wrote:
>>
>> I am using Cassandra 1.0.2 and have a 3-node cluster. The consistency
>> level for both reads and writes is QUORUM.
>>
>> At first RF=1, and I figured that one node going down would make the
>> cluster unusable, so I changed RF to 2 and ran nodetool repair on every
>> node (actually I did it twice).
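>>
>> The commands were along these lines (the exact cli syntax for
>> strategy_options varies a bit between versions):
>>
>>   update keyspace comments with strategy_options = {replication_factor:2};
>>
>> and then, on each node in turn:
>>
>>   nodetool -h localhost repair comments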
>>
>> After that operation I think my data should be on at least two nodes,
>> and it should be okay if one of them is down.
>>
>> But when I tried to simulate a failure by running disablegossip on one
>> node, the cluster saw that node as down; then, when I accessed data
>> through the cluster, pycassa returned a MaximumRetryException. In my
>> experience this is caused by an UnavailableException, which means the
>> data being requested is on a node that is down.
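>>
>> For reference, the simulation itself was just along the lines of:
>>
>>   nodetool -h 192.168.1.40 disablegossip
>>
>> after which the rest of the ring marks that node as down.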
>>
>> So I suspect my data might not be replicated correctly. What should I
>> do? Thanks for the help!
>>
>> here is the keyspace info:
>>
>> Keyspace: comments:
>>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>   Durable Writes: true
>>     Options: [replication_factor:2]
>>
>> the schema version is okay:
>>
>> [default@unknown] describe cluster;
>> Cluster Information:
>>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>>    Schema versions:
>>      f67d0d50-b923-11e1-0000-4f7cf9240aef: [192.168.1.129, 192.168.1.40, 192.168.1.50]
>>
>> the loads are as follows:
>>
>> nodetool -h localhost ring
>> Address         DC          Rack   Status State   Load      Owns    Token
>>                                                                     113427455640312821154458202477256070484
>> 192.168.1.50    datacenter1 rack1  Up     Normal  28.77 GB  33.33%  0
>> 192.168.1.40    datacenter1 rack1  Up     Normal  26.67 GB  33.33%  56713727820156410577229101238628035242
>> 192.168.1.129   datacenter1 rack1  Up     Normal  33.25 GB  33.33%  113427455640312821154458202477256070484
>>
>>
>
