incubator-cassandra-user mailing list archives

From Baskar Duraikannu <baskar.duraika...@outlook.com>
Subject Re: Read repair
Date Fri, 01 Nov 2013 00:43:49 GMT
Yes, it helps. Thanks

--- Original Message ---

From: "Aaron Morton" <aaron@thelastpickle.com>
Sent: October 31, 2013 3:51 AM
To: "Cassandra User" <user@cassandra.apache.org>
Subject: Re: Read repair

(assuming RF 3 and NTS is putting a replica in each rack)

> Rack1 goes down and some writes happen in quorum against rack 2 and 3.
During this period (period 1), writes will be committed to a node in both rack 2 and rack 3. Hints will
be stored on a node in either rack 2 or 3.

> After a couple of hours rack1 comes back and rack2 goes down.
During this period, writes from period 1 are guaranteed to be on rack 3.

Reads at QUORUM must use a node from rack 1 and rack 3. As such, the read will include the
node in rack 3 that stored the write during period 1.

> Now for rows inserted for about 1 hour and 30 mins, there is no quorum until failed rack
comes back up.
In your example there is always a QUORUM, as we always have 2 of the 3 racks and therefore 2 of the
3 replicas for each row.

For the consistency-level (CL) guarantee to hold, we only need one of the nodes that completed the write
to be involved in the read.
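The replica arithmetic above can be sketched as follows. This is a minimal illustration only, assuming RF 3 with NetworkTopologyStrategy placing exactly one replica per rack; the function and rack names are made up for the example and are not Cassandra APIs:

```python
# Sketch of quorum availability with one replica per rack (assumed layout).
RF = 3
QUORUM = RF // 2 + 1  # 2 of 3 replicas must respond

def quorum_available(racks_up):
    """With one replica per rack, available replicas == racks that are up."""
    return len(racks_up) >= QUORUM

# Period 1: rack1 down -> writes commit on rack2 and rack3 (a quorum).
assert quorum_available({"rack2", "rack3"})

# Period 2: rack2 down, rack1 back -> QUORUM reads must touch rack1 and
# rack3; rack3 holds the period-1 write, so the read sees it.
assert quorum_available({"rack1", "rack3"})

# Two racks down -> only 1 of 3 replicas left, no quorum.
assert not quorum_available({"rack3"})
```

The point of the sketch is that any two racks always hold 2 of the 3 replicas, so losing one rack at a time never breaks QUORUM.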

Hope that helps.

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/10/2013, at 12:32 am, Baskar Duraikannu <baskar.duraikannu@outlook.com> wrote:

> Aaron
>
> Rack1 goes down and some writes happen in quorum against rack 2 and 3. Hinted handoff
is set to 30 mins. After a couple of hours rack1 comes back and rack2 goes down. Hinted handoff
will play back, but will not cover all of the writes because of the 30 min setting. Now for rows inserted
for about 1 hour and 30 mins, there is no quorum until the failed rack comes back up.
>
> Hope this explains the scenario.
> From: Aaron Morton
> Sent: 10/28/2013 2:42 AM
> To: Cassandra User
> Subject: Re: Read repair
>
>> As soon as it came back up, due to some human error, rack1 goes down. Now for some
rows it is possible that Quorum cannot be established.
> Not sure I follow here.
>
> If the first rack has come back up, I assume all nodes are available; if you then lose a different
rack, you have 2/3 of the nodes available and would be able to achieve a QUORUM.
>
>> Just to minimize the issues, we are thinking of running read repair manually every
night.
> If you are reading and writing at QUORUM and the cluster does not have a QUORUM of nodes
available, writes will not be processed. During reads, any mismatch between the data returned
from the nodes will be detected and resolved before returning to the client.
>
> Read Repair is an automatic process that reads from more nodes than necessary and resolves
the differences in the background.
>
> I would run nodetool repair / Anti-Entropy as normal, once on every machine every gc_grace_seconds.
If you have a whole rack fail, run repair on the nodes in that rack if you want to get it
back to consistency quickly. The need to do that depends on the config for Hinted Handoff,
read_repair_chance, consistency level, the write load, and (to some degree) the number of
nodes. If you want to be extra safe, just run it.
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 26/10/2013, at 2:54 pm, Baskar Duraikannu <baskar.duraikannu@outlook.com> wrote:
>
>> We are thinking through the deployment architecture for our Cassandra cluster.  Let
us say that we choose to deploy data across three racks.
>>
>> Let us say that one rack's power went down for 10 mins and then it came back. As
soon as it came back up, due to some human error, rack1 goes down. Now for some rows it is
possible that Quorum cannot be established. Just to minimize the issues, we are thinking of
running read repair manually every night.
>>
>> Is this a good idea? How often do you perform read repair on your cluster?
>>
>

