cassandra-user mailing list archives

From <SEAN_R_DUR...@homedepot.com>
Subject RE: Cassandra Data Loss
Date Fri, 10 Apr 2015 17:53:50 GMT
Running repair across the ring within gc_grace_seconds (default 10 days) is considered normal,
required maintenance. If you run full repair (not repair -pr), that will make sure replicas
are on the proper nodes in addition to “owned” tokens. Don’t run all repairs at the
same time, especially if your cluster is already under considerable load.


Sean Durity
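
The advice above (repair every node within gc_grace_seconds, but not all nodes at once) could be scripted roughly as follows. This is only a sketch: the host addresses are placeholders, and `build_repair_commands` and `run_rolling_repair` are hypothetical helper names, not part of Cassandra. It assumes `nodetool` is on the PATH and can reach each node. On Cassandra 2.2 and later, plain `repair` defaults to incremental repair, so a full repair there needs the `-full` flag.

```python
import subprocess


def build_repair_commands(hosts, primary_range_only=False):
    """Return one nodetool command (as an argv list) per host, in order."""
    commands = []
    for host in hosts:
        argv = ["nodetool", "-h", host, "repair"]
        if primary_range_only:
            # -pr repairs only the token ranges this node is primary for.
            argv.append("-pr")
        commands.append(argv)
    return commands


def run_rolling_repair(hosts, primary_range_only=False, dry_run=True):
    """Repair one node at a time, so repairs never overlap."""
    for argv in build_repair_commands(hosts, primary_range_only):
        if dry_run:
            print("would run:", " ".join(argv))
            continue
        # Blocks until this node's repair finishes; raises on a non-zero exit.
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder addresses; replace with the real ring members.
    run_rolling_repair(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
```

Running the nodes strictly one after another is the most conservative choice for a cluster that is already under load; the whole pass still has to complete within gc_grace_seconds.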

From: Pranay Agarwal [mailto:agarwalpranaya@gmail.com]
Sent: Friday, April 10, 2015 1:17 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra Data Loss

Thanks Anuj and Jens.

So, my initial assumption is correct that Cassandra will *attempt* to replicate data irrespective
of the CL value. However, if those asynchronous replication calls fail, does Cassandra retry
them? Will a full cluster-wide node repair take care of this and guarantee
that *all* data is now replicated RF times?

We really don't care about reading stale data that much, but we really want the data to be
guaranteed to be replicated 3 times, so that we don't lose the data even if 2 nodes fail.
We are doing this heavy write/read as an initial import, and ideally I would like to keep CL 1 so
that the client is not blocked, but at the same time we want Cassandra to take care of
replication in the background, asynchronously.

On Fri, Apr 10, 2015 at 1:02 AM, Jens Rantil <jens.rantil@tink.se> wrote:
Somewhat related: http://wiki.apache.org/cassandra/ReadRepair states

Range scans are not per-key and do not do read repair.

Does "key" in "per-key" refer to "partition key" or "partition+clustering key"?

Cheers,
Jens

On Fri, Apr 10, 2015 at 3:47 AM, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:
Read repair, and repair run as part of regular maintenance, will make it consistent. Read repair is
usually done on only 10% of reads. You can tune the read_repair_chance property of the CF to
adjust that. Until a row is repaired, reads may return stale data if CL=1 is used for reads.
I would suggest that you minimize dropping of mutations by tuning, if that's the case, rather
than fixing it afterwards.

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android<https://overview.mail.yahoo.com/mobile/?.src=Android>

________________________________
From: "Kurtis vel" <kurtisvelarde@gmail.com>
Date: Fri, 10 Apr, 2015 at 7:09 am
Subject: Re: Cassandra Data Loss
Hi Anuj,

Assuming cl=1 and rf=3.

Will the data ever be consistent if an asynchronous replication call fails?

Is this where read repair comes in handy?
thanks

On Thu, Apr 9, 2015 at 6:24 PM, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:
CL=1 means that the client will only block for one response. In case of writes, the other 2 replicas
will be updated asynchronously and will eventually be consistent. As you are running a heavy load, make
sure that writes/mutations are not getting dropped, using nodetool tpstats on all nodes. Under
heavy load Cassandra may drop writes, and as these were asynchronous, the client won't know about
that.
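
One way to automate the tpstats check is to pull the dropped-message counts out of the command's text output. The sketch below is an assumption-heavy illustration: `dropped_counts` is a hypothetical helper, and `SAMPLE` is a hand-written stand-in that only approximates the layout of `nodetool tpstats` (a "Message type ... Dropped" header followed by one row per message type). The exact columns vary between Cassandra versions, so check it against real output before relying on it.

```python
def dropped_counts(tpstats_output):
    """Map message type -> dropped count from `nodetool tpstats` text."""
    counts = {}
    in_dropped_section = False
    for line in tpstats_output.splitlines():
        fields = line.split()
        # The dropped-message table starts at the "Message type" header row.
        if len(fields) >= 2 and fields[0] == "Message" and fields[1] == "type":
            in_dropped_section = True
            continue
        # Rows look like "MUTATION   42"; ignore anything without a count.
        if in_dropped_section and len(fields) >= 2 and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


# Hand-written sample input (NOT captured from a real cluster).
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         123456         0                 0

Message type           Dropped
READ                         0
MUTATION                    42
REQUEST_RESPONSE             0
"""

if __name__ == "__main__":
    counts = dropped_counts(SAMPLE)
    if counts.get("MUTATION", 0) > 0:
        print("dropped mutations:", counts["MUTATION"])
```

A non-zero MUTATION count on any node means some replica writes were shed under load, which is exactly the case where repair is needed to restore the full replica count.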

If CL=1 is used for both reads and writes, some reads may return stale data. If you need an absolute guarantee
that reads always return up-to-date data, go for strong consistency: read CL + write CL greater than
RF, e.g. read at quorum and write at quorum.
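
The rule above can be checked with a little arithmetic: a read is guaranteed to see the latest write when the number of replicas read plus the number of replicas written exceeds RF, because the two sets must then overlap in at least one replica. A minimal sketch, with hypothetical function names and quorum computed as floor(RF / 2) + 1:

```python
def quorum(rf):
    """Number of replicas in a quorum for replication factor rf."""
    return rf // 2 + 1


def is_strongly_consistent(read_replicas, write_replicas, rf):
    """True when every read set must overlap every write set."""
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    rf = 3
    combos = [
        ("ONE/ONE", 1, 1),
        ("QUORUM/QUORUM", quorum(rf), quorum(rf)),
        ("ONE read / ALL write", 1, rf),
    ]
    for name, r, w in combos:
        print(name, is_strongly_consistent(r, w, rf))
```

With RF=3, ONE/ONE gives 1 + 1 = 2, which is not greater than 3, so stale reads are possible; QUORUM/QUORUM gives 2 + 2 = 4, which is.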

Thanks
Anuj Wadehra


________________________________
From: "Pranay Agarwal" <agarwalpranaya@gmail.com>
Date: Fri, 10 Apr, 2015 at 6:40 am
Subject: Cassandra Data Loss
Hi All.


I am using a 20-node Cassandra cluster with RF=3 and CL=1. We are doing very write/read-heavy
operations (100k ops/sec in total).

I have been assuming all along that all the data will be replicated in 3 different places irrespective
of consistency level, as that is an application/driver-level config. Is that correct, or does Cassandra
guarantee 3 replicas only when I also set CL to 3?


Thanks
-Pranay







--
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


