cassandra-user mailing list archives

From "Thakrar, Jayesh" <jthak...@conversantmedia.com>
Subject Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?
Date Fri, 28 Apr 2017 05:06:10 GMT
Thanks for the explanation, Alain - very helpful!

From: Alain RODRIGUEZ <arodrime@gmail.com>
Date: Thursday, April 27, 2017 at 6:12 AM
To: <user@cassandra.apache.org>
Subject: Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

It happened to me in the future in a bad way, and nothing prevent it from happening in the
future

Obviously that should read "It happened to me in the past in a bad way"*. Thinking faster than I write... I am quite slow at writing :p.

To be clear I recommend:

  *   to run repairs within gc_grace_seconds when performing deletes (not TTLs; TTL-only data is fine)
  *   to run repairs 'regularly' even when not deleting data (frequency depending on data size and the consistency levels in use)
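The first recommendation can be sanity-checked with a bit of arithmetic: a full repair cycle has to complete on every node within gc_grace_seconds, or a purged tombstone may never reach a replica that missed it. A minimal sketch (the interval and duration figures are made-up examples, not Cassandra defaults apart from gc_grace_seconds):

```python
# Sketch: does a repair schedule keep every replica repaired within
# gc_grace_seconds? The interval/duration numbers are illustrative.

GC_GRACE_SECONDS = 10 * 24 * 3600   # Cassandra's default: 10 days
REPAIR_INTERVAL = 7 * 24 * 3600     # e.g. a weekly repair job
REPAIR_DURATION = 2 * 24 * 3600     # worst-case time for one full run

def repair_schedule_is_safe(interval, duration, gc_grace):
    """A delete is only safe if every replica has been repaired before
    its tombstone can be purged, i.e. interval + duration < gc_grace."""
    return interval + duration < gc_grace

print(repair_schedule_is_safe(REPAIR_INTERVAL, REPAIR_DURATION, GC_GRACE_SECONDS))
```

A weekly repair that takes up to two days fits comfortably inside the 10-day default; stretch the interval to 10 days and it no longer does.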
Hope that helps,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-27 13:07 GMT+02:00 Alain RODRIGUEZ <arodrime@gmail.com>:
Hi,

To put it simply, I have been taught that anything that can be disabled is an optimization, so we don't want to rely on an optimization that can silently fail. This goes for read repair as well, since we cannot be sure that all the data will be read. Plus, it is configured to trigger only 10% of the time by default, and it does not cross data centers.
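To see why a 10% trigger chance is not a guarantee, the odds that a given row never gets checked by read repair can be worked out directly. A toy sketch (the 10% default is the only real Cassandra number here):

```python
# Sketch: probability that a row is *never* read-repaired after n reads,
# given Cassandra's default 10% read repair chance.

def never_repaired(n_reads, chance=0.10):
    # Each read independently triggers read repair with probability `chance`.
    return (1 - chance) ** n_reads

# A row read 10 times still has roughly a 35% chance of never having
# triggered read repair, and a row that is never read has a 100% chance.
print(round(never_repaired(10), 2))
```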

(Anti-entropy) Repairs are known to be necessary to make sure data is correctly distributed to all the nodes that are supposed to have it.

As Cassandra is built for native fault tolerance (when correctly configured to do so), a node can miss some data, by design.

When the data that missed a node is a tombstone from a delete, it needs to be replicated to that node before all the other nodes purge it, which happens eventually after 'gc_grace_seconds' (detailed post about this: thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html). If the tombstone is removed from all the other nodes before being replicated to the node that missed it, that node will eventually replicate back the data that should have been deleted, i.e. the data overwritten by the tombstone. We call it a zombie.
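The zombie scenario can be sketched as a toy timeline. The Replica class and timings below are invented for illustration; this is not Cassandra internals:

```python
# Toy model of zombie data: a tombstone is purged on the healthy replicas
# before the replica that missed the delete ever learns about it.

GC_GRACE = 10  # "seconds" in this toy model; 10 days by default in Cassandra

class Replica:
    def __init__(self):
        self.value = "row-v1"
        self.tombstone_at = None

    def delete(self, now):
        self.value = None
        self.tombstone_at = now

    def compact(self, now):
        # Compaction purges a tombstone once it is older than gc_grace.
        if self.tombstone_at is not None and now - self.tombstone_at > GC_GRACE:
            self.tombstone_at = None

# Replica C was down and missed the delete at t=0.
a, b, c = Replica(), Replica(), Replica()
a.delete(0)
b.delete(0)

# Compaction after gc_grace purges the tombstones on A and B...
a.compact(20)
b.compact(20)

# ...so when C comes back, nothing records that the row was deleted: C
# still holds the old value and can "repair" it back onto A and B.
print(c.value)  # row-v1 — the zombie
```

Running repair on all replicas before t=10 (within gc_grace) would have copied the tombstone to C and avoided this.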

And hinted handoff can and will fail. It happened to me in the past in a bad way, and nothing prevents it from happening in the future, even though hints were greatly improved in 3.0+.

From the DataStax docs (https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesHintedHandoff.html):
"Hints are flushed to disk every 10 seconds, reducing the staleness of the hints."

Which means that, by design, a node going down can lose up to 10 seconds of hints stored for other nodes (some of which might be deletes).
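The flush interval quoted above, along with the hinted handoff window, lives in cassandra.yaml. A sketch of the relevant 3.x settings (values shown are the usual defaults; check your own file and version):

```yaml
# Hints are only kept for nodes that come back within this window
# (3 hours by default); a longer outage requires a repair anyway.
hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000

# Hints are buffered and flushed to disk on this period, which is why
# a crashing coordinator can lose up to ~10 seconds of hints.
hints_flush_period_in_ms: 10000
```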

The conclusion is often the same: if you are not running deletes, or zombie data is not an issue for you, it is quite safe not to run repair within 'gc_grace_seconds' (default 10 days). But repair is, as of now, the only way to ensure low entropy for regular data (not only tombstones) in a Cassandra cluster; all the other optimizations can and will fail at some point. It also provides better consistency when reading with a weak consistency level such as LOCAL_ONE: as repair reduces entropy, the chance of reading the same data everywhere increases.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-21 15:54 GMT+02:00 Thakrar, Jayesh <jthakrar@conversantmedia.com>:

Unfortunately, I don’t know much about the replication architecture.
The only things I know are that replication is set at the keyspace level (i.e. 1, 2, 3 or N replicas), and that the consistency level, set at the client application level, determines how many acknowledgements are necessary to deem a write successful.
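As a sketch of those two knobs (keyspace, table, and datacenter names below are made up for illustration):

```sql
-- Replication factor is a property of the keyspace:
CREATE KEYSPACE app
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

CREATE TABLE app.events (id uuid PRIMARY KEY, payload text);

-- Consistency is chosen per request by the client (here in cqlsh):
-- with RF=3, LOCAL_QUORUM needs 2 of the 3 replicas to acknowledge.
CONSISTENCY LOCAL_QUORUM;
INSERT INTO app.events (id, payload) VALUES (uuid(), 'example');
```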

And you might have noticed in the video that anti-entropy repair is to be done as "deemed" necessary, not blindly as a rule.
E.g. if your data is read-only (never mutated), there is no need for anti-entropy repair.

From: eugene miretsky <eugene.miretsky@gmail.com>
Date: Thursday, April 20, 2017 at 5:52 PM
To: Conversant <jthakrar@conversantmedia.com>
Cc: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

Thanks Jayesh,

Watched all of those.

Still not sure I fully get the theory behind it

Aside from the 2 failure cases I mentioned earlier, the only other way data can become inconsistent is an error when replicating the data in the background. Does Cassandra have a retry policy for internal replication? Is there a setting to change it?

On Thu, Apr 6, 2017 at 10:54 PM, Thakrar, Jayesh <jthakrar@conversantmedia.com> wrote:
I had asked a similar/related question - on how to carry out repair, etc and got some useful
pointers.
I would highly recommend the youtube video or the slideshare link below (both are for the
same presentation).

https://www.youtube.com/watch?v=1Sz_K8UID6E

http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016

https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/

https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html

https://www.datastax.com/dev/blog/repair-in-cassandra




From: eugene miretsky <eugene.miretsky@gmail.com>
Date: Thursday, April 6, 2017 at 3:35 PM
To: <user@cassandra.apache.org>
Subject: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

Hi,

As I see it, if hinted handoff is enabled, the only time data can be inconsistent is when:

  1.  A node is down for longer than the max_hint_window
  2.  The coordinator node crashes before all the hints have been replayed
Why is it still recommended to perform frequent automatic repairs, as well as to enable read
repair? Can't I just run a repair after one of the nodes has been down? The only problem I see with
this approach is a long repair job (instead of small incremental repairs). But other than
that, are there any other issues/corner-cases?

Cheers,
Eugene


