incubator-cassandra-user mailing list archives

From Vasileios Vlachos <vasileiosvlac...@gmail.com>
Subject Re: Multi-DC Environment Question
Date Tue, 03 Jun 2014 22:03:51 GMT
Thanks for your responses!

Matt, I ran a test with 4 nodes, 2 in each DC, and the answer appears to 
be yes. The tokens seem to be unique across the entire cluster, not just 
on a per-DC basis. I don't know if the number of nodes deployed is 
enough to be conclusive, but this is my conclusion for now. Please 
correct me if I'm wrong.

Rob, this is the plan of attack I have in mind now. That said, in the 
case of a catastrophic DC failure, the downtime is usually longer than 
the hint window. So the downtime is either less than the default value 
(when testing that the DR works, for example) or more (when actually 
using the DR as the primary DC). Based on that, the default seems 
reasonable to me.
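
To make the comparison concrete, here is a rough Python sketch of the rule discussed in this thread: if a DC was down for longer than max_hint_window_in_ms, hints alone cannot bring it up to date and an anti-entropy repair is needed. The helper name needs_repair and the example outage durations are made up for illustration; the 3-hour value is the current default mentioned in Rob's message quoted below.

```python
# Illustrative only: needs_repair is a hypothetical helper, not part of Cassandra.

# 3 hours in milliseconds, the default hint window cited in this thread.
DEFAULT_MAX_HINT_WINDOW_IN_MS = 3 * 60 * 60 * 1000


def needs_repair(down_time_ms, max_hint_window_in_ms=DEFAULT_MAX_HINT_WINDOW_IN_MS):
    """Return True if the outage outlasted the hint window.

    True means hints cannot cover the whole outage, so repair is required.
    False only means hints may still cover it; hints remain an optimization.
    """
    return down_time_ms > max_hint_window_in_ms


# A short DR test (30 minutes) stays inside the window.
short_dr_test_ms = 30 * 60 * 1000
print(needs_repair(short_dr_test_ms))

# A two-day outage, with the DR acting as primary DC, exceeds the window.
long_outage_ms = 48 * 60 * 60 * 1000
print(needs_repair(long_outage_ms))
```

Even when this returns False, it would still be safer to run repair if consistency matters, since hints give no guarantee.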

I also found that nodetool repair can be restricted to one DC by 
specifying the --in-local-dc option. So, presumably plain nodetool 
repair (without that option) applies to the entire cluster (sounds 
obvious, but is that actually correct?).
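
For reference, the two invocations I am comparing can be sketched as follows. This is only a sketch: repair_command is a hypothetical helper, my_keyspace is a placeholder keyspace name, and placing the option before the keyspace is my assumption. It only builds the argument list; it does not run nodetool, and it does not by itself confirm that plain repair is cluster-wide.

```python
# Illustrative only: repair_command is a hypothetical helper that builds
# the nodetool argument list without executing anything.


def repair_command(keyspace, local_dc_only=False):
    """Build the argv for a nodetool repair of the given keyspace."""
    argv = ["nodetool", "repair"]
    if local_dc_only:
        # Option named earlier in this message; restricts repair to the local DC.
        argv.append("--in-local-dc")
    argv.append(keyspace)
    return argv


# Plain repair (presumably covering replicas in every DC).
print(" ".join(repair_command("my_keyspace")))

# Repair limited to the local DC.
print(" ".join(repair_command("my_keyspace", local_dc_only=True)))
```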

Question 3 in my previous email still remains unanswered... I cannot 
find out whether only one hint is stored on the coordinator 
irrespective of the number of replicas that are down, and also whether 
the hint is 100% of the size of the original write request.

Thanks,

Vasilis

On 03/06/14 18:52, Robert Coli wrote:
> On Fri, May 30, 2014 at 4:08 AM, Vasileios Vlachos 
> <vasileiosvlachos@gmail.com> wrote:
>
>     Basically you sort of confirmed that if down_time >
>     max_hint_window_in_ms the only way to bring DC1 up-to-date is
>     anti-entropy repair.
>
>
>     Also, read repair does not help either as we assumed that
>     down_time > max_hint_window_in_ms. Please correct me if I am wrong.
>
>
> My understanding is that if you:
>
> 1) set read repair chance to 100%
> 2) read all keys in the keyspace with a client
>
> You would accomplish the same increase in consistency as you would by 
> running repair.
>
> In cases where this may matter, and your system can handle delivering 
> the hints, increasing the already-increased-from-old-default-of-1-hour 
> current default of 3 hours to 6 or more hours gives operators more 
> time to work in the case of partition or failure. Note that hints are 
> only an optimization; only repair (and read repair at 100%, I think...) 
> asserts any guarantee of consistency.
>
> =Rob
>

-- 
Kind Regards,

Vasileios Vlachos

