cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Multi DC informations (sync)
Date Fri, 19 Dec 2014 16:30:27 GMT
All that you said matches the idea I had of how it works, except this part:

"The request blocks however until all CL is satisfied" --> Does this mean
that the client will see an error if the local DC writes the data correctly
(i.e. CL reached) but the remote DC fails? That is not the idea I had of
something asynchronous...
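In Cassandra terms this comes down to the consistency level: with LOCAL_QUORUM the coordinator only waits for replicas in its own DC, while EACH_QUORUM (or ALL) makes the remote DC part of the acknowledgement. A minimal sketch of that rule, assuming RF=3 in each of two DCs (plain Python to illustrate the semantics, not driver code):

```python
# Sketch: does the coordinator ack a write, given how many replicas in
# the local and remote DC acknowledged it? Assumes one remote DC, RF=3.
def write_acked(cl, acks_local, acks_remote, rf=3):
    quorum = rf // 2 + 1  # 2 of 3
    if cl == "LOCAL_QUORUM":
        # The remote DC never blocks the client here; replication to it
        # proceeds in the background (with hints on failure).
        return acks_local >= quorum
    if cl == "EACH_QUORUM":
        # A remote-DC failure *is* surfaced to the client.
        return acks_local >= quorum and acks_remote >= quorum
    if cl == "ALL":
        return acks_local == rf and acks_remote == rf
    raise ValueError(cl)

# Remote DC entirely unreachable (0 acks):
print(write_acked("LOCAL_QUORUM", acks_local=3, acks_remote=0))  # True
print(write_acked("EACH_QUORUM", acks_local=3, acks_remote=0))   # False
```

So whether the write is "really asynchronous" toward the remote DC is a choice made per request, not a fixed property of the cluster.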

If it doesn't fail on the client side (truly asynchronous), is there a way
to make sure the remote DC has indeed received the information? I mean, if
the cross-region throughput is too small, the write will fail, and so,
potentially, will the hinted handoff. How can we detect that we are lacking
cross-DC throughput, for example?
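One rough way to frame the throughput question: only one copy of each mutation crosses the WAN per remote DC (the remote coordinator fans the write out locally), so the link must sustain roughly writes/s × mutation size × number of remote DCs. A back-of-envelope sketch with made-up numbers (the rates and sizes below are illustrative assumptions, not measurements):

```python
# Back-of-envelope check: can the inter-DC link keep up with the
# replicated write volume? >1 means the link has headroom.
def link_headroom(writes_per_sec, avg_mutation_bytes, remote_dcs,
                  link_bytes_per_sec):
    # One forwarded copy per remote DC crosses the WAN; local fan-out
    # happens on the remote side and does not consume WAN bandwidth.
    needed = writes_per_sec * avg_mutation_bytes * remote_dcs
    return link_bytes_per_sec / needed

# e.g. 5000 writes/s of ~2 KB each, one remote DC, 100 Mbit/s link:
print(link_headroom(5000, 2048, 1, 100e6 / 8))  # ~1.22: tight
```

If the ratio hovers near or below 1, hints will pile up on the coordinators; watching the HintedHandoff stage in `nodetool tpstats` over time is one practical way to catch that.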

Repairs are indeed a good thing (we run them as a weekly routine, GC grace
period 10 sec), but having inconsistency for a week without knowing it is
quite an issue.
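For what it's worth, one common way to keep a weekly repair routine honest is a staggered cron entry per node (the keyspace name, schedule, and log path below are placeholders, not a recommendation for any particular cluster):

```shell
# Illustrative cron entry: weekly primary-range repair, staggered
# per node so the whole ring is repaired inside the grace window.
0 3 * * 0  nodetool repair -pr my_keyspace >> /var/log/cassandra/repair.log 2>&1
```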

Thanks for this detailed information, Ryan. I hope I am clear enough in
expressing my doubts.

C*heers

Alain

2014-12-19 15:43 GMT+01:00 Ryan Svihla <rsvihla@datastax.com>:
>
> More accurately, the write path of Cassandra in a multi-DC sense is kinda
> like the following:
>
> 1. the write goes to a node which acts as coordinator
> 2. writes go out to all replicas in that DC, and then one write per remote
> DC goes out to another node which takes responsibility for writing to all
> replicas in its data center. The request blocks, however, until the CL is
> satisfied.
> 3. if any of these writes fail, by default a hinted handoff is generated.
>
> So as you can see, there is effectively no "lag" beyond raw network
> latency + node speed and/or failed writes waiting on hint replay to
> occur. Likewise, repairs can be used to bring the data centers back in
> sync, and in the case of substantial outages you will need repairs to
> bring you back in sync. You're running repairs already, right?
>
> Think of Cassandra as a global write, and not a message queue, and you've
> got the basic idea.
>
>
> On Fri, Dec 19, 2014 at 7:54 AM, Alain RODRIGUEZ <arodrime@gmail.com>
> wrote:
>
>> Hi Jens, thanks for your insight.
>>
>> Replication lag in Cassandra terms is probably “Hinted handoff” --> Well,
>> I think hinted handoffs are only used when a node is down, and are not even
>> necessarily enabled. I guess that cross-DC async replication is something
>> else, that has nothing to do with hinted handoff. Am I wrong?
>>
>> `nodetool status` is your friend. It will tell you whether the cluster
>> considers other nodes reachable or not. Run it on a node in the datacenter
>> that you’d like to test connectivity from. --> Connectivity ≠ write success
>>
>> Basically, the two questions can be rephrased this way:
>>
>> 1 - How to monitor the async cross-DC write latency?
>> 2 - What error should I look for when an async write fails (if any)? Or is
>> there any other way to see that network throughput (for example) is too
>> small for a given traffic?
>>
>> Hope this is clearer.
>>
>> C*heers,
>>
>> Alain
>>
>> 2014-12-19 11:44 GMT+01:00 Jens Rantil <jens.rantil@tink.se>:
>>>
>>> Alain,
>>>
>>> AFAIK, the DC replication is not linearizable. That is, writes are
>>> not replicated according to a binlog or similar, as in MySQL. They are
>>> replicated concurrently.
>>>
>>> To answer your questions:
>>> 1 - Replication lag in Cassandra terms is probably “Hinted handoff”.
>>> You’d want to check the status of that.
>>> 2 - `nodetool status` is your friend. It will tell you whether the
>>> cluster considers other nodes reachable or not. Run it on a node in the
>>> datacenter that you’d like to test connectivity from.
>>>
>>> Cheers,
>>> Jens
>>>
>>> ———
>>> Jens Rantil, Backend engineer, Tink AB
>>> Email: jens.rantil@tink.se
>>> Phone: +46 708 84 18 32 Web: www.tink.se
>>>
>>>
>>> On Fri, Dec 19, 2014 at 11:16 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>>> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> We expanded our cluster to a multiple DC configuration.
>>>>
>>>> Now I am wondering if there is any way to know:
>>>>
>>>> 1 - The replication lag between these 2 DCs (OpsCenter, nodetool,
>>>> other?)
>>>> 2 - How to make sure that sync is OK at any time
>>>>
>>>> I guess big companies running Cassandra are interested in this kind of
>>>> info, so I think something exists, but I am not aware of it.
>>>>
>>>> Any other important information or advice you can give me about best
>>>> practices or tricks while running a multi-DC setup (cross-region US <->
>>>> EU) is welcome, of course!
>>>>
>>>> cheers,
>>>>
>>>> Alain
>>>>
>>>
>>>
>
>
> --
>
> Ryan Svihla
>
> Solution Architect
>
> DataStax
>
>
