cassandra-user mailing list archives

From Ryan Svihla <rsvi...@datastax.com>
Subject Re: Multi DC informations (sync)
Date Fri, 19 Dec 2014 19:27:10 GMT
Replies inline.

On Fri, Dec 19, 2014 at 10:30 AM, Alain RODRIGUEZ <arodrime@gmail.com>
wrote:
>
> All that you said match the idea I had of how it works except this part:
>
> "The request blocks however until all CL is satisfied" --> Does this mean
> that the client will see an error if the local DC writes the data correctly
> (i.e. CL reached) but the remote DC fails? This is not the idea I had of
> something asynchronous...
>


"Asynchronous" just means all replica requests are sent out at once; the
client response is blocked until the CL is satisfied or a timeout occurs.

If CL is ONE, for example, the first response back means "success" to the
client, regardless of what's still happening in the background. If it's,
say, ALL, then yes, it waits for every response to come back.
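To make that concrete, here is a toy sketch in plain Python (no driver involved; all names and timings are illustrative, not Cassandra code): every replica write is dispatched immediately, but the caller only unblocks once `required_acks` successes arrive, which models the consistency level.

```python
import concurrent.futures
import time

def replica_write(latency_s, ok):
    """Simulate one replica acknowledging a write after some latency."""
    time.sleep(latency_s)
    if not ok:
        raise RuntimeError("replica failed")
    return "ack"

def coordinator_write(replicas, required_acks):
    """Dispatch to all replicas at once; block only until `required_acks`
    successes arrive. `required_acks` models the CL: 1 for ONE,
    len(replicas) for ALL. `replicas` is a list of (latency_s, ok) pairs."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(replica_write, lat, ok) for lat, ok in replicas]
    acks, result = 0, "unavailable"
    for fut in concurrent.futures.as_completed(futures):
        try:
            fut.result()
            acks += 1
        except RuntimeError:
            pass  # failed replica; in Cassandra a hint would be stored for it
        if acks >= required_acks:
            result = "success"  # client unblocks here
            break
    pool.shutdown(wait=False)   # slower replica writes finish in the background
    return result
```

With `required_acks=1` the call returns as soon as the fastest replica answers, even if a slow remote replica is still in flight; with `required_acks=len(replicas)` it waits for everyone, which is the ALL behaviour described above.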


>
> If it doesn't fail on the client side (truly asynchronous), is there a way
> to make sure the remote DC has indeed received the information? I mean, if
> the cross-region throughput is too small, the write will fail and so,
> potentially, will the HH. How to detect that we are lacking cross-DC
> throughput, for example?
>
Monitoring, logging, etc.

If an application needs EACH_QUORUM consistency across all data centers and
the performance penalty is worthwhile, then that's probably what you're
asking for. If LOCAL_QUORUM plus regular repairs is fine, do that; if CL
ONE is fine, do that.

You SHOULD BE monitoring dropped mutations and hints via JMX or something
like OpsCenter. Outages of substantial length should probably involve a
repair; if the outage is longer than your HH window, it DEFINITELY should
involve a repair. If you ever have a doubt, it should involve a repair.
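One place those dropped-mutation counters surface is the "Message type / Dropped" section at the end of `nodetool tpstats` output. A minimal sketch of turning that into an alert (the sample text below is abbreviated and illustrative; real output has more sections, and you would run this against each node):

```python
SAMPLE_TPSTATS = """\
Message type           Dropped
READ                         0
RANGE_SLICE                  0
MUTATION                    42
COUNTER_MUTATION             0
"""

def dropped_counts(tpstats_text):
    """Parse the 'Message type / Dropped' section of `nodetool tpstats` output."""
    counts, in_section = {}, False
    for line in tpstats_text.splitlines():
        if line.startswith("Message type"):
            in_section = True
            continue
        if in_section and line.strip():
            name, value = line.rsplit(None, 1)  # split on the last whitespace run
            counts[name] = int(value)
    return counts

def drop_alerts(tpstats_text):
    """Non-zero MUTATION drops mean writes were shed and a repair is due."""
    return {k: v for k, v in dropped_counts(tpstats_text).items() if v > 0}
```

The same counters are available over JMX (and in OpsCenter), which is the more robust path for continuous monitoring; parsing nodetool output is just the quickest way to eyeball it.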


>
> Repairs are indeed a good thing (we run them as a weekly routine, GC grace
> period 10 sec), but having inconsistency for a week without knowing it is
> quite an issue.
>

Then use a higher consistency level, so that the client is not surprised,
knows the state of things, and doesn't consider a write successful until
it's consistent across the data centers (I'd argue this is probably not
what you really want, but different applications have different needs). If
you need only local-data-center-level awareness, LOCAL_QUORUM reads and
writes will get you where you want to be; but complete, nearly immediate
multi-datacenter consistency that the client knows about is not free, and
it isn't with any system.
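The arithmetic behind that trade-off is small enough to write down. Cassandra's quorum is floor(RF/2) + 1; LOCAL_QUORUM waits for that many acks in the coordinator's DC only, while EACH_QUORUM waits for a quorum in every DC. A sketch with an assumed RF of 3 per data center:

```python
def quorum(rf):
    """Cassandra's quorum size: floor(RF / 2) + 1."""
    return rf // 2 + 1

# Assumed topology: RF 3 in each of two data centers (US <-> EU)
rf_by_dc = {"us": 3, "eu": 3}

# LOCAL_QUORUM: 2 acks, all from the coordinator's own DC -> no cross-DC wait
local_quorum_acks = quorum(rf_by_dc["us"])

# EACH_QUORUM: a quorum in every DC (2 in each here, 4 acks total),
# so every write waits on the WAN round trip
each_quorum_acks = sum(quorum(rf) for rf in rf_by_dc.values())

# Plain QUORUM: a quorum of the total RF (4 of 6), from any mix of DCs
plain_quorum_acks = quorum(sum(rf_by_dc.values()))
```

That EACH_QUORUM figure is why the WAN latency shows up in every write at that level, while LOCAL_QUORUM keeps the client's blocking path inside one DC and leaves the remote DC to async replication plus repairs.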



>
> Thanks for this detailed information Ryan, I hope I am clear enough while
> expressing my doubts.
>
>
I think it's a bit of a misunderstanding of the tools available. If you
need full, nearly immediate cross-data-center consistency, my suggestion is
to size (from a network-pipe and application-design SLA perspective) for a
higher CL on writes, and potentially reads. The tools are there.



> C*heers
>
> Alain
>
> 2014-12-19 15:43 GMT+01:00 Ryan Svihla <rsvihla@datastax.com>:
>>
>> More accurately, the write path of Cassandra in a multi-DC sense is kinda
>> like the following:
>>
>> 1. the write goes to a node which acts as coordinator
>> 2. writes go out to all replicas in that DC, and one write per remote DC
>> goes out to a node there which takes responsibility for writing to all
>> replicas in its data center. The request blocks, however, until the CL is
>> satisfied.
>> 3. if any of these writes fail, by default a hinted handoff is generated.
>>
>> So as you can see, there is effectively no "lag" beyond raw network
>> latency plus node speed, and/or failed writes waiting on hint replay.
>> Likewise, repairs can be used to bring the data centers back in sync, and
>> in the case of substantial outages you will need repairs to get back in
>> sync. You're running repairs already, right?
>>
>> Think of Cassandra as a global write, and not a message queue, and you've
>> got the basic idea.
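The fan-out in step 2 above (one message per local replica, one forwarded write per remote DC, a hint for anything that fails) can be sketched as a toy model. This is plain Python with made-up node names, not driver code, and it only models which messages leave the coordinator:

```python
def coordinate_write(local_dc, replicas_by_dc, is_up):
    """Model Cassandra's multi-DC write fan-out from one coordinator.

    replicas_by_dc: {dc_name: [replica, ...]}
    is_up: predicate telling whether a replica accepted the write.
    Returns (messages_sent_by_coordinator, hints).
    """
    messages, hints = [], []
    for dc, replicas in replicas_by_dc.items():
        if dc == local_dc:
            for r in replicas:                       # one message per local replica
                messages.append(("write", r))
        else:
            forwarder = replicas[0]                  # one message per remote DC;
            messages.append(("forward", forwarder))  # that node fans out inside its DC
    # any replica that is down gets a hint stored for later replay
    for dc, replicas in replicas_by_dc.items():
        for r in replicas:
            if not is_up(r):
                hints.append(r)
    return messages, hints
```

With RF 3 in each of two DCs, the coordinator sends only four messages (three local writes plus one cross-DC forward), which is why the WAN pipe carries one copy per remote DC rather than one per remote replica.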
>>
>>
>> On Fri, Dec 19, 2014 at 7:54 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>> wrote:
>>
>>> Hi Jens, thanks for your insight.
>>>
>>> "Replication lag in Cassandra terms is probably 'Hinted handoff'" --> Well,
>>> I think hinted handoffs are only used when a node is down, and are not even
>>> necessarily enabled. I guess that cross-DC async replication is something
>>> else that has nothing to do with hinted handoff, am I wrong?
>>>
>>> `nodetool status` is your friend. It will tell you whether the cluster
>>> considers other nodes reachable or not. Run it on a node in the datacenter
>>> that you’d like to test connectivity from. --> Connectivity ≠ write success
>>>
>>> Basically the two questions can be rephrased this way:
>>>
>>> 1 - How to monitor the async cross-DC write latency?
>>> 2 - What error should I look for when an async write fails (if any)? Or is
>>> there any other way to see that network throughput (for example) is too
>>> small for a given traffic level?
>>>
>>> Hope this is clearer.
>>>
>>> C*heers,
>>>
>>> Alain
>>>
>>> 2014-12-19 11:44 GMT+01:00 Jens Rantil <jens.rantil@tink.se>:
>>>>
>>>> Alain,
>>>>
>>>> AFAIK, the DC replication is not linearizable. That is, writes are
>>>> not replicated according to a binlog or similar, as in MySQL. They are
>>>> replicated concurrently.
>>>>
>>>> To answer your questions:
>>>> 1 - Replication lag in Cassandra terms is probably “Hinted handoff”.
>>>> You’d want to check the status of that.
>>>> 2 - `nodetool status` is your friend. It will tell you whether the
>>>> cluster considers other nodes reachable or not. Run it on a node in the
>>>> datacenter that you’d like to test connectivity from.
>>>>
>>>> Cheers,
>>>> Jens
>>>>
>>>> ——— Jens Rantil Backend engineer Tink AB Email: jens.rantil@tink.se
>>>> Phone: +46 708 84 18 32 Web: www.tink.se Facebook Linkedin Twitter
>>>>
>>>>
>>>> On Fri, Dec 19, 2014 at 11:16 AM, Alain RODRIGUEZ <arodrime@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> We expanded our cluster to a multi-DC configuration.
>>>>>
>>>>> Now I am wondering if there is any way to know:
>>>>>
>>>>> 1 - The replication lag between these 2 DCs (OpsCenter, nodetool,
>>>>> other?)
>>>>> 2 - How to make sure that the sync is OK at any time
>>>>>
>>>>> I guess big companies running Cassandra are interested in this kind
>>>>> of info, so I think something exists but I am not aware of it.
>>>>>
>>>>> Any other important information or advice you can give me about best
>>>>> practices or tricks while running a multi-DC setup (cross-region US <->
>>>>> EU) is welcome, of course!
>>>>>
>>>>> cheers,
>>>>>
>>>>> Alain
>>>>>
>>>>
>>>>
>>
>>

-- 


Ryan Svihla

Solution Architect


DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world's most innovative enterprises.
DataStax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.
