cassandra-user mailing list archives

From Shannon Carey
Subject Re: Consistency Level vs. Retry Policy when no local nodes are available
Date Tue, 21 Mar 2017 16:06:15 GMT
Thanks for the perspective, Ben; it's food for thought.

At minimum, it seems like the documentation should be updated to mention that the retry policy
will not be consulted when a local consistency level is used but no local nodes are available.
That way, people won't be surprised by it. It looks like the docs are included in the GitHub
repo, so I guess I'll try to contribute an update there.

From: Ben Slater
Date: Monday, March 20, 2017 at 6:25 PM
Subject: Re: Consistency Level vs. Retry Policy when no local nodes are available

I think the general assumption is that DC failover happens at the client app level rather
than the Cassandra level due to the potentially very significant difference in request latency
if you move from an app-local DC to a remote DC. The preferred pattern for most people is that
the app fails in a failed DC and some load balancer above the app redirects traffic to a
different DC.

The other factor is that the fail-back scenario from a failed DC and LOCAL_* consistencies
is potentially complex. Do you want to immediately start using the new DC when it becomes
available (with missing data) or wait until it catches up on writes (and how do you know when
that has happened)?

Note also that QUORUM requires a clear majority of replicas counted across both DCs. Some people
run 3 DCs with RF 3 in each and QUORUM to maintain strong consistency across DCs even with DC failure.
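
The arithmetic behind that can be sketched as follows. This is only an illustration: it assumes QUORUM is computed as floor(total replicas / 2) + 1 over the sum of the replication factors of all DCs, and the class and method names (QuorumMath, quorum, survivesDcLoss) are made up for the example.

```java
import java.util.Arrays;

class QuorumMath {
    // QUORUM size: floor(sum of RF over all DCs / 2) + 1.
    static int quorum(int[] rfPerDc) {
        int total = Arrays.stream(rfPerDc).sum();
        return total / 2 + 1;
    }

    // True if QUORUM can still be reached after the DC at index lostDc is lost entirely.
    static boolean survivesDcLoss(int[] rfPerDc, int lostDc) {
        int remaining = Arrays.stream(rfPerDc).sum() - rfPerDc[lostDc];
        return remaining >= quorum(rfPerDc);
    }

    public static void main(String[] args) {
        int[] twoDcs = {3, 3};      // 2 DCs, RF 3 in each
        int[] threeDcs = {3, 3, 3}; // 3 DCs, RF 3 in each
        System.out.println("2 DCs: quorum=" + quorum(twoDcs)
                + ", survives DC loss=" + survivesDcLoss(twoDcs, 0));
        System.out.println("3 DCs: quorum=" + quorum(threeDcs)
                + ", survives DC loss=" + survivesDcLoss(threeDcs, 0));
    }
}
```

With 2 DCs at RF 3, QUORUM needs 4 of 6 replicas, so losing one DC (3 replicas left) breaks it. With 3 DCs at RF 3, QUORUM needs 5 of 9, and losing one DC still leaves 6.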


On Tue, 21 Mar 2017 at 10:00 Shannon Carey wrote:
Specifically, this puts us in an awkward position because LOCAL_QUORUM is desirable so that
we don't have unnecessary cross-DC traffic from the client by default, but we can't use it
because it will cause complete failure if the local DC goes down. And we can't use QUORUM
because it would fail if there's not a quorum in either DC (as would happen if one DC goes
down). So it seems like we are forced to use a lesser consistency such as ONE or TWO.


From: Shannon Carey
Date: Monday, March 20, 2017 at 5:25 PM
Subject: Consistency Level vs. Retry Policy when no local nodes are available

I am running DSE 5.0, and I have a Java client using the DataStax 3.0.0 client library.

The client is configured to use a DCAwareRoundRobinPolicy wrapped in a TokenAwarePolicy. Nothing

When I run my query, I set a custom retry policy.

I am testing cross-DC failover. I have disabled connectivity to the "local" DC (relative to
my client) in order to perform the test. When I run a query with the first consistency level
set to LOCAL_ONE (or local anything), my retry policy is never called and I always get this
exception:
"com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query
failed (no host was tried)"

getErrors() on the exception is empty.
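
These symptoms (no host tried, empty error map, retry policy never invoked) are what one would expect if the load balancing policy hands back an empty query plan, since a retry policy can only react to a failure from a host that was actually tried. Below is a simplified, self-contained model of that control flow. It is not the driver's code: every name in it (Host, RetryHook, NoHostAvailable, localOnlyPlan, execute) is hypothetical, and it assumes the DC-aware plan never includes remote hosts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EmptyQueryPlanModel {
    // Hypothetical stand-in for a cluster node as seen by the client.
    static class Host {
        final String name;
        final String dc;
        final boolean reachable;       // can the client connect to it?
        final boolean requestSucceeds; // does a request sent to it succeed?

        Host(String name, String dc, boolean reachable, boolean requestSucceeds) {
            this.name = name;
            this.dc = dc;
            this.reachable = reachable;
            this.requestSucceeds = requestSucceeds;
        }
    }

    // Hypothetical stand-in for a retry policy; it only counts how often it is consulted.
    static class RetryHook {
        int calls = 0;

        boolean shouldRetry(Host failedHost) {
            calls++;
            return true;
        }
    }

    // Hypothetical analogue of NoHostAvailableException, carrying per-host errors.
    static class NoHostAvailable extends RuntimeException {
        final Map<String, String> errors;

        NoHostAvailable(Map<String, String> errors) {
            super("All host(s) tried for query failed");
            this.errors = errors;
        }
    }

    // Query plan containing only reachable hosts in the local DC
    // (assumption: remote hosts are never added to the plan).
    static List<Host> localOnlyPlan(List<Host> hosts, String localDc) {
        List<Host> plan = new ArrayList<>();
        for (Host h : hosts) {
            if (h.reachable && h.dc.equals(localDc)) {
                plan.add(h);
            }
        }
        return plan;
    }

    // Walks the plan; the retry hook is consulted only after a host was tried and failed.
    static String execute(List<Host> plan, RetryHook retry) {
        Map<String, String> errors = new LinkedHashMap<>();
        for (Host h : plan) {
            if (h.requestSucceeds) {
                return "answered by " + h.name;
            }
            errors.put(h.name, "request failed");
            if (!retry.shouldRetry(h)) {
                break;
            }
        }
        throw new NoHostAvailable(errors);
    }

    public static void main(String[] args) {
        List<Host> hosts = new ArrayList<>();
        hosts.add(new Host("a1", "dc-local", false, false)); // local DC is cut off
        hosts.add(new Host("b1", "dc-remote", true, true));  // remote DC is healthy

        RetryHook retry = new RetryHook();
        try {
            execute(localOnlyPlan(hosts, "dc-local"), retry);
        } catch (NoHostAvailable e) {
            System.out.println("errors=" + e.errors + ", retry calls=" + retry.calls);
        }
    }
}
```

In this model the run prints `errors={}, retry calls=0`: the loop body never executes, so there is nothing for the retry hook to see. As far as I know, the 3.x driver's DCAwareRoundRobinPolicy builder has `withUsedHostsPerRemoteDc(int)` and `allowRemoteDCsForLocalConsistencyLevel()` options that govern whether remote hosts can appear in the plan, which may be worth checking against the driver documentation.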

This is contrary to my expectation that the first attempt would fail and would allow my RetryPolicy
to attempt a different (non-LOCAL) consistency level. I have no choice but to avoid using
any kind of LOCAL consistency level throughout my applications. Is this expected? Or is there
anything I can do about it? Thanks! It certainly seems like a bug to me or at least something
that should be improved.


Ben Slater
Chief Product Officer

