cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8352) Timeout Exception on Node Failure in Remote Data Center
Date Wed, 26 Nov 2014 06:23:12 GMT


Jonathan Ellis commented on CASSANDRA-8352:

Here's how this works:

You test the new version to make sure it's something we haven't fixed already.  Then we write
a fix for the next new version.

Please don't reopen until you've done that.

> Timeout Exception on Node Failure in Remote Data Center
> -------------------------------------------------------
>                 Key: CASSANDRA-8352
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Unix, Cassandra 2.0.3
>            Reporter: Akhtar Hussain
>              Labels: DataCenter, GEO-Red
> We have a Geo-red setup with 2 data centers of 3 nodes each. When we bring down a
> single Cassandra node in DC2 with kill -9 <Cassandra-pid>, reads fail on DC1 with
> TimedOutException for a brief period (roughly 15-20 sec).
> Questions:
> 1. We need to understand why reads fail on DC1 when a node in another DC, i.e. DC2, fails.
> As we are using LOCAL_QUORUM for both reads and writes in DC1, a request should return once
> 2 nodes in the local DC have replied, instead of timing out because of a node in the remote DC.
> 2. We want to make sure that no Cassandra requests fail in case of node failures. We
> used rapid read protection of ALWAYS/99percentile/10ms as mentioned in
> But nothing worked. How can we ensure zero request failures when a node fails?
> 3. What is the right way of handling HTimedOutException in Hector?
> 4. Please confirm whether we are using public and private hostnames as expected.
> We are using Cassandra 2.0.3.
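
For reference on question 1, here is a minimal sketch of the quorum arithmetic behind the
expectation that 2 local replies should be enough. It assumes a replication factor of 3 in
each data center; the report states 3 nodes per DC but does not give the replication factor.
The function names are illustrative only and are not part of Cassandra or Hector.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def local_quorum_satisfied(local_rf: int, local_replies: int) -> bool:
    """LOCAL_QUORUM counts only replicas in the coordinator's own data center."""
    return local_replies >= quorum(local_rf)


# Assumed replication factor per data center (not stated in the report).
ASSUMED_LOCAL_RF = 3

print(quorum(ASSUMED_LOCAL_RF))                     # replies needed in DC1
print(local_quorum_satisfied(ASSUMED_LOCAL_RF, 2))  # two DC1 replicas answered
print(local_quorum_satisfied(ASSUMED_LOCAL_RF, 1))  # only one DC1 replica answered
```

Under that assumption, a node failure in DC2 leaves all DC1 replicas available, so the
arithmetic alone does not explain the timeouts described in the report.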
