incubator-cassandra-user mailing list archives

From Parag Patel <>
Subject RE: Read query slows down when a node goes down
Date Mon, 16 Sep 2013 17:15:35 GMT
Thanks.  I've noticed that a repair takes a long time to finish.  My data is quite small,
1.5GB on each node according to nodetool status.  Is there any way to speed up repairs? (FYI,
I haven't actually seen a repair finish, since it didn't return after 10 mins, so I figured I
was doing something wrong.)

From: sankalp kohli []
Sent: Monday, September 16, 2013 1:10 PM
Subject: Re: Read query slows down when a node goes down

For how long do the read latencies go up once a machine is down? It takes a configurable
amount of time for machines to detect that another machine is down. This is done through Gossip.
The algorithm used to detect failures is the Phi accrual failure detector.
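
As a rough illustration of the phi accrual idea, here is a minimal sketch that assumes
heartbeat intervals follow an exponential distribution. The function names and the
threshold of 8 are assumptions made for this example (the threshold mirrors the
phi_convict_threshold setting in cassandra.yaml); this is not Cassandra's actual code.

```python
import math

def phi(time_since_last_heartbeat, mean_interval):
    """Suspicion level for a peer, assuming exponential heartbeat intervals.

    Under that assumption, P(no heartbeat for t) = exp(-t / mean), and
    phi is defined as -log10 of that probability.
    """
    return time_since_last_heartbeat / (mean_interval * math.log(10))

def is_suspected_down(time_since_last_heartbeat, mean_interval, threshold=8.0):
    # threshold is a hypothetical default chosen for this sketch
    return phi(time_since_last_heartbeat, mean_interval) > threshold
```

The longer a peer stays silent relative to its usual heartbeat interval, the higher phi
climbs, so a higher threshold means slower but more conservative detection.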

Regarding your question, if you are bootstrapping then it needs to get the data from other
nodes and during this time, it will not serve any reads but will accept writes. Once it has
all the data, it will start serving reads. In the logs it will have something like "now serving
If you are bringing back a machine which is offline, then it will start accepting reads and
writes immediately but then you should run a repair to get the missing data.

On Mon, Sep 16, 2013 at 8:12 AM, Parag Patel <<>>
RF=3.  Single dc deployment.  No v-nodes.

Is there a certain amount of time I need to wait from the time the down node is started to
the point where it's ready to be used?  If so, what's that time?  If it's dynamic, how would
I know when it's ready?


From: sankalp kohli [<>]
Sent: Sunday, September 15, 2013 4:52 PM
Subject: Re: Read query slows down when a node goes down

What is your replication factor? Do you have a multi-DC deployment? Also, are you using vnodes?

On Sun, Sep 15, 2013 at 7:54 AM, Parag Patel <<>>

We have a six node cluster running DataStax Community Edition 1.2.9.  From our app, we use
the Netflix Astyanax library to read and write records into our cluster.  We read and write
with QUORUM.  We're experiencing an issue where our read queries slow down in our app
whenever a node goes offline.  This problem is very reproducible.
Has anybody experienced this before, or do people have suggestions on what I could try?
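
For reference, the QUORUM arithmetic can be sketched as below. With the RF=3 reported
later in this thread, a QUORUM read needs 2 of the 3 replicas, so one replica can be down
and reads should still succeed. The helper names are made up for this illustration.

```python
def quorum(replication_factor):
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1

def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)
```

This only shows that a single down node should not cause QUORUM failures at RF=3; it
does not by itself explain the added latency.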

