cassandra-user mailing list archives

From Andrew Bialecki
Subject Re: Simulating a failed node
Date Mon, 29 Oct 2012 20:17:04 GMT
Thanks, extremely helpful. The key bit was I wasn't flushing the old
Keyspace before re-running the stress test, so I was stuck at RF = 1 from a
previous run despite passing RF = 2 to the stress tool.

On Sun, Oct 28, 2012 at 2:49 AM, Peter Schuller wrote:

> > Operation [158320] retried 10 times - error inserting key 0158320
> > ((UnavailableException))
>
> This means that at the point where the Thrift request to write data
> was handled, the coordinator node (the one your client is connected
> to) believed that, among the replicas responsible for the key, too
> many were down to satisfy the consistency level. The most likely causes
> are that you're in fact not using RF >= 2 (e.g., is the RF really
> greater than 1 for the keyspace you're inserting into?), or that you're
> in fact not using consistency level ONE.
>
> > I'm sure my naive setup is flawed in some way, but what I was hoping for
> > was when the node went down it would fail to write to the downed node and
> > instead write to one of the other nodes in the cluster. So question is why
> > are writes failing even after a retry? It might be the stress client
> > doesn't pool connections (I took
>
> Writes always go to all responsible replicas that are up, and when
> enough return (according to the consistency level), the insert succeeds.
> If replicas fail to respond you may get a TimeoutException.
> UnavailableException means it didn't even try because it didn't have
> enough live replicas to attempt the write.
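
For anyone finding this thread later, here is a rough Python sketch of the distinction described above, as I understand it. It is a toy model, not Cassandra code: the function names, the exception class names, and the dict/set inputs are all made up for illustration, and it only covers the ONE, QUORUM, and ALL consistency levels.

```python
class UnavailableError(Exception):
    """Too few live replicas: the write is rejected before it is attempted."""


class WriteTimeoutError(Exception):
    """The write was sent, but too few replicas acknowledged it in time."""


def required_acks(consistency_level, replication_factor):
    """Number of replica acknowledgements needed for a write to succeed."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %r" % consistency_level)


def coordinate_write(replicas_up, responders, consistency_level):
    """Toy model of the coordinator's decision for a single key.

    replicas_up: dict mapping each replica responsible for the key to
                 True/False (the coordinator's view of whether it is up).
                 Its size stands in for the replication factor.
    responders:  set of replicas that would acknowledge the write in time.
    Returns the number of acknowledgements on success.
    """
    needed = required_acks(consistency_level, len(replicas_up))
    live = [name for name, up in replicas_up.items() if up]

    # Checked up front: if not enough replicas are believed to be up,
    # nothing is sent at all.
    if len(live) < needed:
        raise UnavailableError(
            "need %d live replicas, have %d" % (needed, len(live)))

    # The write goes to every live replica, not only to `needed` of them.
    acks = sum(1 for name in live if name in responders)
    if acks < needed:
        raise WriteTimeoutError(
            "need %d acks, got %d" % (needed, acks))
    return acks
```

With RF = 1 the model holds a single replica per key, so taking that node down gives UnavailableError at ONE, which matches what I was seeing. With RF = 2 and one node down, the same write succeeds with a single acknowledgement.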
>
> (Note though: reads are a bit of a different story, and if you want to
> test behavior when nodes go down I suggest including them. See
> CASSANDRA-2540 and CASSANDRA-3927.)
>
> --
> / Peter Schuller (@scode,
