incubator-cassandra-user mailing list archives

From Michael Theroux <mthero...@yahoo.com>
Subject Really odd issue (AWS related?)
Date Wed, 24 Apr 2013 12:03:17 GMT
Hello,

Since Sunday, we've been experiencing a really odd issue in our Cassandra cluster.  We have
started seeing errors reporting that messages are being dropped.  But here is the odd part...

When looking in the AWS console, instead of seeing elevated statistics during this time,
we actually see all statistics suddenly drop right before these messages appear.  CPU, I/O,
and network go way down.  In fact, in one case, they went to 0 for about 5 minutes, to the
point that other Cassandra nodes saw the node in question as being down.  The messages
appear right after the node "wakes up".

We've had this happen on three different nodes on three different days since Sunday.

Other facts:

- We upgraded from m1.large to m1.xlarge instances about two weeks ago.
- We are running Cassandra 1.1.9
- We've been doing some memory tuning, although I have seen this happen on untuned nodes.

Has anyone seen anything like this before?

Another related question.  Once we see messages being dropped on one node, our Cassandra client
appears to see this as well and reports errors.  We use LOCAL_QUORUM with a replication factor
(RF) of 3 on all queries.  Any idea why clients would see an error?  If only one node reports
an error, shouldn't the consistency level prevent the client from seeing an issue?
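
For reference, here is the arithmetic behind that expectation, as a rough sketch in Python.
It assumes a single datacenter and the usual quorum formula of floor(RF / 2) + 1.  The
function names are made up for illustration and are not part of any Cassandra client API.
It also does not model the case where the stalled node is the coordinator for a request,
or where a second replica is slow at the same moment.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def can_satisfy_local_quorum(replication_factor: int, unresponsive_replicas: int) -> bool:
    """True if enough replicas in the local datacenter remain to reach quorum.

    Hypothetical helper: only counts replicas, nothing else.
    """
    live_replicas = replication_factor - unresponsive_replicas
    return live_replicas >= quorum(replication_factor)


if __name__ == "__main__":
    rf = 3  # our replication factor
    print("quorum size:", quorum(rf))
    print("one replica stalled:", can_satisfy_local_quorum(rf, 1))
    print("two replicas stalled:", can_satisfy_local_quorum(rf, 2))
```

With RF = 3 the quorum size is 2, so by this count a single stalled replica should still
leave enough replicas to answer, which is why the client errors surprise me.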

Thanks for your help,
-Mike