incubator-cassandra-user mailing list archives

From Mike <>
Subject Cassandra flush spin?
Date Sat, 09 Feb 2013 17:29:15 GMT

We just hit a very odd issue in our Cassandra cluster.  We are running 
Cassandra 1.1.2 in a 6 node cluster.  We use a replication factor of 3, 
and all operations utilize LOCAL_QUORUM consistency.

We noticed a large performance hit in our application's maintenance 
activities and I've been investigating.  I discovered a node in the 
cluster that was flushing a memtable like crazy.  It was flushing every 
2 to 3 minutes, and had apparently been doing this for days.  Typically, 
during this time of day, a flush would happen every 30 minutes or so.

"cat /var/log/cassandra/system.log | grep \"flushing high-traffic column family CFS(Keyspace='open', ColumnFamily='msgs')\" | grep 02-08 | wc -l"
[1] 18:41:04 [SUCCESS] db-1c-1
[2] 18:41:05 [SUCCESS] db-1c-2
[3] 18:41:05 [SUCCESS] db-1a-1
[4] 18:41:05 [SUCCESS] db-1d-2
[5] 18:41:05 [SUCCESS] db-1a-2
[6] 18:41:05 [SUCCESS] db-1d-1
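
A daily count hides the cadence, so it can help to turn the matching 
log lines into gaps between consecutive flushes.  Below is a minimal 
Python sketch of that idea.  It assumes each system.log line carries a 
timestamp of the form "2013-02-08 18:41:04,123"; that layout is an 
assumption, not something shown in this message.  The function names 
flush_times and flush_gaps_minutes are hypothetical.

```python
import re
from datetime import datetime

# Assumed timestamp layout inside each log line, e.g. "2013-02-08 18:41:04,123".
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}")

# Same text the grep in this message searches for.
MARKER = ("flushing high-traffic column family "
          "CFS(Keyspace='open', ColumnFamily='msgs')")


def flush_times(lines, marker=MARKER):
    """Return the timestamps of all log lines that contain the marker."""
    times = []
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:  # skip matching lines that have no parseable timestamp
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def flush_gaps_minutes(times):
    """Return the gap, in minutes, between each pair of consecutive flushes."""
    ordered = sorted(times)
    return [(b - a).total_seconds() / 60 for a, b in zip(ordered, ordered[1:])]
```

Passing the lines of /var/log/cassandra/system.log to flush_times and 
the result to flush_gaps_minutes gives one gap per flush, which makes a 
shift from roughly 30 minutes to 2 to 3 minutes easy to date.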

I restarted the database node, and, at least for now, the problem 
appears to have stopped.

There are a number of things that don't make sense here.  We use a 
replication factor of 3, so if this was being caused by our application, 
I would have expected 3 nodes in the cluster to have issues.  Also, I 
would have expected the issue to continue once the node restarted.

Another point of interest, and I'm wondering if it has exposed a bug: 
this node was recently converted to use ephemeral storage on EC2 and was 
restored from a snapshot.  After the restore, a nodetool repair was run.  
However, the repair was going to overlap with a period of heavy activity 
for our application, so we canceled the validation compaction (2 of the 
3 anti-entropy sessions had completed).  The spin appears to have 
started at the start of the second session.

Any hints?

