cassandra-user mailing list archives

From Reverend Chip
Subject Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL
Date Tue, 07 Dec 2010 00:58:58 GMT
I'm running a big test: ten nodes with 3 TB of disk each, on 0.7.0rc1.
After some tuning help (thanks, Tyler), much of it is working as it
should.  However, a serious event occurred as well: one server froze up
briefly, and though it dropped mutations, no error was reported to the
client.  Here's what the log said on host X.19:

 WARN [ScheduledTasks:1] 2010-12-06 14:04:11,125
(line 527) Dropped 76 MUTATION messages in the last 5000ms

Meanwhile, on the OTHER nodes, gossip decided the node was not available
for a while:

 INFO [ScheduledTasks:1] 2010-12-06 14:04:02,396 (line
195) InetAddress /X.19 is now dead.
 INFO [GossipStage:1] 2010-12-06 14:04:06,127 (line 569)
InetAddress /X.19 is now UP
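
Lining the three timestamps up makes the sequence easier to see. The
following is a minimal Python sketch, assuming the clocks on all hosts
are synchronized (the logs alone do not confirm this) and that the
"last 5000ms" window ends at the moment the warning was logged. The
variable names are made up for illustration.

```python
from datetime import datetime, timedelta

# Timestamp format used in the log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S,%f"

def parse(ts):
    """Parse a log timestamp such as '2010-12-06 14:04:02,396'."""
    return datetime.strptime(ts, FMT)

# Seen by the other nodes (gossip).
marked_dead = parse("2010-12-06 14:04:02,396")
marked_up = parse("2010-12-06 14:04:06,127")

# Seen on X.19 itself; the warning covers the preceding 5000 ms
# (assumption: the window ends when the warning is written).
drop_logged = parse("2010-12-06 14:04:11,125")
drop_window_start = drop_logged - timedelta(milliseconds=5000)

# How long the other nodes considered X.19 dead.
dead_for = marked_up - marked_dead

# Portion of the drop window that overlaps the "dead" interval.
overlap_start = max(marked_dead, drop_window_start)
overlap_end = min(marked_up, drop_logged)
overlap = max(overlap_end - overlap_start, timedelta(0))

print("dead for:          ", dead_for)
print("drop window starts:", drop_window_start.time())
print("overlap with dead: ", overlap)
```

Under those assumptions, X.19 was considered dead for about 3.7
seconds, and all but roughly 2 ms of the reported drop window falls
after the other nodes had already marked it UP again.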

And despite the fact that I was writing with consistency=ALL, none of my
clients reported any errors on their mutations.
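
For reference, the expectation behind that paragraph can be written
down as a toy model. This is a sketch of the general rule for
consistency levels, not Cassandra's actual coordinator code;
`required_acks` and `write_result` are hypothetical names, and the
replication factor of 3 is an assumed value, since the test's real
setting is not stated here.

```python
def required_acks(level, replication_factor):
    """Number of replica acknowledgements a write needs at a given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")

def write_result(level, replication_factor, acks_received):
    """Toy outcome: 'ok' if enough replicas acknowledged, else 'error'."""
    needed = required_acks(level, replication_factor)
    return "ok" if acks_received >= needed else "error"

# Assumed replication factor of 3, with one replica dropping the mutation
# and therefore never acknowledging it.
rf = 3
acks = rf - 1

for level in ("ONE", "QUORUM", "ALL"):
    print(level, write_result(level, rf, acks))
```

In this model a write at ALL with one missing acknowledgement comes
back as an error, while ONE and QUORUM still succeed, which is why the
absence of any client-side error is surprising.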

Tyler has this information but I would like to know if anyone has seen
this before, and/or has a diagnosis.
