incubator-cassandra-user mailing list archives

From Dan Kogan <...@iqtell.com>
Subject Node went down and came back up
Date Sun, 05 May 2013 12:23:43 GMT
Hello,

Last night one of our nodes froze and the server had to be rebooted.  After it came up, the
node joined the ring and everything looked normal.
However, this morning there seem to be some inconsistencies in the data (e.g. some nodes don't
have a given record, or have a different version of the record than other nodes).

The logs also contain a lot of hinted handoff messages that started after the node failure,
such as these:

INFO [HintedHandoff:1] 2013-05-05 11:22:23,339 HintedHandOffManager.java (line 294) Started hinted handoff for token: 56713727820156410577229101238628035242 with IP: /107.20.45.6
INFO [HintedHandoff:1] 2013-05-05 11:22:33,343 HintedHandOffManager.java (line 372) Timed out replaying hints to /107.20.45.6; aborting further deliveries
INFO [HintedHandoff:1] 2013-05-05 11:22:33,344 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /107.20.45.6
INFO [HintedHandoff:1] 2013-05-05 11:22:33,344 HintedHandOffManager.java (line 294) Started hinted handoff for token: 0 with IP: /67.202.15.178
INFO [HintedHandoff:1] 2013-05-05 11:22:43,348 HintedHandOffManager.java (line 372) Timed out replaying hints to /67.202.15.178; aborting further deliveries
INFO [HintedHandoff:1] 2013-05-05 11:22:43,348 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /67.202.15.178
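
To see at a glance which endpoints the hints keep timing out against, the messages can be
tallied per IP. The following is only a rough sketch in Python, assuming the log keeps the
message wording shown in the excerpt above; the function name summarize_handoff and the
regular expressions are illustrative and not part of Cassandra.

```python
import re
from collections import Counter

# Patterns follow the message wording in the log excerpt (an assumption about
# the format); \s+ lets them match even where a long line has been wrapped.
TIMED_OUT = re.compile(r"Timed\s+out\s+replaying\s+hints\s+to\s+/(\S+?);")
FINISHED = re.compile(
    r"Finished\s+hinted\s+handoff\s+of\s+(\d+)\s+rows\s+to\s+endpoint\s+/(\S+)"
)


def summarize_handoff(log_text):
    """Return (timeouts per endpoint, rows delivered per endpoint).

    Hypothetical helper: it only counts log messages and does not talk
    to the cluster in any way.
    """
    # One count per "Timed out replaying hints" message, keyed by endpoint IP.
    timeouts = Counter(TIMED_OUT.findall(log_text))

    # Sum the row counts reported by each "Finished hinted handoff" message.
    rows = Counter()
    for count, endpoint in FINISHED.findall(log_text):
        rows[endpoint] += int(count)
    return timeouts, rows
```

Passing the contents of the system log to summarize_handoff returns two counters keyed by
endpoint IP: how many replay attempts timed out, and how many rows were reported as delivered.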

Do we need to run repair on all nodes to get the cluster back to a "normal" state?

Thanks for the help.

Dan Kogan