cassandra-user mailing list archives

From Filippo Diotalevi <fili...@ntoklo.com>
Subject Nodes marked dead…. leap second?
Date Mon, 02 Jul 2012 09:35:07 GMT
Hi,  
we had some really weird issues over the weekend: our Cassandra nodes started marking
other (working) nodes in the cluster as dead. This went on all Sunday, and it's still happening.
Nodes are repeatedly marked dead and then up again.

Some example logs:

INFO [GossipTasks:1] 2012-07-02 06:55:01,804 Gossiper.java (line 818) InetAddress /xx.xx.xx.233 is now dead.
INFO [GossipTasks:1] 2012-07-02 06:55:01,805 Gossiper.java (line 818) InetAddress /xx.xx.xx.235 is now dead.
INFO [GossipStage:1] 2012-07-02 06:55:21,748 Gossiper.java (line 804) InetAddress /xx.xx.xx.233 is now UP
INFO [GossipStage:1] 2012-07-02 06:55:21,893 Gossiper.java (line 804) InetAddress /xx.xx.xx.235 is now UP
INFO [GossipTasks:1] 2012-07-02 06:56:03,877 Gossiper.java (line 818) InetAddress /xx.xx.xx.235 is now dead.
INFO [GossipTasks:1] 2012-07-02 06:57:58,537 Gossiper.java (line 818) InetAddress /xx.xx.xx.233 is now dead.
INFO [GossipStage:1] 2012-07-02 06:59:06,444 Gossiper.java (line 804) InetAddress /xx.xx.xx.233 is now UP
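
To get a feel for how often each node flaps, the gossip messages quoted above can be tallied with a small script. This is only a rough sketch: the regular expression is derived from the "is now dead." / "is now UP" lines shown here, and the names PATTERN and count_transitions are made up for illustration, not part of Cassandra.

```python
import re
from collections import Counter

# Matches gossip lines such as "InetAddress /xx.xx.xx.233 is now dead."
# \s+ also tolerates a line break between the address and "is now".
PATTERN = re.compile(r"InetAddress (/\S+)\s+is now (dead|UP)")

def count_transitions(log_text: str) -> Counter:
    """Count (address, state) transitions found in the given log text."""
    return Counter(PATTERN.findall(log_text))

# A few of the lines from this message, used as sample input.
sample = """\
INFO [GossipTasks:1] 2012-07-02 06:55:01,804 Gossiper.java (line 818) InetAddress /xx.xx.xx.233 is now dead.
INFO [GossipStage:1] 2012-07-02 06:55:21,748 Gossiper.java (line 804) InetAddress /xx.xx.xx.233 is now UP
INFO [GossipTasks:1] 2012-07-02 06:57:58,537 Gossiper.java (line 818) InetAddress /xx.xx.xx.233 is now dead.
"""

counts = count_transitions(sample)
for (address, state), n in sorted(counts.items()):
    print(address, state, n)
```

Running it over the full system log instead of the sample would show which nodes are flagged most often.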



I couldn't find any real exception in the logs, but I noticed that the first error occurred
at:

 INFO [GossipTasks:1] 2012-07-01 02:00:31,169 Gossiper.java (line 818) InetAddress /xx.xx.xx.234 is now dead.

2012-07-01 02:00:31,169 in the German timezone where the machine is hosted (CEST, UTC+2) is
00:00:31 UTC on July 1st, about half a minute after June 30th 23:59:60 UTC, the leap second
that caused quite a few issues this weekend.
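
The timezone arithmetic can be double-checked with a few lines of Python. This is a sketch under two assumptions: that the log timestamps are in local German time, and that this is CEST (UTC+2) in July. The helper name log_time_to_utc is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: log timestamps are local German summer time, CEST = UTC+2.
CEST = timezone(timedelta(hours=2), "CEST")

def log_time_to_utc(stamp: str) -> datetime:
    """Parse a log timestamp like '2012-07-01 02:00:31,169' (CEST) into UTC."""
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f").replace(tzinfo=CEST)
    return local.astimezone(timezone.utc)

first_dead = log_time_to_utc("2012-07-01 02:00:31,169")

# datetime cannot represent the leap second 23:59:60 itself, so the instant
# right after it (00:00:00 UTC on July 1st) serves as the reference point.
after_leap = datetime(2012, 7, 1, 0, 0, 0, tzinfo=timezone.utc)
seconds_after = (first_dead - after_leap).total_seconds()

print(first_dead.isoformat(), seconds_after)
```

Under those assumptions the first "is now dead" message lands roughly 31 seconds after the leap second was inserted.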

Could this be the cause of the cluster failure? Has anybody noticed similar issues? (See also
https://twitter.com/redditstatus/status/219244389044731904 )

I'm running Ubuntu 10.04.3 LTS.  

Many thanks,
--  
Filippo Diotalevi

