incubator-cassandra-user mailing list archives

From Henrik Schröder <skro...@gmail.com>
Subject Re: Nodes marked dead…. leap second?
Date Mon, 02 Jul 2012 12:21:54 GMT
Bug: https://lkml.org/lkml/2012/6/30/122

Simple fix to reset the leap second flag:

date; date `date +"%m%d%H%M%C%y.%S"`; date;
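
A hedged sketch of what that one-liner does, split into steps. The inner `date +"%m%d%H%M%C%y.%S"` prints the current time as MMDDhhmmCCYY.ss, which is the form `date` accepts when setting the clock, so the outer call sets the clock to (almost) the value it already has. As far as I understand the kernel bug linked above, it is the act of setting the time that clears the stuck state. The APPLY switch is hypothetical and only exists so the sketch defaults to a dry run; actually setting the clock needs root.

```shell
#!/bin/sh
# Build the MMDDhhmmCCYY.ss string that `date` accepts when setting the clock.
stamp=$(date +"%m%d%H%M%C%y.%S")

# APPLY is a hypothetical switch for this sketch, not part of the original command.
if [ "${APPLY:-0}" = "1" ]; then
    date            # show the time before
    date "$stamp"   # set the clock to (almost) the same time; needs root
    date            # show the time after
else
    echo "dry run, would execute: date $stamp"
fi
```

Run it once without APPLY to see the command it would issue, then with APPLY=1 as root on each affected node.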


/Henrik

On Mon, Jul 2, 2012 at 1:56 PM, Jean Paul Adant
<jean.paul.adant@gmail.com> wrote:

> Hi,
>
> I had the same problem with Cassandra 1.1.1 on Ubuntu 11.10.
> I had to reboot all nodes.
> I'm interested in any information about this.
>
> Thanks
>
> Jean Paul
>
> 2012/7/2 Filippo Diotalevi <filippo@ntoklo.com>
>
>> Hi,
>> we had some really weird issues over the weekend: our Cassandra nodes
>> started marking other (working) nodes in the cluster as dead. That
>> happened all Sunday, and it's still happening. Nodes are marked dead and
>> then up again all the time.
>>
>> Some example logs:
>>
>> INFO [GossipTasks:1] 2012-07-02 06:55:01,804 Gossiper.java (line 818)
>> InetAddress /xx.xx.xx.233 is now dead.
>> INFO [GossipTasks:1] 2012-07-02 06:55:01,805 Gossiper.java (line 818)
>> InetAddress /xx.xx.xx.235 is now dead.
>> INFO [GossipStage:1] 2012-07-02 06:55:21,748 Gossiper.java (line 804)
>> InetAddress /xx.xx.xx.233 is now UP
>> INFO [GossipStage:1] 2012-07-02 06:55:21,893 Gossiper.java (line 804)
>> InetAddress /xx.xx.xx.235 is now UP
>> INFO [GossipTasks:1] 2012-07-02 06:56:03,877 Gossiper.java (line 818)
>> InetAddress /xx.xx.xx.235 is now dead.
>> INFO [GossipTasks:1] 2012-07-02 06:57:58,537 Gossiper.java (line 818)
>> InetAddress /xx.xx.xx.233 is now dead.
>> INFO [GossipStage:1] 2012-07-02 06:59:06,444 Gossiper.java (line 804)
>> InetAddress /xx.xx.xx.233 is now UP
>>
>>
>> I couldn't find any real exception in the logs, but I noticed that the
>> first error occurred at:
>> INFO [GossipTasks:1] 2012-07-01 02:00:31,169 Gossiper.java (line 818)
>> InetAddress /xx.xx.xx.234 is now dead.
>>
>> 2012-07-01 02:00:31,169 in the German timezone, where the machine is
>> hosted, is 00:00:31 UTC on July 1st, about 30 seconds after June 30th
>> 23:59:60 UTC, the leap second that caused quite a few issues this weekend.
>>
>> Could that be the cause of the cluster failure? Has anybody noticed similar
>> issues? (Also see
>> https://twitter.com/redditstatus/status/219244389044731904 )
>>
>> I'm running Ubuntu 10.04.3 LTS.
>>
>> Many thanks,
>> --
>> Filippo Diotalevi
>>
>>
>
>
> --
> -----------------------------------------------------
> Jean Paul Adant - Créative-Ingénierie
> jean.paul.adant@gmail.com
>
>
>
>
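
To check how close the first "is now dead" entry in the quoted log is to the leap second, the timestamp can be converted to UTC. This is only a sketch: it assumes the log is in CEST (UTC+2, which is what Germany uses in July), it drops the milliseconds, and it relies on the `-d` option of GNU date, which other `date` implementations may not have. The variable names are made up for the example.

```shell
#!/bin/sh
# First "is now dead" timestamp from the quoted log, assumed to be CEST (UTC+2).
local_ts='2012-07-01 02:00:31 +0200'

# -d parses a date string (GNU date extension); -u prints the result in UTC.
utc=$(date -u -d "$local_ts" '+%Y-%m-%d %H:%M:%S')
echo "$utc"
```

The result lands roughly half a minute after 23:59:60 UTC on June 30th, which is consistent with the leap second being the trigger.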
