cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-515) Gossiper misses first updates when restarting a node
Date Mon, 26 Oct 2009 21:38:59 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-515:
-------------------------------------

    Description: 
Easy way to reproduce:

Start node A.
Start node B, with autobootstrap=false.
Kill B, wipe data dir, and restart (still w/ autobootstrap=false).

A will show B as down, with its old token.  (B will see both nodes correctly.)

This appears to be because when you wipe data dir, generation restarts at 1.  (This is not
just operator error; besides during testing, this could arise if a node dies completely and
has to be replaced.)  Then gossip state is ignored until the new heartbeat is larger than
the old one reached.

It appears that initializing the generation to seconds-since-epoch would fix this.

  was:
Easy way to reproduce:

Start node A.
Start node B, with autobootstrap=false.
Kill B, wipe data dir, and restart (still w/ autobootstrap=false).

A will show B as down, with its old token.  (B will see both nodes correctly.)

This appears to be because when you wipe data dir, generation restarts at 1.  (This is not
just operator error; besides during testing, this could arise if a node dies completely and
has to be replaced.)

It appears that initializing the generation to seconds-since-epoch would fix this.

        Summary: Gossiper misses first updates when restarting a node  (was: Gossiper misses first update (?) when restarting a node)

> Gossiper misses first updates when restarting a node
> ----------------------------------------------------
>
>                 Key: CASSANDRA-515
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-515
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.5
>
>
> Easy way to reproduce:
> Start node A.
> Start node B, with autobootstrap=false.
> Kill B, wipe data dir, and restart (still w/ autobootstrap=false).
> A will show B as down, with its old token.  (B will see both nodes correctly.)
> This appears to be because when you wipe data dir, generation restarts at 1.  (This is
> not just operator error; besides during testing, this could arise if a node dies completely
> and has to be replaced.)  Then gossip state is ignored until the new heartbeat is larger than
> the old one reached.
> It appears that initializing the generation to seconds-since-epoch would fix this.
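
The failure mode and the proposed fix can be illustrated with a small sketch. It is a simplified model, not Cassandra's actual Gossiper code: the class GossipGenerationSketch, the HeartbeatState pair, and the isNewer comparison rule are assumptions made for illustration. The rule assumed here is that a peer accepts incoming state only if its generation is higher, or if the generation is equal and the heartbeat version is higher. The seconds-since-epoch seeding further assumes the clock does not run backwards and that restarts are at least one second apart.

```java
import java.time.Instant;

/** Simplified model of the problem in this issue; not Cassandra's real gossip code. */
final class GossipGenerationSketch {

    /** Hypothetical (generation, heartbeat version) pair a peer remembers for a node. */
    static final class HeartbeatState {
        final long generation;
        final long version;

        HeartbeatState(long generation, long version) {
            this.generation = generation;
            this.version = version;
        }
    }

    /**
     * Assumed acceptance rule: a higher generation always wins; within the
     * same generation, only a higher heartbeat version counts as newer.
     */
    static boolean isNewer(HeartbeatState known, HeartbeatState incoming) {
        if (incoming.generation != known.generation) {
            return incoming.generation > known.generation;
        }
        return incoming.version > known.version;
    }

    /** Proposed fix from the issue: seed the generation with seconds since the epoch. */
    static long generationFromClock(Instant now) {
        return now.getEpochSecond();
    }

    public static void main(String[] args) {
        // Node A remembers B at generation 1 with a heartbeat that reached 500.
        HeartbeatState knownByA = new HeartbeatState(1, 500);

        // B's data dir is wiped, so it restarts at generation 1 with a low heartbeat.
        HeartbeatState wipedB = new HeartbeatState(1, 3);
        System.out.println(isNewer(knownByA, wipedB)); // prints false: update ignored

        // With clock-seeded generations, a later start always has a larger generation.
        Instant firstStart = Instant.parse("2009-10-26T21:00:00Z");
        HeartbeatState oldB = new HeartbeatState(generationFromClock(firstStart), 500);
        HeartbeatState newB = new HeartbeatState(generationFromClock(firstStart.plusSeconds(60)), 3);
        System.out.println(isNewer(oldB, newB)); // prints true: update accepted
    }
}
```

Under this model, the wiped node's updates are dropped until its heartbeat passes 500, which matches the symptom of A showing B as down with its old token. Seeding the generation from the clock needs no persisted state, so it survives a wiped data dir.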

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

