cassandra-commits mailing list archives

From "Peter Schuller (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-3960) reporting "new" nodes to FD in gossiper is incorrect for bootstrapping nodes
Date Sun, 26 Feb 2012 04:30:48 GMT
reporting "new" nodes to FD in gossiper is incorrect for bootstrapping nodes
----------------------------------------------------------------------------

                 Key: CASSANDRA-3960
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3960
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Peter Schuller
            Assignee: Peter Schuller
            Priority: Minor


The fix committed for CASSANDRA-3626 is incorrect for bootstrapping nodes. The saved endpoint
states include only those nodes that are fully joined in the ring, so when a node is restarted,
all nodes that are in the joining state will be "new" from the perspective of that node. The FD
report() call in the Gossiper then bumps each such node into UP. This can be harmful because it
causes requests to be queued up for the node, and the cost is potentially significant (e.g. GC
pressure due to promotion into old-gen).
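
To make the sequence above concrete, here is a minimal toy model in Python. It is not
Cassandra's Gossiper or FailureDetector code: every class, function, and address in it
(ToyGossiper, ToyFailureDetector, restart, the 10.0.0.x endpoints) is hypothetical, and it
assumes only what is described above, namely that saved state covers fully joined nodes only
and that an endpoint seen as "new" is reported to the FD, which marks it UP.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    NORMAL = "NORMAL"    # fully joined in the ring
    JOINING = "JOINING"  # still bootstrapping


@dataclass
class ToyFailureDetector:
    """Stand-in for the FD: any report() marks the endpoint as UP."""
    up: set = field(default_factory=set)

    def report(self, endpoint: str) -> None:
        self.up.add(endpoint)


@dataclass
class ToyGossiper:
    """Hypothetical model of the restart behaviour, not the real Gossiper."""
    saved_states: dict  # endpoint -> Status, as restored on restart
    fd: ToyFailureDetector = field(default_factory=ToyFailureDetector)

    def on_gossip(self, endpoint: str, status: Status) -> bool:
        """Handle gossip about an endpoint; return True if it was treated as new."""
        is_new = endpoint not in self.saved_states
        self.saved_states[endpoint] = status
        if is_new:
            # The behaviour in question: a "new" endpoint is reported to the
            # FD without any evidence that it is actually reachable.
            self.fd.report(endpoint)
        return is_new


def restart(cluster: dict) -> ToyGossiper:
    """Simulate a restart: only fully joined nodes survive in saved state."""
    saved = {ep: st for ep, st in cluster.items() if st is Status.NORMAL}
    return ToyGossiper(saved_states=saved)


# Hypothetical three-node cluster with one node still bootstrapping.
cluster = {
    "10.0.0.1": Status.NORMAL,
    "10.0.0.2": Status.NORMAL,
    "10.0.0.3": Status.JOINING,
}
gossiper = restart(cluster)
treated_as_new = [ep for ep, st in cluster.items() if gossiper.on_gossip(ep, st)]
print(treated_as_new)         # only the joining node is "new" after the restart
print(sorted(gossiper.fd.up))  # and it has been marked UP via report()
```

In this sketch only the joining node is missing from the restored state, so it is the only one
that takes the "new" path and gets marked UP. The model deliberately ignores heartbeats and
every other way the real FD learns about endpoints.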

In the case where I saw this in production, the node was *in fact* up, so it was okay (but it
was later marked down because of computational complexity issues and the gossip stage being
backed up on start-up, which is how I realized this could be a problem).

Because joining nodes don't serve reads, the impact is limited to writes, so the negative
effects should hopefully be limited to uselessly queueing up a bunch of messages and confusing
operators. The issue therefore seems minor right now.

In addition, we currently drop joining nodes from our notion of the ring very quickly (see the
discussion in CASSANDRA-3895), so the time period during which this behavior has any impact at
all should be small in modern Cassandra (assuming the code that prevents dropped nodes from
popping back up works). My observations so far have been on the 0.x branch. However, once
CASSANDRA-3892 is fixed we will no longer be dropping state about joining nodes, and the impact
window will be longer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
