cassandra-commits mailing list archives

From "Brandon Williams (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CASSANDRA-3960) reporting "new" nodes to FD in gossiper is incorrect for bootstrapping nodes
Date Tue, 09 Jul 2013 16:39:48 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams resolved CASSANDRA-3960.
-----------------------------------------

    Resolution: Not A Problem

Closing since we never committed CASSANDRA-3892 and have CASSANDRA-3881 (or CASSANDRA-5135)

                
> reporting "new" nodes to FD in gossiper is incorrect for bootstrapping nodes
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3960
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3960
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>            Priority: Minor
>
> The fix committed for CASSANDRA-3626 is incorrect for bootstrapping nodes. The saved
> endpoint states include only nodes that have fully joined the ring, so when a node is
> restarted, all nodes that are in the joining state will be "new" from the perspective of
> that node. The FD report() call in the Gossiper then bumps each such node into UP. This can
> be harmful because it causes requests to be queued up to the node, which is potentially
> significant (e.g. GC pressure due to promotion into old-gen).
> In the case where I saw this in production, the node was *in fact* up, so it was okay
> (but it later got kicked down due to computational complexity issues and the gossip stage
> being backed up on start-up, which is how I realized this could be a problem).
> Since the impact is limited to affecting writes (since joining nodes don't serve reads),
> the negative effects should hopefully be limited to uselessly queueing up a bunch of messages
> and confusing operators. So, the issue seems minor right now.
> In addition, we currently drop joining nodes from our notion of the ring very quickly
> (see the discussion in CASSANDRA-3895), so the time period during which this behavior has
> any impact at all should be small in modern Cassandra (assuming the code to avoid re-popping
> up dropped nodes works). My observations so far are still from the 0.x branch. However, once
> CASSANDRA-3892 is fixed, we will no longer drop state about joining nodes, and the impact
> window will be larger.
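
The mechanism described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Cassandra's actual code: the names ToyGossiper, ToyFailureDetector, save_state, restart_from and on_gossip are hypothetical and exist only for this example. It shows saved state keeping only fully joined endpoints, so that after a restart a joining endpoint looks "new" and is reported to the failure detector, which then treats it as UP.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    NORMAL = "NORMAL"    # fully joined the ring
    JOINING = "JOINING"  # still bootstrapping


@dataclass
class ToyFailureDetector:
    """Hypothetical stand-in for the FD: any report() makes an endpoint look UP."""
    up: set = field(default_factory=set)

    def report(self, endpoint: str) -> None:
        self.up.add(endpoint)

    def is_up(self, endpoint: str) -> bool:
        return endpoint in self.up


@dataclass
class ToyGossiper:
    """Hypothetical gossiper that tracks known endpoints and their status."""
    fd: ToyFailureDetector
    known: dict = field(default_factory=dict)

    def save_state(self) -> dict:
        # Assumption taken from the issue text: only fully joined
        # endpoints are persisted across a restart.
        return {ep: st for ep, st in self.known.items() if st is Status.NORMAL}

    @classmethod
    def restart_from(cls, saved: dict) -> "ToyGossiper":
        # A restarted node starts with a fresh FD and only the saved endpoints.
        return cls(fd=ToyFailureDetector(), known=dict(saved))

    def on_gossip(self, endpoint: str, status: Status) -> bool:
        """Record gossip about an endpoint; return True if it was treated as new."""
        is_new = endpoint not in self.known
        self.known[endpoint] = status
        if is_new:
            # The behavior the issue objects to: a "new" endpoint is reported
            # to the FD and so appears UP, even if it is only bootstrapping.
            self.fd.report(endpoint)
        return is_new


if __name__ == "__main__":
    before = ToyGossiper(fd=ToyFailureDetector())
    before.on_gossip("10.0.0.1", Status.NORMAL)
    before.on_gossip("10.0.0.2", Status.JOINING)

    after = ToyGossiper.restart_from(before.save_state())
    treated_as_new = after.on_gossip("10.0.0.2", Status.JOINING)
    print(treated_as_new, after.fd.is_up("10.0.0.2"))
```

In this toy model the joining endpoint 10.0.0.2 is missing from the saved state, so the restarted node treats it as new and its failure detector marks it UP, while the fully joined 10.0.0.1 is already known and is not re-reported.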

