cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10371) Decommissioned nodes can remain in gossip
Date Tue, 22 Dec 2015 14:26:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068168#comment-15068168 ]

Stefania commented on CASSANDRA-10371:
--------------------------------------

It seems this node received a new gossip state entry for 192.168.128.28 from another node.
Do you see a message that says {{Received a GossipDigestSynMessage from...}} a few lines up?
The problem would be in that other node. On this node, 192.168.128.28 is evicted, so {{epState.isAlive()}}
is false (which is what this ticket is all about):

{code}
    long expireTime = getExpireTimeForEndpoint(endpoint);
    if (!epState.isAlive() && (now > expireTime)
        && (!StorageService.instance.getTokenMetadata().isMember(endpoint)))
    {
        if (logger.isDebugEnabled())
        {
            logger.debug("time is expiring for endpoint : {} ({})", endpoint, expireTime);
        }
        evictFromMembership(endpoint);
    }
{code}

Also, have you noticed how the time is in the past for the last 3 lines?
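
To make the eviction condition quoted above easier to reason about, here is a minimal standalone sketch of it. This is not the actual Gossiper code: the class {{EvictionSketch}}, the method {{shouldEvict}} and its plain boolean/long parameters are hypothetical stand-ins for {{epState.isAlive()}}, {{now}}, {{expireTime}} and the {{isMember(endpoint)}} lookup. It only illustrates that an endpoint still reported as alive is never evicted, no matter how far past its expire time it is.

```java
public class EvictionSketch {

    /**
     * Hypothetical, simplified model of the check in the snippet above.
     * All three conditions must hold before the endpoint state is evicted.
     */
    public static boolean shouldEvict(boolean isAlive, long now, long expireTime, boolean isTokenMember) {
        return !isAlive && (now > expireTime) && !isTokenMember;
    }

    public static void main(String[] args) {
        long now = 1_000L;        // arbitrary illustrative timestamps
        long expireTime = 500L;   // already in the past relative to "now"

        // Dead, expired, not a ring member: evicted.
        System.out.println(shouldEvict(false, now, expireTime, false)); // true

        // Still marked alive (the situation described in this ticket): never evicted.
        System.out.println(shouldEvict(true, now, expireTime, false));  // false

        // Dead and expired, but still a token ring member: not evicted.
        System.out.println(shouldEvict(false, now, expireTime, true));  // false
    }
}
```

If {{isAlive}} stays true on a subset of nodes, the second case applies there and the decommissioned endpoint remains in gossip on those nodes.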

> Decommissioned nodes can remain in gossip
> -----------------------------------------
>
>                 Key: CASSANDRA-10371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: Brandon Williams
>            Assignee: Stefania
>            Priority: Minor
>
> This may apply to other dead states as well.  Dead states should be expired after 3 days.
> In the case of decom we attach a timestamp to let the other nodes know when it should be
> expired.  It has been observed that sometimes a subset of nodes in the cluster never expire
> the state, and through heap analysis of these nodes it is revealed that the epstate.isAlive
> check returns true when it should return false, which would allow the state to be evicted.
> This may have been affected by CASSANDRA-8336.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
