cassandra-commits mailing list archives

From "SathishKumar Alwar (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9630) Killing cassandra process results in unclosed connections
Date Wed, 03 Jan 2018 01:26:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308966#comment-16308966 ]

SathishKumar Alwar edited comment on CASSANDRA-9630 at 1/3/18 1:25 AM:
-----------------------------------------------------------------------

Is there a plan to fix this issue? We are observing the same behavior in version 3.9. We have
a 3-node Cassandra cluster running on 3 VMs. When we reboot one of the nodes (say VM1), socket
connections on the other nodes (say VM2 and VM3) remain in the CLOSE_WAIT state, so when we
start Cassandra on the rebooted node it is not able to join the cluster. We observed
nodetool status returning "UN" for the node itself and "DN" for the other 2 nodes. After 5-20
minutes, a "Connection Timeout" exception appears in debug.log on the other 2 nodes (VM2 and
VM3), new socket connections are established, and the nodes are able to join the cluster.
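
One way to confirm the symptom described above on a Linux host is to scan /proc/net/tcp for sockets in CLOSE_WAIT that involve the storage port. The following is a minimal sketch, not part of the original report: the helper names (decode_endpoint, close_wait_connections) are hypothetical, it covers IPv4 only, it assumes a little-endian host such as x86-64, and it assumes the default storage port 7000 mentioned in the issue.

```python
import socket
import struct

CLOSE_WAIT = 0x08    # Linux kernel TCP state code for CLOSE_WAIT
STORAGE_PORT = 7000  # port named in the issue; adjust if configured differently


def decode_endpoint(field):
    """Decode an IPv4 'HEXIP:HEXPORT' field as printed in /proc/net/tcp."""
    hex_ip, hex_port = field.split(":")
    # Assumption: little-endian host, so the printed 32-bit address is byte-reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def close_wait_connections(proc_net_tcp_text, port=STORAGE_PORT):
    """Return (local, remote) endpoint pairs in CLOSE_WAIT that involve `port`."""
    found = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        local = decode_endpoint(fields[1])
        remote = decode_endpoint(fields[2])
        state = int(fields[3], 16)
        if state == CLOSE_WAIT and port in (local[1], remote[1]):
            found.append((local, remote))
    return found
```

On VM2 or VM3 this could be called with the contents of /proc/net/tcp (for example, open("/proc/net/tcp").read()). A non-empty result while the rebooted node is shown as "DN" would match the behavior reported here.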



was (Author: sathish_alwar):
Is there a plan to fix this issue? We are observing the same behavior.

> Killing cassandra process results in unclosed connections
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9630
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9630
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata, Streaming and Messaging
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Minor
>             Fix For: 3.11.x
>
>         Attachments: apache-cassandra-3.0.8-SNAPSHOT.jar
>
>
> After upgrading Cassandra from 2.0.12 to 2.0.15, whenever we killed a Cassandra
> process (with SIGTERM), some other nodes maintained a connection with the killed node in the
> CLOSE_WAIT state on port 7000 for about 5-20 minutes.
> So, when we started the killed node again, other nodes could not establish a handshake
> because of the connections in the CLOSE_WAIT state, and the nodes remained in the DOWN state to
> each other until the initial connection expired.
> The problem did not happen if I ran nodetool disablegossip before killing the node.
> I was able to fix this issue by reverting the CASSANDRA-8336 commits (including CASSANDRA-9238).
> After reverting them, Cassandra closes connections correctly when killed with -TERM, but
> leaves connections in the CLOSE_WAIT state if I run nodetool disablethrift before killing the
> nodes.
> I did not try to reproduce the problem in a clean environment.
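
For background on the state named in the description: a TCP socket sits in CLOSE_WAIT after the remote end has closed its side and until the local application calls close() on its own socket. The sketch below reproduces that condition on loopback with plain Python sockets. It is an illustration only and does not use Cassandra's messaging code; the function name peer_closed_first and the ephemeral loopback port are assumptions made for the example.

```python
import socket


def peer_closed_first():
    """Close the remote end first and return what the surviving side reads."""
    # Listener on an ephemeral loopback port (stands in for a node's storage port).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    # The "killed" node: connects, then closes its end of the connection.
    peer = socket.create_connection(listener.getsockname())
    surviving_side, _ = listener.accept()
    surviving_side.settimeout(5)  # avoid hanging if the FIN never arrives
    peer.close()

    # End-of-stream: recv() returns b"" once the peer's FIN has been received.
    # From this point until surviving_side.close() runs, the kernel reports
    # the surviving socket as CLOSE_WAIT.
    data = surviving_side.recv(1)

    surviving_side.close()  # closing the local end is what leaves CLOSE_WAIT
    listener.close()
    return data
```

If the application never reaches the close() call, the socket stays in CLOSE_WAIT, which is the condition the report observes on port 7000.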



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


