cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "varun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11093) processs restarts are failing becase native port and jmx ports are in use
Date Fri, 19 Feb 2016 16:22:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154447#comment-15154447
] 

varun commented on CASSANDRA-11093:
-----------------------------------

Hi [~beobal],

It says patch is available. Which patch version is this referring to? I cannot find the details
in the ticket.

Thanks

> processs restarts are failing becase native port and jmx ports are in use
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11093
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11093
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Configuration
>         Environment: PROD
>            Reporter: varun
>            Assignee: Sam Tunnicliffe
>            Priority: Minor
>              Labels: lhf
>
> A process restart should automatically take care of this. But it is not and it is a problem.
> The ports are are considered in use even if the process has quit/died/killed but the
socket is in a TIME_WAIT state in the TCP FSM (http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF-2.htm).
> tcp 0 0 127.0.0.1:7199 0.0.0.0:* LISTEN 30099/java
> tcp 0 0 192.168.1.2:9160 0.0.0.0:* LISTEN 30099/java
> tcp 0 0 10.130.128.131:58263 10.130.128.131:9042 TIME_WAIT -
> tcp 0 0 10.130.128.131:58262 10.130.128.131:9042 TIME_WAIT -
> tcp 0 0 ::ffff:10.130.128.131:9042 :::* LISTEN 30099/java
> tcp 0 0 ::ffff:10.130.128.131:9042 ::ffff:10.130.128.131:57191 ESTABLISHED 30099/java
> tcp 0 0 ::ffff:10.130.128.131:9042 ::ffff:10.130.128.131:57190 ESTABLISHED 30099/java
> tcp 0 0 ::ffff:10.130.128.131:9042 ::ffff:10.176.70.226:37105 ESTABLISHED 30099/java
> tcp 0 0 ::ffff:127.0.0.1:42562 ::ffff:127.0.0.1:7199 TIME_WAIT -
> tcp 0 0 ::ffff:10.130.128.131:57190 ::ffff:10.130.128.131:9042 ESTABLISHED 30138/java
> tcp 0 0 ::ffff:10.130.128.131:57198 ::ffff:10.130.128.131:9042 ESTABLISHED 30138/java
> tcp 0 0 ::ffff:10.130.128.131:9042 ::ffff:10.176.70.226:37106 ESTABLISHED 30099/java
> tcp 0 0 ::ffff:10.130.128.131:57197 ::ffff:10.130.128.131:9042 ESTABLISHED 30138/java
> tcp 0 0 ::ffff:10.130.128.131:57191 ::ffff:10.130.128.131:9042 ESTABLISHED 30138/java
> tcp 0 0 ::ffff:10.130.128.131:9042 ::ffff:10.130.128.131:57198 ESTABLISHED 30099/java
> tcp 0 0 ::ffff:10.130.128.131:9042 ::ffff:10.130.128.131:57197 ESTABLISHED 30099/java
> tcp 0 0 ::ffff:127.0.0.1:42567 ::ffff:127.0.0.1:7199 TIME_WAIT -
> I had to write a restart handler that does a netstat call and looks to make sure all
the TIME_WAIT states exhaust before starting the node back up. This happened on 26 of the
56 when a rolling restart was performed. The issue was mostly around JMX port 7199. There
was another rollling restart done on the 26 nodes to remediate the JMX ports issue in that
restart one node had the issue where port 9042 was considered used after the restart and the
process died after a bit of time.
> What needs to be done for port the native port 9042 and JMX port 7199 is to create the
underlying TCP socket with SO_REUSEADDR. This eases the restriction and allows the port to
be bound by process even if there are sockets open to that port in the TCP FSM, as long as
there is no other process listening on that port. There is a Java method available to set
this option in java.net.Socket https://docs.oracle.com/javase/7/docs/api/java/net/Socket.html#setReuseAddress%28boolean%29.
> native port 9042: https://github.com/apache/cassandra/blob/4a0d1caa262af3b6f2b6d329e45766b4df845a88/tools/stress/src/org/apache/cassandra/stress/settings/SettingsPort.java#L38
> JMX port 7199: https://github.com/apache/cassandra/blob/4a0d1caa262af3b6f2b6d329e45766b4df845a88/tools/stress/src/org/apache/cassandra/stress/settings/SettingsPort.java#L40
> Looking in the code itself this option is being set on thrift (9160 (default)) and internode
communication ports, uncrypted (7000 (default)) and SSL encrypted (7001 (default)) .
> https://github.com/apache/cassandra/search?utf8=%E2%9C%93&q=setReuseAddress
> This needs to be set to native and jmx ports as well.
> References:
> https://unix.stackexchange.com/questions/258379/when-is-a-port-considered-being-used/258380?noredirect=1
> https://stackoverflow.com/questions/23531558/allow-restarting-java-application-with-jmx-monitoring-enabled-immediately
> https://docs.oracle.com/javase/8/docs/technotes/guides/rmi/socketfactory/
> https://github.com/apache/cassandra/search?utf8=%E2%9C%93&q=setReuseAddress
> https://docs.oracle.com/javase/7/docs/api/java/net/Socket.html#setReuseAddress%28boolean%293
> https://github.com/apache/cassandra/blob/4a0d1caa262af3b6f2b6d329e45766b4df845a88/tools/stress/src/org/apache/cassandra/stress/settings/SettingsPort.java#L38
> https://github.com/apache/cassandra/blob/4a0d1caa262af3b6f2b6d329e45766b4df845a88/tools/stress/src/org/apache/cassandra/stress/settings/SettingsPort.java#L40



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message