The situation I am seeing is this:

To access my company's development environment I need to connect over a VPN.

I do some development on the application and, for some reason, my VPN drops while I still have established connections to my development Cassandra server.

When I reconnect and check netstat, the connections I had established previously are still there, and they never go away. I have had connections held open from almost 7 days ago.

I ran 'netstat -tulpn' per the request of Nate McCall, and the receive and send queues are 0.
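
For anyone repeating this check: the -l flag in 'netstat -tulpn' limits the output to listening sockets, so 'netstat -tn' (or -tan) is the easier way to see stale ESTABLISHED rows. Below is a minimal Java sketch for counting them from captured output. The class name, method name, and sample rows are hypothetical, and it assumes the Linux net-tools column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State).

```java
import java.util.List;

public class EstablishedCounter {

    /** Counts TCP rows in ESTABLISHED state whose local address ends in ":<port>". */
    static long countEstablished(List<String> netstatLines, int port) {
        String suffix = ":" + port;
        return netstatLines.stream()
                .map(line -> line.trim().split("\\s+"))
                // Assumed columns: Proto Recv-Q Send-Q Local Foreign State
                .filter(cols -> cols.length >= 6 && cols[0].startsWith("tcp"))
                .filter(cols -> cols[3].endsWith(suffix))
                .filter(cols -> cols[5].equals("ESTABLISHED"))
                .count();
    }

    public static void main(String[] args) {
        // Hypothetical rows for illustration only, not taken from this thread.
        List<String> sample = List.of(
                "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
                "tcp        0      0 0.0.0.0:9042            0.0.0.0:*               LISTEN",
                "tcp        0      0 10.0.0.5:9042           10.8.0.12:51234         ESTABLISHED",
                "tcp        0      0 10.0.0.5:9042           10.8.0.12:51301         ESTABLISHED",
                "tcp        0      0 10.0.0.5:9160           10.8.0.12:51310         ESTABLISHED");
        // Counts only the native transport (9042) rows that are ESTABLISHED.
        System.out.println(countEstablished(sample, 9042));
    }
}
```

Running this periodically on the Cassandra host and comparing the count against the number of live clients is one way to confirm that the rows really are leftovers.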

I just ran a test where I changed my application code to use Thrift (via the FluentCassandra driver): start the application, kill my VPN connection, reconnect. When I check the Cassandra server, I still see the Thrift (9160) connection established, but it is eventually removed because of keepalive.
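
How long "eventually" takes depends on the kernel keepalive timers on the Cassandra host. As a rough worked example, assuming the stock Linux defaults (tcp_keepalive_time = 7200 s, tcp_keepalive_intvl = 75 s, tcp_keepalive_probes = 9; the thread does not say what this server uses), the worst case is the idle time plus one interval per unanswered probe. The class and method names are illustrative only.

```java
import java.time.Duration;

public class KeepAliveDetection {

    /** Idle time before the first probe, plus one interval for each unanswered probe. */
    static Duration worstCaseDetection(Duration idle, Duration interval, int probes) {
        return idle.plus(interval.multipliedBy(probes));
    }

    public static void main(String[] args) {
        // Assumed Linux defaults; check the net.ipv4.tcp_keepalive_* sysctls on the real host.
        Duration d = worstCaseDetection(Duration.ofSeconds(7200), Duration.ofSeconds(75), 9);
        System.out.println(d.getSeconds() + " s"); // 7200 + 9 * 75 = 7875 s
    }
}
```

With those assumed values a dead peer is reaped after roughly 2 hours 11 minutes, which would match a connection that lingers for a while and then disappears.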

If I set rpc_keepalive to false in cassandra.yaml, restart Cassandra, and run the same Thrift test outlined above, the connection stays, like the native transport connections, until Cassandra or the box is restarted.

It seems the lack of keepalive support for the native transport is the culprit.
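
To illustrate what is missing: the kernel only probes an idle peer if the server enables SO_KEEPALIVE on the accepted socket. The following is a minimal, self-contained Java sketch of that step on a loopback connection. It is not Cassandra's code, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveAccept {

    /** Accepts one loopback connection, sets SO_KEEPALIVE on it, and reports the resulting flag. */
    static boolean acceptWithKeepAlive(boolean keepAlive) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);   // port 0 = any free port
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without this call the OS never sends keepalive probes for the accepted
            // socket, so a vanished client leaves it in ESTABLISHED indefinitely.
            accepted.setKeepAlive(keepAlive);
            return accepted.getKeepAlive();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("SO_KEEPALIVE on accepted socket: " + acceptWithKeepAlive(true));
    }
}
```

Even with the flag set, the probe timing still comes from the kernel settings, so detection can take hours unless those are lowered.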

Regards,

Eric Plowe


On Fri, Apr 11, 2014 at 1:12 PM, Nate McCall <nate@thelastpickle.com> wrote:
Out of curiosity, any folks seeing backups in the send or receive queues via netstat while this is happening? (netstat -tulpn for example)

I feel like I had this happen once and it ended up being a sysctl tuning issue (net.core.* and net.ipv4.* stuff specifically).

Can't seem to find anything in my notes though, unfortunately. 


On Fri, Apr 11, 2014 at 10:16 AM, Phil Luckhurst <phil.luckhurst@powerassure.com> wrote:
We have considered this but wondered how well it would work as the Cassandra
Java Driver opens multiple connections internally to each Cassandra node. I
suppose it depends how those connections are used internally, if it's round
robin then it should work. Perhaps we just need to try it.
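
A minimal sketch of the idle-triggered software keepalive Chris describes, in plain Java with no driver dependency. The class and its methods are hypothetical; the probe is just a Runnable that would execute the dummy query through whatever session the application holds. Whether a single probe touches every pooled connection depends on how the driver selects connections, which this sketch does not address.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class IdleHeartbeat {
    private final long idleNanos;
    private final Runnable probe;            // e.g. runs "select * from DummyCF where id = 1"
    private final AtomicLong lastActivity;   // timestamp of the last real request or probe

    IdleHeartbeat(long idleThreshold, TimeUnit unit, Runnable probe, long nowNanos) {
        this.idleNanos = unit.toNanos(idleThreshold);
        this.probe = probe;
        this.lastActivity = new AtomicLong(nowNanos);
    }

    /** Call whenever the application sends a real request. */
    void markActivity(long nowNanos) {
        lastActivity.set(nowNanos);
    }

    /** Runs the probe if the connection has been idle for the threshold; returns true if it ran. */
    boolean tick(long nowNanos) {
        if (nowNanos - lastActivity.get() < idleNanos) {
            return false;
        }
        probe.run();
        lastActivity.set(nowNanos);
        return true;
    }

    public static void main(String[] args) {
        // Simulated clock values so the demo finishes immediately.
        IdleHeartbeat hb = new IdleHeartbeat(60, TimeUnit.SECONDS,
                () -> System.out.println("probe: select * from DummyCF where id = 1"), 0L);
        System.out.println(hb.tick(TimeUnit.SECONDS.toNanos(30)));  // idle for 30 s only
        System.out.println(hb.tick(TimeUnit.SECONDS.toNanos(60)));  // idle for 60 s
    }
}
```

In a real application a ScheduledExecutorService could call tick(System.nanoTime()) every few seconds, with markActivity(System.nanoTime()) invoked from the request path.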

--
Thanks
Phil


Chris Lohfink wrote
> TCP keep alives (by the setTimeout) are notoriously useless... The default
> 2 hours is generally far longer than any timeout in NAT translation tables
> (generally ~5 min), and even if you decrease the keep alive to a sane value,
> a lot of networks actually throw away TCP keep alive packets. You see that
> a lot more in cell networks, though. It's almost always a good idea to have
> a software keep alive, although it seems not to be implemented in this
> protocol. You can make a super simple CF with 1 value and query it every
> minute a connection is idle or something, e.g. "select * from DummyCF
> where id = 1"
>
> --
> *Chris Lohfink*
> Engineer
> 415.663.6738  |  Skype: clohfink.blackbirdit
> *Blackbird*
>
> 775.345.3485  |  www.blackbirdIT.com <http://www.blackbirdit.com/>
>
> *"Formerly PalominoDB/DriveDev"*
>
>
> On Fri, Apr 11, 2014 at 3:04 AM, Phil Luckhurst <phil.luckhurst@> wrote:
>
>> We are also seeing this in our development environment. We have a 3 node
>> Cassandra 2.0.5 cluster running on Ubuntu 12.04 and are connecting from a
>> Tomcat based application running on Windows using the 2.0.0 Cassandra
>> Java
>> Driver. We have setKeepAlive(true) when building the cluster in the
>> application and this does keep one connection open on the client side to
>> each of the 3 Cassandra nodes, but we still see the build up of 'old'
>> ESTABLISHED connections on each of the Cassandra servers.
>>
>> We are also getting that same "Unexpected exception during request"
>> exception appearing in the logs
>>
>> ERROR [Native-Transport-Requests:358378] 2014-04-09 12:31:46,824 ErrorMessage.java (line 222) Unexpected exception during request
>> java.io.IOException: Connection reset by peer
>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>         at sun.nio.ch.SocketDispatcher.read(Unknown Source)
>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>>         at sun.nio.ch.IOUtil.read(Unknown Source)
>>         at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>>         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
>>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
>>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>         at java.lang.Thread.run(Unknown Source)
>>
>> Initially we thought this was down to a firewall that is between our
>> development machines and the Cassandra nodes but that has now been
>> configured not to 'kill' any connections on port 9042. We also have the
>> Windows firewall on the client side turned off.
>>
>> We still think this is down to our environment as the same application
>> running in Tomcat hosted on an Ubuntu 12.04 server does not appear to be
>> doing this, but so far we can't track down the cause.
>>
>>
>> --
>> View this message in context:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/binary-protocol-server-side-sockets-tp7593879p7593937.html
>> Sent from the cassandra-user@.apache mailing list archive at Nabble.com.

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/binary-protocol-server-side-sockets-tp7593879p7593955.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.



--
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com