tomcat-users mailing list archives

From Ronald Klop <>
Subject Re: Cluster Memory Leak - ClusterData and LinkObject classes
Date Tue, 01 Apr 2008 10:47:58 GMT
On Mon Mar 31 21:13:25 CEST 2008 Tomcat Users List <> wrote:
> On Mon, Mar 31, 2008 at 3:38 AM, Ronald Klop <> wrote:
> >
> > See my previous mail about send/receive buffers filling because Ack wasn't
> > read by FastAsyncSender.
> > The option waitForAck="true" did the trick for me. But for FastAsyncSender
> > you should set sendAck="false" on the receiving side.
> Thanks for the information, Ronald. Can you clarify your settings by
> posting a minimal configuration? I looked for the sendAck option on
> the Tomcat cluster page and couldn't find any reference to that
> configuration parameter:
> It looks like doing one of the following two is a good idea for a
> barebones setup to make sure that the acking behavior is consistent,
> since Tomcat doesn't seem to ensure that the settings are sane:
> <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
> receiver.sendAck="true" sender.waitForAck="true"/>
> <Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
> receiver.sendAck="false" sender.waitForAck="false"/>
> I'm a bit confused as to why this issue only affects one of my
> clusters (out of 3 production clusters with identical setups) and not
> more people are seeing it. Are most people specifying their Ack
> settings? Or do most people not see enough traffic between restarts to
> trigger this issue? Granted, the one that's affected also happens to
> handle the most traffic by far. I'll have to do more testing on my
> test cluster to verify (I've already turned on waitForAck everywhere
> in production), hopefully I can reproduce it.
> Anyone have information on how using Acks in the cluster affects performance?
> -Dave

Hello Dave,

I attached my server.xml file. I hope the mailinglist doesn't filter it.
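In case the attachment gets filtered, here is a minimal sketch of the relevant part, using the attribute-shorthand form you quoted above. This is not my full server.xml, and attribute names and placement may differ between Tomcat 5.5.x releases, so verify against the cluster documentation for your version:

```xml
<!-- Sketch only: make the ack behavior explicit and consistent on both
     sides, matching the first of the two barebones options quoted above.
     (I have turned on waitForAck everywhere in production as well.) -->
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         sender.waitForAck="true"
         receiver.sendAck="true"/>
```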

I think a lot of people don't send enough sessions between the nodes to see this, or they use
sticky sessions.
The problem is that the Acks are not being read by Tomcat, so the network receive buffer fills up.
Only once the receive buffer is full does the send buffer on the other node (the one sending the
Acks) start to fill. And only once that send buffer is full does the node stop sending Acks and
reading sessions: it blocks in Socket.write(ack_buffer). From that point on the node no longer
reads new session data from the network, and only then do you start to see failures in your
application.

An Ack is 3 bytes, so you need to sync a lot of sessions before the receive buffer and send
buffer fill up.
My receive buffers are about 90 KB and my send buffers are 32 KB:
(90 KB + 32 KB) / 3 bytes = 41643 acks before the syncing stops.

I see (at this moment) on average 1.5 session messages per second, so it takes about
41643 / 1.5 ≈ 28,000 seconds, roughly 8 hours, before my clustering stops.
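As a sanity check, the arithmetic above can be run through a few lines of Java. The buffer sizes and message rate are the approximate values from this mail; your own values will differ, so treat this as an estimate, not a formula:

```java
// Back-of-the-envelope estimate of how long clustering keeps working
// when 3-byte Acks pile up unread in the TCP buffers.
public class AckBufferMath {
    public static void main(String[] args) {
        int recvBuf = 90 * 1024;   // approximate receive buffer, bytes
        int sendBuf = 32 * 1024;   // approximate send buffer, bytes
        int ackSize = 3;           // bytes per Ack

        // How many unread Acks fit before the writer blocks (~41642).
        int acksUntilStall = (recvBuf + sendBuf) / ackSize;

        double messagesPerSecond = 1.5;  // observed session messages/sec
        double hoursUntilStall = acksUntilStall / messagesPerSecond / 3600;

        System.out.printf("acks until stall:  %d%n", acksUntilStall);
        System.out.printf("hours until stall: %.1f%n", hoursUntilStall);
    }
}
```

With these numbers the stall arrives after roughly 7.7 hours, consistent with the "roughly 8 hours" observed above.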

But I could also watch the receive buffer filling up 3 bytes at a time in a lab environment.
(Use netstat, for example, and watch the Recv-Q column grow.)

Why I have only been seeing this problem for the last two weeks is a mystery to me too.

