Date: Tue, 1 Apr 2008 12:47:58 +0200 (CEST)
From: Ronald Klop
To: Tomcat Users List
Cc: David Rees
Subject: Re: Cluster Memory Leak - ClusterData and LinkObject classes

On Mon Mar 31 21:13:25 CEST 2008 Tomcat Users List <users@tomcat.apache.org> wrote:

On Mon, Mar 31, 2008 at 3:38 AM, Ronald Klop <ronald-mailinglist@base.nl> wrote:
>
> See my previous mail about send/receive buffers filling because Ack wasn't
> read by FastAsyncSender.
> The option waitForAck="true" did the trick for me. But for FastAsyncSender
> you should set sendAck="false" on the receiving side.

Thanks for the information, Ronald. Can you clarify your settings by
posting a minimal configuration? I looked for the option sendAck on
the Tomcat cluster page and couldn't find any reference to that
configuration parameter:
http://tomcat.apache.org/tomcat-5.5-doc/cluster-howto.html

It looks like doing something like one of the following two is a good
idea for a barebones setup, to make sure that the acking behavior is
consistent, since Tomcat doesn't seem to ensure that the settings are sane:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
receiver.sendAck="true" sender.waitForAck="true"/>

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
receiver.sendAck="false" sender.waitForAck="false"/>

I'm a bit confused as to why this issue only affects one of my
clusters (out of 3 production clusters with identical setups) and why
more people aren't seeing it. Are most people specifying their Ack
settings? Or do most people not see enough traffic between restarts to
trigger this issue? Granted, the one that's affected also happens to
handle the most traffic by far. I'll have to do more testing on my
test cluster to verify (I've already turned on waitForAck everywhere
in production); hopefully I can reproduce it.

Anyone have information on how using Acks in the cluster affects performance?

-Dave


Hello Dave,

I attached my server.xml file. I hope the mailing list doesn't filter it.
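In case the mailing list filters the attachment, the relevant part looks roughly like this. This is only a sketch with the defaults from the cluster-howto, not my exact file; the point is that the receiver's sendAck and the senders' waitForAck agree on every node:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager">
  <Membership className="org.apache.catalina.cluster.mcast.McastService"
              mcastAddr="228.0.0.4" mcastPort="45564"
              mcastFrequency="500" mcastDropTime="3000"/>
  <!-- Receiver: sendAck here must match waitForAck on the other nodes' senders -->
  <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
            tcpListenAddress="auto" tcpListenPort="4001"
            sendAck="false"/>
  <!-- FastAsyncSender (5.5.26) does not read the Acks, so don't send or wait for them -->
  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
          replicationMode="fastasyncqueue"
          waitForAck="false"/>
</Cluster>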

I think a lot of people don't send enough sessions between the nodes to see this, or they use sticky sessions.
The problem is that the Acks are not read by Tomcat (by the FastAsyncSender). First the network receive buffer on the node that sends the sessions fills up with unread Acks. Only after that receive buffer is full does the send buffer on the other node (which sends the Acks) start to fill. And only after that send buffer is full does that node stop sending Acks and reading sessions, because it blocks in Socket.write(ack_buffer). From that moment on the node no longer reads new session data from the network, and only then will you experience failures in your application.

An Ack is only 3 bytes, so you need to sync a lot of sessions before the receive buffer and send buffer fill up.
My receive buffers are about 90 KB and my send buffers are 32 KB.
(90 KB + 32 KB) / 3 bytes = roughly 41643 acks before the syncing stops.

I see (at this moment) on average 1.5 session messages per second. So it takes me roughly 41643 / 1.5 = 28000 seconds, almost 8 hours, before my clustering stops.

But I could also see the receive buffer filling up 3 bytes at a time in a lab environment. (Use netstat, for example, to see this.)
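For example, something like this on the node that sends the sessions (addresses and port are made up for illustration; 4001 is the replication listen port from the howto defaults):

  netstat -an | grep 4001
  tcp4   41532      0  10.0.10.52.52144    10.0.10.53.4001     ESTABLISHED

The Recv-Q column (the unread Acks) grows 3 bytes per replicated message on the sending node; once it is full, the Send-Q on the other node starts to grow, long before the application itself shows any errors.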


Why I have only been seeing this problem for the last two weeks is a mystery to me too.

Ronald.


[Attachment: server.xml]