tomcat-users mailing list archives

From Igor Cicimov <icici...@gmail.com>
Subject Re: Tuning session replication on clusters
Date Thu, 06 Sep 2012 05:32:49 GMT
On Thu, Sep 6, 2012 at 11:59 AM, <kharper2@oreillyauto.com> wrote:

> Alright, I did some more testing with another application and found the
> following:
>
> Sess    Time (sec)
> 10      0.101
> 125     0.101
> 500     0.201
> 1500    0.201
> 1800    0.101
> 2400    0.101
> 42,000  0.901  (that's not a typo)
>
> Turns out the application that was having trouble is storing a silly amount
> of crap in the session.  I am still not 100% sure what's happening behind
> the scenes at the 1500 session mark, but I'm guessing that based on our
> session size (nearly 700 MB) and memory configuration we were hitting some
> heap ceiling and the replication was forced to 'juggle'.  If anyone has any
> more background on what's happening feel free to set me straight.
>
> I'll check back later... I need to go beat some developers...
>
>
> Kyle Harper
>
>
>
>
> From:   kharper2@oreillyauto.com
> To:     "Tomcat Users List" <users@tomcat.apache.org>
> Date:   09/05/2012 07:55 PM
> Subject:        Re: Tuning session replication on clusters
>
>
>
> I'm working with Lee on this as well, so I can help answer most of that.
>
> In short: yes, all our replication is working well.  We have keepalived
> acting as a VRRP device (no round-robin DNS) in front of a few web servers
> (Apache 2.2.x, mod_proxy/mod_proxy_ajp) which are using sticky sessions and
> BalancerMembers.  Replication (DeltaManager/SimpleTcpCluster) is working
> as intended on the Tomcat side (6.0.24).
>
> After further research, the problem we're seeing is performance with
> replication when the number of sessions is larger than around 2000.  Using
> Jmeter on our test servers I can reproduce the problem.  Here are the times
> it takes to replicate X number of sessions when an application is
> restarted:
> Sess    Time (sec)
> 10      0.101
> 125     0.401
> 500     1.302
> 1500    2.104
> 1800    5.308
> 1800    6.709
> 2400    15.02
> 3600    30.285
> 3600    27.238
>
> The times make sense until around 1500.  The time it takes to replicate
> more than 1500 sessions becomes exponentially worse.  Here is our cluster
> configuration from "node1":
>     <Engine name="Catalina" defaultHost="localhost" jvmRoute="tntest-app-a-1">
>       <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>                channelSendOptions="8">
>         <Manager className="org.apache.catalina.ha.session.DeltaManager"
>                  stateTransferTimeout="45"
>                  expireSessionsOnShutdown="false"
>                  notifyListenersOnReplication="true" />
>         <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>           <Membership className="org.apache.catalina.tribes.membership.McastService"
>                       address="239.255.0.1"
>                       port="45564"
>                       frequency="500"
>                       dropTime="3000" />
>
>           <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>                     address="auto"
>                     port="4000"
>                     autoBind="100"
>                     selectorTimeout="5000"
>                     maxThreads="6" />
>
>           <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>             <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>                        timeout="45000" />
>           </Sender>
>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>           <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>         </Channel>
>
>         <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
>         <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
>
>         <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>         <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>       </Cluster>
>
>
> The best time we got for 3600 sessions was 24 seconds, and that's when I
> added the following to the Manager tag (stole this from the 5.5 docs; not
> even sure it's valid in 6.x):
>                  sendAllSessions="false"
>                  sendAllSessionsSize="500"
>                  sendAllSessionsWait="20"
>
>
> What has me stumped is why the time required to do more sessions is
> exponentially higher beyond 1500 sessions.  Using JMeter I can simulate
> 3600 new users (all creating a session) and the two servers can serve the
> requests AND generate/replicate the sessions in under 19 seconds.  Any
> ideas would be greatly appreciated.  I have a full test environment to
> simulate anything you might recommend.
>

Maybe that's the limit of the 6 threads the Receiver uses for messages
between the cluster members, given the huge size of your sessions? By
default the Sender keeps a pool of 25 simultaneous connections to each of
the other nodes, so increasing this pool (the poolSize attribute on the
Transport element inside the Sender) might speed things up.
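
As a rough sketch of the two knobs I mean, based on the Receiver and
Sender from your node1 config: the values maxThreads="8" and poolSize="50"
are only illustrative guesses on my part, not tested recommendations, so
measure before and after changing them.

```xml
<!-- Sketch only: maxThreads and poolSize values are assumptions to experiment with -->
<Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
          address="auto"
          port="4000"
          autoBind="100"
          selectorTimeout="5000"
          maxThreads="8" />

<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
  <!-- poolSize defaults to 25 connections per remote node when omitted -->
  <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
             timeout="45000"
             poolSize="50" />
</Sender>
```

Changing one value at a time and rerunning your JMeter restart test should
show whether either limit is really the bottleneck.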


>
> Sincerely,
> Kyle Harper
>
>
>
>
>
> From:            Igor Cicimov <icicimov@gmail.com>
> To:              Tomcat Users List <users@tomcat.apache.org>
> Date:            09/05/2012 07:12 PM
> Subject:                 Re: Tuning session replication on clusters
>
>
>
> On Thu, Sep 6, 2012 at 5:51 AM, <llowder@oreillyauto.com> wrote:
>
> >
> > I have a small cluster of 3 nodes running tomcat 6.0.24 with openJDK
> > 1.6.0_20 on Ubuntu 10.04 LTS.
> >
> > I have roughly 5,000-6,000 sessions at any given time, and when I restart
> > one of the nodes I am finding that not all sessions are getting
> > replicated, even when I have the state transfer timeout set to 60
> > seconds.
> >
> > It seems that only sessions that have been touched recently are
> > replicated, even if the session is still otherwise valid. I did one test
> > where I created about 1,500 sessions and then took out one node. When I
> > brought it back online, it only replicated the 4-5 sessions that were
> > from active users on the test cluster. It did not replicate the idle
> > sessions that were still valid that my prior test had created.
> >
> > I want to tune my settings, but I am unsure where the best place to
> > start would be. Should I start with the threads available to the NIO
> > Receiver, or would I be better off focusing on a different set of
> > attributes first, such as the send or receive timeout values?
> >
> > Any tips or pointers as to which setting might be the most productive
> > would be greatly appreciated.
> >
> > Lee Lowder
> > O'Reilly Auto Parts
> > Web Systems Administrator
> > (417) 862-2674 x1858
> >
> > This communication and any attachments are confidential, protected by
> > Communications Privacy Act 18 USCS § 2510, solely for the use of the
> > intended recipient, and may contain legally privileged material. If you
> > are not the intended recipient, please return or destroy it immediately.
> > Thank you.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> > For additional commands, e-mail: users-help@tomcat.apache.org
> >
> >
> For starters, does your cluster satisfy the requirements below?
>
> To run session replication in your Tomcat 6.0 container, the following
> steps should be completed:
>
>    - All your session attributes must implement java.io.Serializable
>    - Uncomment the Cluster element in server.xml
>    - If you have defined custom cluster valves, make sure you have the
>    ReplicationValve defined as well under the Cluster element in server.xml
>    - If your Tomcat instances are running on the same machine, make sure
>    the tcpListenPort attribute is unique for each instance. In most cases
>    Tomcat is smart enough to resolve this on its own by autodetecting
>    available ports in the range 4000-4100
>    - Make sure your web.xml has the <distributable/> element
>    - If you are using mod_jk, make sure that jvmRoute attribute is set at
>    your Engine <Engine name="Catalina" jvmRoute="node01" > and that the
>    jvmRoute attribute value matches your worker name in workers.properties
>    - Make sure that all nodes have the same time and sync with an NTP
>    service!
>    - Make sure that your load balancer is configured for sticky session
>    mode.
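
To illustrate the <distributable/> item from that checklist, a minimal
web.xml could look like the sketch below. The Servlet 2.5 namespace and
version are my assumption for an application on Tomcat 6; your real
descriptor will also carry its own servlets and mappings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- WEB-INF/web.xml sketch: namespace and version assumed for Servlet 2.5 -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="2.5">
  <!-- Marks the application's sessions as eligible for replication -->
  <distributable/>
</web-app>
```
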
>
>
> Also, you don't say what you are using for load balancing. It would help
> to post your cluster definition as well.
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
