tomcat-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Session Clustering Monitoring
Date Tue, 13 Jan 2015 14:32:50 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Peter,

On 1/12/15 4:32 PM, Peter Rifel wrote:
> On 1/12/15, 11:36 AM, "Christopher Schultz"
> <chris@christopherschultz.net> wrote:
>
> On 1/12/15 2:28 PM, Peter Rifel wrote:
>>>> Chris,
>>>> 
>>>> On 1/12/15, 11:08 AM, "Christopher Schultz" 
>>>> <chris@christopherschultz.net> wrote:
>>>> 
>>>> Peter,
>>>> 
>>>> On 1/12/15 12:51 PM, Peter Rifel wrote:
>>>>>>> I'm running Tomcat 8.0.15 with Java 1.8.0_25 on Ubuntu
>>>>>>> 14.04. We have 5 instances that are all set up with
>>>>>>> session clustering as follows:
>>>>>>> 
>>>>>>> <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster">
>>>>>>>   <Manager className="org.apache.catalina.ha.session.DeltaManager"
>>>>>>>            stateTransferTimeout="5" />
>>>>>>>   <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>>>>>>>     <Membership className="org.apache.catalina.tribes.membership.McastService"
>>>>>>>                 address="${multicast}" />
>>>>>>>   </Channel>
>>>>>>> </Cluster>
>>>>>>> 
>>>>>>> -Dmulticast=228.0.0.4
>>>>>>> 
>>>>>>> To help prevent accidental misconfigurations that have 
>>>>>>> occurred in the past, I decided to implement monitoring
>>>>>>> on the session replication by checking the JMX mbean 
>>>>>>> Catalina/Manager/<host>/<context>/activeSessions
>>>>>>> attribute. Most of the time the values for the 5
>>>>>>> instances are all within 1 or 2 of each other. Over the
>>>>>>> weekend we consistently had one instance that had more
>>>>>>> sessions than the other 4. It began with 102 sessions
>>>>>>> where every other instance had 95. Over the next 36
>>>>>>> hours as more sessions were expiring over the weekend,
>>>>>>> the difference grew to 49 vs 29. Eventually it resynced
>>>>>>> and now they all report the same active session count.
>>>>>>> My question is, does anyone know why this would happen,
>>>>>>> and if this can be expected is there a better way to 
>>>>>>> monitor session replication to ensure that there isn't
>>>>>>> one instance that isn't being replicated to? I believe
>>>>>>> this only happens on weekends when most sessions are
>>>>>>> expiring and very few are being created but I may be
>>>>>>> wrong.
>>>> 
>>>> How is your load-balancer configured to distribute traffic?
>>>> 
>>>>> Two of the instances are behind one load balancer, and the
>>>>> other 3 are behind another.  They each provide a different
>>>>> service but are running the same war application and we
>>>>> want sessions clustered across both services. Each load
>>>>> balancer's initial distribution is based on the least
>>>>> number of connections, with persistence based on source
>>>>> IP.
> 
> So basically all requests are randomly sent to back-end nodes? Or
> are you using session stickiness or anything like that?
> 
>> Sorry, I should have clarified.  Stickiness is based on the
>> source ip, so requests from the same IP will be routed to the
>> same instance.  With these applications we don't expect sessions
>> to change IPs very often if at all, but if you think it would
>> help I could stick based on the JSESSIONID cookie.

I was wondering, because there is an unfortunate situation with
session stickiness and long-lived clients where fail-over can cause a
large number of clients to switch to a particular server and /stay
there/ even if they have to re-login (when you'd likely prefer that
they get re-balanced to another node if they have to re-login).

If you had a temporary failure of one node, perhaps the clients were
re-assigned to another node and they "stayed there" when the failing
node became available again. Those clients would stay there until
either their newly-assigned node failed, or they closed their browser
(assuming they are using cookies that live for the life of the client,
like JSESSIONIDs typically do).

After the weekend in your scenario, perhaps all your users restarted
their browsers (or computers) and thus were re-balanced.
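
To make the effect concrete, here is a minimal sketch of it as a toy
simulation. It is not your load balancer's actual algorithm: the class
and method names are hypothetical, and it assumes a single balancer
that sends new clients to the least-loaded live node and keeps each
client pinned until its node fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of a sticky, least-connections load balancer.
public class StickyFailoverSketch {
    // pins.get(client) = index of the node the client is pinned to
    private final List<Integer> pins = new ArrayList<>();
    private final boolean[] up;

    public StickyFailoverSketch(int nodes) {
        up = new boolean[nodes];
        Arrays.fill(up, true);
    }

    // Number of pinned clients per node.
    public int[] counts() {
        int[] c = new int[up.length];
        for (int node : pins) {
            c[node]++;
        }
        return c;
    }

    // Least-loaded node that is currently up (lowest index wins ties).
    private int leastLoadedUpNode() {
        int[] c = counts();
        int best = -1;
        for (int n = 0; n < up.length; n++) {
            if (up[n] && (best < 0 || c[n] < c[best])) {
                best = n;
            }
        }
        if (best < 0) {
            throw new IllegalStateException("no node available");
        }
        return best;
    }

    // A new client is pinned to the least-loaded live node.
    public void addClient() {
        pins.add(leastLoadedUpNode());
    }

    // On failure, the node's clients are re-pinned to surviving nodes.
    public void fail(int node) {
        up[node] = false;
        for (int i = 0; i < pins.size(); i++) {
            if (pins.get(i) == node) {
                pins.set(i, leastLoadedUpNode());
            }
        }
    }

    // On recovery, existing pins are NOT moved back; that is the problem.
    public void recover(int node) {
        up[node] = true;
    }

    public static void main(String[] args) {
        StickyFailoverSketch lb = new StickyFailoverSketch(3);
        for (int i = 0; i < 9; i++) {
            lb.addClient();
        }
        System.out.println(Arrays.toString(lb.counts())); // [3, 3, 3]
        lb.fail(0);
        lb.recover(0);
        System.out.println(Arrays.toString(lb.counts())); // [0, 5, 4]
        lb.addClient();
        System.out.println(Arrays.toString(lb.counts())); // [1, 5, 4]
    }
}
```

In this model the recovered node only picks up clients that arrive
after it comes back, so the imbalance lasts until the existing clients
go away on their own.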

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - http://gpgtools.org

iQIcBAEBCAAGBQJUtSySAAoJEBzwKT+lPKRYsy4P/3i4YqxWocydSt/gWjNv+AAG
tiwzXYCd1MsVrnqy05nFjSJqJvdwCd027xPStf6O2m+VSO4M0+siRK11iupGdrFx
aRAR93RQrMViX94dVxn+QSTot+Ma0me9igZ61y/YBceTkxXTjP9WGGi2KWG4zX9/
YMQv1Wsk030jFGETQTAcEiI+LFepWuJfaoPnDtLTGJzYuA2TvMw+MTJfVeiJ+/AG
+R1fQPorfAiP8iS23J8787CZuLsuLggo6MrgRmEZbATEFk4zy3JeK0+s0DSEUZmp
lm5tWXaHgn+IybWOdxLuxv8pGDkc2nRtu/P9PFOVFfLgxpxdbstHk51V4C3UIIg7
rHfYGninpIM1RetzTMva791WEez4V+IHQM5EpcgNn5Yt6ENCQMkslYfupplK9gT+
Ieawrg4p90T8OpIW77Ir67Xv5nQ5vtjX4HdPydOmEUuO6Be0X/m65LEuBUdPMV6R
MEy47b5e0el1krojRWKZvf8/4B2ibHWApktAxww7F7TxLCSiHKuLFKnZikZrFap9
SZHK38tdOFmqYf8LR/qlDmetRhIRxOFH9VEO98EIxI11jp9TtXWGLIuBUOEBX5Pb
D+kZR4anF12vLZmvzf5nvOJuTX7tcnMaf9Btfpfhmo/fib8zrDHVOh2bhrqEbBt/
hQ4T9XbHy4GyD/NFk+z2
=a5Ju
-----END PGP SIGNATURE-----


