tomcat-dev mailing list archives

From Aaron Mulder
Subject Re: Distributed Session Server
Date Thu, 29 Jun 2000 02:19:53 GMT
	Contrast this to the situation where the session server goes down
and 10,000 out of 10,000 sessions are lost.  Or if not lost altogether, at
least disabled until the server is restarted (which translates to the user
as lost - "please come back and try again tomorrow" is not all that
helpful).  It seems to me like you're moving from a redundant situation to
a single point of failure.  You might argue that the session server is
less likely to go down, but this would be hard to guarantee (it's all
"complex software" to me!).  You might argue that you can use redundant
session servers, but this is just throwing more overhead at the problem
(you could equally well put that effort into more stable or fault-tolerant
web servers).
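
	To put numbers on that contrast, here is a rough Python sketch
using the figures from the quoted message below (10 web servers, 1,000
sessions each, 30 requests in flight per server).  The function names are
made up for illustration, and it assumes, as the quoted scenario does, that
exactly one tenth of every group of users depends on the failed web server.

```python
def sticky_web_server_failure(servers=10, sessions_per_server=1000,
                              in_flight_per_server=30):
    """Impact of losing ONE web server when sessions live on the web servers."""
    total = servers * sessions_per_server                    # 10,000 sessions
    in_flight_failed = in_flight_per_server                  # 30 on the dead server
    in_flight_other = in_flight_per_server * (servers - 1)   # 270 on the others
    idle = total - in_flight_failed - in_flight_other        # 9,700 idle sessions

    # Assumption from the quoted scenario: 1/servers of each group had its
    # session on the failed server.
    lost = (in_flight_failed + in_flight_other + idle) // servers
    interrupted = in_flight_failed + in_flight_other // servers
    return {"interrupted": interrupted, "sessions_lost": lost}


def session_server_failure(servers=10, sessions_per_server=1000):
    """Impact of losing the single central session server."""
    total = servers * sessions_per_server
    # Every session is stored on (or reachable only through) the dead server.
    return {"sessions_lost": total}
```

With the default figures the first function gives 57 interrupted requests
and 1,000 lost sessions, and the second gives 10,000 lost sessions.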
	If you're going to try to make the session server bulletproof (a
write-through cache of session data to persistent, transactional storage
or some such), why not put that effort into the web server?  You'd get
better performance (no serialization).  You could certainly get a traffic
manager that would pass users on dead servers over to live servers, and
the live servers could then load the session data for the new users from
storage, or whatever.
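
	Roughly what I have in mind, as a minimal Python sketch rather
than anything Tomcat actually provides: each web server keeps sessions in
a local cache and writes every update through to shared storage.  The class
name is hypothetical, a plain dict stands in for the persistent,
transactional store, and pickle stands in for whatever serialization the
store would need.

```python
import pickle


class WriteThroughSessionStore:
    """Per-web-server session cache backed by shared storage (hypothetical)."""

    def __init__(self, backing):
        # 'backing' stands in for persistent, transactional storage;
        # any dict-like object works for this sketch.
        self._backing = backing
        self._cache = {}

    def put(self, session_id, data):
        # Write through: persist first, then update the local cache.
        self._backing[session_id] = pickle.dumps(data)
        self._cache[session_id] = data

    def get(self, session_id):
        # Local hit: the live object is returned with no deserialization.
        if session_id in self._cache:
            return self._cache[session_id]
        # Miss (e.g. a user handed over from a dead server): load from storage.
        raw = self._backing.get(session_id)
        if raw is None:
            return None
        data = pickle.loads(raw)
        self._cache[session_id] = data
        return data
```

If two instances share one backing store, a session written through the
first can be read through the second after the first server dies, which is
the hand-over the traffic manager would trigger.  Reads of a locally cached
session skip deserialization; writes still serialize in this sketch.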


On Wed, 28 Jun 2000, Matthew Dornquast wrote:
> There is no magic bullet, a well designed web application must gracefully
> handle errors.  In my mind, the goal is to isolate session data from the
> webserver/tomcat processes.  Here is why in greater detail to avoid
> confusion:
> _Imagine 10 web servers:
> Each server has 1,000 active sessions
> Each server has a load of 30 concurrent connections at any given moment
> Total user population 10,000
> Total transactions in flight at any given moment 300
> _There are two scenarios for session data
> a) Sessions are distributed across the farm, so webserver #1 handles
> processing for 10% of the user population of each of the webservers.  That's
> 100 of #1's users, 100 of #2's... and so on
> b) All session data is stored in a central, fault-tolerant session server.
> All 10,000 sessions are stored on this server.
> [[ Now imagine Webserver #1 tanks, your redirector detects this, takes #1
> out of rotation. ]]
> There are now three categories of users:
> i) User had http transaction in flight with #1 when it failed.  There are 30
> of these.
> ii) User had http transaction in flight with #2-#10 when #1 failed, there
> are 270 of these.
> iii) User did NOT have an http transaction in flight on any server, but they
> are in an active session.  There are 9,700 of these.
> So here is the outcome when #1 tanks, for the two architectures:
> (a)Sticky Sessions:
> a.i) 30 users see result of interrupted data stream/timeout.  User will
> probably hit refresh or back arrow and submit again to try and recover.  27
> of these users will experience a positive refresh, 3 will not as their
> session data was on #1 which is now out of rotation.
> a.ii) 27 of the 270 users will experience a break in their servlet
> processing because execution was actually being carried out on server #1,
> some type of graceful error message will occur (hopefully).  However,
> refreshing will not solve the problem as all session data for them is now lost,
> they will have to start over.
> a.iii) upon submission 970 of the 9,700 users will experience a graceful
> error message.  Their session data is now lost, they will have to start
> over.
> Summary for a:
> We'll have 57 users that witness an interruption out of the 300 in flight.
> 1,000 users will have lost their session data and have to start over.
> (b)Session Server:
> b.i) 30 users see result of interrupted data stream.  User will probably hit
> refresh or back arrow and submit again to try and recover.  30 of these
> users will experience a positive refresh (if servlets are designed with
> transaction manager).
> b.ii) 270 users were not aware of server #1 being down.  They are not
> affected.
> b.iii) 9,700 users are not aware of server #1 being down.
> Summary for b:
> We'll have 30 users that witness an interruption of service out of the 300
> in flight.
> No users will have to start over.  (Assuming some sort of transaction broker
> or careful planning.  Without careful planning, a small percentage will have
> corrupt session data and experience some sort of graceful error message. ~ 3
> users)
> Disclaimer:
> I know, this is totally contrived, but not all that far from reality for
> some of us, is it?  Certainly, it is plausible.
> And let's not forget about the downsides for (b).  Like when your session
> farm goes down and you lose all your sessions for everybody!
> And lastly, I really like sticky sessions.  They work well for many
> situations.  Most of them, I'll warrant.
