tomcat-users mailing list archives

From Christopher Schultz
Subject Re: http request (no only session) replication in cluster
Date Wed, 12 Jun 2013 14:10:59 GMT

On 6/11/13 5:04 PM, Ja kub wrote:
> The requirement is that the system should be able to process 160
> req/sec (200 is better to multiply), and the system is a kind of
> failover proxy itself.
> There are 2 backing webservices, and each can take at most 20s to
> answer. If there is a timeout on the first, I must call the second;
> if there is a timeout on the second, I send a SOAP fault to the
> client. So usually it shouldn't be more than 20s per req; guys say
> that normally it is 7-10 seconds/request, but in the worst case it
> is 2*20s*160 requests/s ~= 6400 pending requests (and according to
> the deal we must fulfill the worst case).

If you have 2 member nodes and one of them starts to slow down, then
you'll see pretty much all requests re-tried on the second node, which
will slow down that one. I think you'll end up seeing a storm of
requests bouncing back and forth.

Worse, the initial request will continue processing on the 1st node,
ignorant of the fact that the lb has given up and tried the other
node. It's just going to fall apart from there.

Honestly, your lb should be able to handle this -- can't you set a
time-out there?
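
To illustrate the point about the first node carrying on after the
caller has given up, here is a minimal Java sketch. It is not Tomcat,
lb, or CXF code: the class and method names are hypothetical, and a
sleeping task stands in for the slow node. The caller waits with a
time-out, stops waiting, and the task still runs to completion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; a sleeping task stands in for a slow node.
class AbandonedWorkDemo {

    /**
     * Starts a slow task, waits for it only up to callerTimeoutMillis, and
     * returns true when the caller timed out AND the task still finished.
     */
    static boolean workOutlivesCallerTimeout(long workMillis, long callerTimeoutMillis) {
        CountDownLatch workDone = new CountDownLatch(1);

        CompletableFuture<String> slowNode = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(workMillis); // simulated slow request processing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
            workDone.countDown(); // the work completed, wanted or not
            return "late response";
        });

        boolean callerGaveUp = false;
        try {
            slowNode.get(callerTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // This is where a caller would re-try elsewhere. Nothing here
            // tells the slow task to stop.
            callerGaveUp = true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }

        try {
            // Give the abandoned task ample time to finish, then report.
            boolean finished = workDone.await(workMillis + 5_000, TimeUnit.MILLISECONDS);
            return callerGaveUp && finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(workOutlivesCallerTimeout(300, 50));
    }
}
```

With a 300 ms task and a 50 ms time-out the method should report true:
timing out on the calling side does not, by itself, free any capacity
on the node that is still doing the work.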

> Even if there are that many requests, they are pending on sockets.
> I am trying to do it with NIO, asynchronous servlets and async CXF:
> the webservice I expose is async CXF, and I also call the backing ws
> with async CXF. I think even one Tomcat on one server should be able
> to serve such 6400 pending requests at 160 req/s. Apart from the
> proxy there are also 4-6 inserts into the database (cli req, resp;
> 1st ws call, resp; 2nd ws call, resp).
> How do you assess such an architecture/approach? Do you expect
> problems with an async exposed webservice based on async servlets
> and NIO, and an async CXF ws client? AFAIK CXF uses thread locals;
> are they all right with Tomcat async servlets? (I don't define
> thread locals myself, only CXF possibly does.)

It's not a socket-resource issue, it's a raw work-load issue: you have
a large amount of load and it looks like you can't handle it very
well. I would recommend adding more nodes first, and then seriously
considering whether re-trying on a second node is appropriate if the
first node takes too long.
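
For a rough sense of the numbers, the quoted worst case is just
arrival rate times time in the system (Little's law). The sketch below
redoes that arithmetic in Java; the class and method names are
hypothetical, and the even split across nodes is an assumption, not
something a real lb guarantees.

```java
// Hypothetical helper for back-of-the-envelope capacity estimates.
class InFlightEstimate {

    /** Little's law: requests in flight = arrival rate x time in system. */
    static long inFlight(long requestsPerSecond, long secondsPerRequest) {
        return requestsPerSecond * secondsPerRequest;
    }

    /** Assumes a perfectly even split across nodes, rounded up. */
    static long perNode(long inFlight, int nodes) {
        return (inFlight + nodes - 1) / nodes;
    }

    public static void main(String[] args) {
        long worst = inFlight(160, 2 * 20);      // both backends time out
        long typical = inFlight(160, 10);        // upper end of the 7-10 s range
        System.out.println(worst);               // prints 6400
        System.out.println(typical);             // prints 1600
        System.out.println(perNode(worst, 2));   // prints 3200
    }
}
```

Every extra second of latency adds another 160 requests in flight, so
cutting the time per request reduces the load in the same proportion
that adding nodes spreads it.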

What you should probably do is actually profile your code to find out
what is taking so long. Tricks like ThreadLocals can shave
microseconds off a request, not whole seconds.
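
A real profiler is the better tool, but as a first pass you can time
the stages of a request by hand. The following is a minimal sketch
under that assumption: StageTimer and the stage names are
hypothetical, and the class is meant for one request on one thread
(it is not thread-safe).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical per-request stage timer; not thread-safe.
class StageTimer {
    private final Map<String, Long> nanosByStage = new LinkedHashMap<>();

    /** Runs the work, adds its elapsed time to the named stage, returns its result. */
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.merge(stage, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total nanoseconds recorded for a stage, or 0 if it never ran. */
    long nanosFor(String stage) {
        return nanosByStage.getOrDefault(stage, 0L);
    }

    /** Name of the stage with the largest total time, or null if nothing was timed. */
    String slowestStage() {
        return nanosByStage.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    /** Demo helper: sleeps without forcing callers to handle InterruptedException. */
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        timer.time("db insert", () -> { pause(5); return null; });      // simulated stage
        timer.time("backend call", () -> { pause(60); return null; });  // simulated stage
        System.out.println("slowest: " + timer.slowestStage());
    }
}
```

Logging the per-stage totals for a sample of requests is usually
enough to tell whether the time goes to the backing webservices, the
database inserts, or your own code.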

You might want to consider whether you can do less work during a
request -- perhaps split a single transaction into more than one. Or,
just acknowledge that sometimes a transaction can take 10-20 seconds
(or 50?) and manage the clients' expectations.

You also need to find out where your bottleneck is: RDBMSs, slow
disks, slow network links, etc. can all be much more significant than
things like software stack and exact implementation of your code. If
you are missing an index on a relational table, transactions that
should take a second or two can take tens of seconds.

Start there: profile your application, find out what is slow, and fix
that. Don't try to work around the problem with surprising
transactional re-tries, because they likely won't work the way you
hope. Hey, once you fix your performance problem, perhaps you won't
need additional hardware. Also, your users will be very happy to see a
speed improvement.

-chris