tomcat-users mailing list archives

From Dominik Pospisil <>
Subject Re: mod_jk maintenance, recovery
Date Fri, 11 Jan 2008 16:10:26 GMT
Hi Rainer,

thanks a lot for prompt response.

> Dominik Pospisil wrote:
> > Hello,
> > I am having the following problem with a failover test scenario.
> >
> > Cluster setup:
> > - 1 apache load balancer
> > - 2 nodes with equal LB factor
> > - sticky session turned on
> > - Apache/2.0.52, mod_jk/1.2.26
> >
> > Test scenario:
> > 1. start 1st node
> > 2. start load driver
> > 3. start 2nd node
> > 4. wait for state transfer (2 minutes)
> > 5. kill 1st node
> >
> > My experience is that after stages 1 and 2, all clients are handled
> > correctly by the 1st node, and the second node is correctly set to ERR state.
> > After a while, the second node switches to ERR/REC state.
> >
> > However, at stage 4 (after starting the 2nd node) the second node never
> > comes up to OK state. I have set both the worker maintain period and the
> > LB recovery_time to 30s, so I guess that within 2 minutes the second node
> > should have been re-checked. When I press the "Reset worker state"
> > button manually, it comes up immediately, but that never happened
> > automatically during the maintenance phase.
> I would expect that your load driver only sends sticky requests, i.e.
> requests with either cookie or URL encoding for node cluster01. At least
> that would fit your observation.
> mod_jk detects during maintenance whether a worker was in error state long
> enough to try again. This happens in your setup, as you can see by the
> ERR/REC state. The next request that comes in *and does not contain a
> session id of another node* will be routed to the REC node. Under load,
> you won't see this state often, because most of the time it should turn
> into ERR or OK very quickly.

Hmm, I see. But I would not agree that under load this would necessarily turn 
to ERR or OK quickly. I am also generating heavy load, it is just that no new 
clients / sessions are arriving. This could happen in a real scenario too.
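If I read the explanation right, the routing rule could be sketched like this (a toy model in Python, not mod_jk's actual code; worker names and states are made up for illustration). It shows why purely sticky traffic never probes the recovering node:

```python
# Toy model of the described routing rule: a worker in REC state is only
# probed by a request that does NOT carry a session id bound to another worker.

OK, ERR, REC = "OK", "ERR", "REC"

def route(workers, session_node=None):
    """Pick a worker name for one request.

    workers: dict mapping worker name -> state
    session_node: route hint extracted from the session id (sticky), or None
    """
    # Sticky routing: honor the session's node while it is usable.
    if session_node and workers.get(session_node) in (OK, REC):
        return session_node
    # No usable stickiness: a recovering worker gets probed first,
    # otherwise any OK worker is taken.
    for name, state in workers.items():
        if state == REC:
            return name
    for name, state in workers.items():
        if state == OK:
            return name
    return None  # everything in ERR -> 503 for the client

workers = {"node1": OK, "node2": REC}

# Heavy load, but every request is sticky to node1:
targets = {route(workers, session_node="node1") for _ in range(1000)}
print(targets)          # only node1 is ever chosen, node2 never leaves REC

# A single request without a session would probe the recovering node:
print(route(workers))
```

So under a load of existing sessions only, the probe request that would move node2 from REC to OK simply never arrives.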

> Maybe your app sets a cookie and the load driver always presents that
> cookie. That way all further requests would be handled as sticky and
> routed to the first node.
> You can find out by logging %{Cookie}i in your httpd access log. If you
> include this in your LogFormat, you can see the incoming Cookie header
> for each request.

Yes, that's true: all of my clients are initialized at the beginning of the 
test, and the corresponding sessions are created. 
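For anyone else trying this, the logging change Rainer suggests could look roughly like the following (a sketch based on the stock combined format; adjust the format name and log path to your own httpd.conf):

```apache
# httpd.conf -- append the incoming Cookie header to each access log line
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Cookie}i\"" combined_cookie
CustomLog logs/access_log combined_cookie
```

With that in place, every logged request shows the cookies it carried, so sticky JSESSIONID traffic is easy to spot.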

> > Eventually, after killing the 1st node, and after returning a couple of "503
> > Service Temporarily Unavailable" errors, mod_jk finally rechecks the 2nd
> > node's status, reroutes requests to the 2nd node, and resumes correct operation.
> >

Still, it is not clear to me why I am getting 503 errors. I believe 
that when there is one or more servers up and ready to serve requests, this 
should not happen. Why are the requests not immediately rerouted to the second 
node, which has been up and running for a couple of minutes (in ERR/REC state)?
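For reference, the relevant part of my configuration is roughly the following (a sketch with placeholder worker names and hosts; note that in the mod_jk docs the lb property is spelled recover_time, while worker.maintain is a global setting):

```properties
# workers.properties -- global maintenance interval (seconds)
worker.maintain=30

worker.list=loadbalancer
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=node1,node2
worker.loadbalancer.sticky_session=True
# how long a worker stays in ERR before it becomes eligible for recovery
worker.loadbalancer.recover_time=30

worker.node1.type=ajp13
worker.node1.host=node1.example.com
worker.node1.port=8009
worker.node1.lbfactor=1

worker.node2.type=ajp13
worker.node2.host=node2.example.com
worker.node2.port=8009
worker.node2.lbfactor=1
```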


