httpd-dev mailing list archives

From Jess Holle <je...@ptc.com>
Subject mod_proxy/mod_proxy_balancer bug
Date Tue, 14 Apr 2009 21:12:50 GMT
proxy_handler() calls ap_proxy_pre_request() inside a do loop over 
balanced workers.

This in turn calls proxy_balancer_pre_request() which does

    (*worker)->s->busy++;

Correspondingly proxy_balancer_post_request() does:

        if (worker && worker->s->busy)
            worker->s->busy--;

Unfortunately, proxy_handler only calls proxy_run_post_request() and 
thus proxy_balancer_post_request() outside the do loop.  Thus the "busy" 
count of workers which currently cannot take requests (e.g. that are 
currently dead) increases without bound due to retries -- and is never 
reset.
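
To make the imbalance concrete, here is a minimal, self-contained sketch of the pattern described above. It is not the httpd code: the sketch_* names, the two-field worker struct and the "usable" flag are hypothetical stand-ins, and the second function is only one possible shape for a fix (pairing each increment with a decrement inside the loop), not a tested patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared worker state; only the
 * counter discussed above is modeled. */
typedef struct {
    int busy;    /* in-flight request count */
    int usable;  /* 0 = dead worker, 1 = can take requests */
} sketch_worker;

/* Models the increment attributed to proxy_balancer_pre_request(). */
static void sketch_pre_request(sketch_worker *w)
{
    w->busy++;
}

/* Models the guarded decrement in proxy_balancer_post_request(). */
static void sketch_post_request(sketch_worker *w)
{
    if (w && w->busy)
        w->busy--;
}

/* Retry loop as described: pre_request runs for every attempt, but
 * post_request runs once, after the loop, for the last worker only.
 * Returns the index of the worker that served, or -1. */
static int sketch_handle_leaky(sketch_worker *workers, size_t n)
{
    sketch_worker *last = NULL;
    int served = -1;

    for (size_t i = 0; i < n; i++) {
        last = &workers[i];
        sketch_pre_request(last);
        if (last->usable) {
            served = (int)i;
            break;
        }
        /* Dead worker: we move on without undoing busy++. */
    }
    sketch_post_request(last);
    return served;
}

/* One possible alternative (an assumption, not a proposed patch):
 * undo the increment for every attempt, inside the loop. */
static int sketch_handle_balanced(sketch_worker *workers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        sketch_worker *w = &workers[i];
        int ok;

        sketch_pre_request(w);
        ok = w->usable;
        sketch_post_request(w);
        if (ok)
            return (int)i;
    }
    return -1;
}
```

With one dead worker followed by one live worker, five calls to sketch_handle_leaky() leave the dead worker's busy at 5 and the live worker's at 0, while sketch_handle_balanced() leaves both at 0.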

Does anyone (i.e. who is more familiar with this code) have suggestions 
for how this should be fixed?  If not, I can take a swing at it.

Similarly, when retrying workers in various routines in 
mod_proxy_balancer.c, those workers' lbstatus is incremented.  If the 
retry fails, however, the lbstatus is never reset.  This issue also 
leads to an lbstatus that increases without bound.  Just because a 
worker was dead for 8 hours does not mean it can handle all the work 
load now.  It needs to start fresh -- not 8 hours in the hole.  This 
issue also creates an unduly huge impact when doing

    mycandidate->s->lbstatus -= total_factor;

We're seeing the load balancing be thrown dramatically off in this case.
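
The effect of a stale lbstatus can be shown with a small model. This is a simplified sketch, not the mod_proxy_balancer.c code: it assumes a selection pass that adds each worker's lbfactor to its lbstatus, picks the highest lbstatus, and charges the winner total_factor. The sketch_* names are hypothetical, and the stale value is simply set by hand to stand in for whatever accumulated while the worker was dead.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-worker balancing state. */
typedef struct {
    int lbfactor;  /* configured weight */
    int lbstatus;  /* running score; highest wins */
} sketch_lb_worker;

/* One selection pass: bump every worker by its lbfactor, pick the
 * highest lbstatus (first one wins ties), then charge the winner
 * total_factor. Returns the index of the chosen worker. */
static size_t sketch_pick(sketch_lb_worker *w, size_t n)
{
    int total_factor = 0;
    size_t best = 0;

    for (size_t i = 0; i < n; i++) {
        w[i].lbstatus += w[i].lbfactor;
        total_factor += w[i].lbfactor;
        if (w[i].lbstatus > w[best].lbstatus)
            best = i;
    }
    w[best].lbstatus -= total_factor;
    return best;
}

/* Count how many of `rounds` consecutive picks land on worker idx. */
static int sketch_share(sketch_lb_worker *w, size_t n, size_t idx,
                        int rounds)
{
    int hits = 0;

    for (int r = 0; r < rounds; r++) {
        if (sketch_pick(w, n) == idx)
            hits++;
    }
    return hits;
}
```

In this model, two equally weighted workers that both start at lbstatus 0 split ten requests 5/5. If the first worker instead comes back with a leftover lbstatus of 20, it receives all ten of the next ten requests, which is the kind of skew that resetting lbstatus on retry would avoid.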

Does anyone have suggestions for how this should be fixed?  If not, 
again I can take a swing at this, e.g. resetting lbstatus to 0 in 
ap_proxy_retry_worker().

It *seems* like both of these issues center on the handling of dead 
workers, especially having multiple dead workers and/or workers that 
are dead for long periods of time.

I've not yet checked whether mod_jk (where I believe these basic 
algorithms came from) has similar issues.

--
Jess Holle

