httpd-dev mailing list archives

From Rainer Jung <>
Subject Re: mod_proxy / mod_proxy_balancer
Date Wed, 06 May 2009 08:03:47 GMT
Caution: long response!

On 05.05.2009 22:41, jean-frederic clere wrote:
> Jim Jagielski wrote:
>> On May 5, 2009, at 3:02 PM, jean-frederic clere wrote:
>>> Jim Jagielski wrote:
>>>> On May 5, 2009, at 1:18 PM, jean-frederic clere wrote:
>>>>> Jim Jagielski wrote:
>>>>>> On May 5, 2009, at 12:07 PM, jean-frederic clere wrote:
>>>>>>> Jim Jagielski wrote:
>>>>>>>> On May 5, 2009, at 11:13 AM, jean-frederic clere wrote:
>>>>>>>>> I am trying to get the worker->id and the scoreboard
>>>>>>>>> logic moved in the reset() when using a balancer, those
>>>>>>>>> need a different handling if we want to have a shared
>>>>>>>>> information area for them.
>>>>>>>> The thing is that those workers are not really handled
>>>>>>>> by the balancer itself (nor should be), so the reset() shouldn't
>>>>>>>> apply. IMO, mod_proxy inits the generic forward/reverse workers
>>>>>>>> and m_p_b should handle the balancer-related ones.
>>>>>>> Ok, by running the m_p_b child_init() first, the worker is
>>>>>>> initialised by the m_p_b logic and mod_proxy won't change it
>>>>>> Yeah... a quick test indicates, at least as far as the perl
>>>>>> framework is concerned, that changing it so m_p_b runs first in
>>>>>> child_init results in normal and expected behavior.... Need to do
>>>>>> some more tracing to see if we can copy the pointer instead of the
>>>>>> whole data set with this ordering.
>>>>> I have committed the code... It works for my tests.
>>>> Beat me to it :)
>>>> BTW: I did create a proxy-sandbox from 2.2.x in hopes that a
>>>> lot of what we do in trunk we can backport to 2.2.x....
>>> Yep but I think we should first have the reset()/age() stuff working
>>> in trunk before backporting to httpd-2.2-proxy :-)
>> For sure!!
>> BTW: it seems to me that aging is only really needed when the
>> environment changes,
>> mostly when a worker comes back, or when the actual limits are changed
>> in real-time during runtime. Except for these, aging doesn't seem to
>> really add much... long-term steady state only gets invalid when the
>> steady-state changes, after all :)
>> Comments?
> I think we need it for a few reasons:
> - When a worker is idle the information about its load is irrelevant.
> - Being able to calculate throughput and load balance using that
> information is only valid if you have a kind of ticker.
> - In some tests I have made with a mixture of long sessions and single
> request "sessions" you need to "forget" the load caused by the long
> sessions.

Balancing and stickiness are conflicting goals. Once a session is
created, stickiness dictates the node; balancing tries to distribute
load equally, so it needs to choose the least loaded node.

In most situations applications need stickiness, so balancing will not
happen under ideal conditions; instead it tries to keep load equal even
though most requests are sticky.

Because of the influence of sticky requests, the accumulated load can
end up distributed very unevenly across the nodes. Should the balancer
try to correct such accumulated differences?

It depends (yeah, as always): what we actually mean by "load" varies
with the application. Abstractly we are talking about resource usage.
The backend nodes have limited resources, and we want to make optimal
use of them by distributing the resource usage equally.

For some applications CPU is the limiting resource. CPU is typically
tied to the requests actually in flight, not to longer-lived objects
like sessions. Of course not all requests need an equal amount of CPU,
but as long as we can't actually measure CPU load, balancing the number
of parallel requests in the sense of "busyness" should be best for CPU.
Because CPU monitoring is often done on the basis of averages (and not
the maximum short-term use per interval), some accumulation of request
counts as a basis for balancing will result in better measured numbers
(though not necessarily in a better "smallest maximum load"). If we do
not age, then a strongly unequal historic distribution (caused by
stickiness) will result in an opposite unequal distribution as soon as
a lot of non-sticky requests come in. I think that's not optimal.
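A toy model makes the flip visible. Nothing here is the actual mod_proxy_balancer code; the pick-the-smallest-counter rule and the numbers are purely illustrative:

```c
/* Toy request-count balancer without aging. Sticky traffic has left
 * worker A with a much larger accumulated count than worker B. When
 * non-sticky requests arrive, every one of them lands on B until the
 * historic imbalance is paid off -- the opposite unequal distribution
 * described above. Illustrative only, not the real algorithm. */
int route_nonsticky(long count_a, long count_b, int requests)
{
    long count[2] = { count_a, count_b };
    int to_b = 0;

    for (int i = 0; i < requests; i++) {
        int w = (count[0] <= count[1]) ? 0 : 1;  /* least counted wins */
        count[w]++;
        if (w == 1)
            to_b++;
    }
    return to_b;  /* how many of the non-sticky requests hit B */
}
```

With a sticky history of 1000 counted requests on A and 100 on B, route_nonsticky(1000, 100, 900) sends all 900 non-sticky requests to B; aging the counters would shrink that payoff window.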

Other applications are memory bound. Memory is needed for request
handling but also for session handling. Accumulating data matters more
here, because of the sessions. Again, we cannot be perfect, because we
don't get a notification when a session expires or a user logs out, so
we can only count the "new" sessions. In my opinion this counter also
needs some aging, so that we don't compensate for historic inequality
without bounds. I must confess that I don't have an example of how this
inequality can arise for sessions when balancing new-session requests
(stickiness doesn't influence this), but I think balancing based on old
data is the wrong model here too.

Another important resource is bandwidth. Here we are more concerned
with the amount of transferred data. Although in all real cases I know
of, the limiting bandwidth sits in front of the web server and not
between the web server and the backend, this would be a case to consider
at least theoretically. Here again stickiness conflicts with optimal
balancing, so we can end up with a very unequal distribution which
should not be compensated for later, so aging seems appropriate.

Finally, the number of backend connections (and, depending on the
backend connector, the number of threads needed for them on the backend)
is often a limiting resource. That would be best handled by "busyness",
which doesn't need aging, because it is not an accumulating counter but
a snapshot. "busyness" does behave somewhat unexpectedly under low load
though (when the measured busyness is nearly always 0).
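A minimal sketch (a hypothetical helper, not the real lbmethod_bybusyness code) of why a busyness snapshot needs no aging but degenerates under low load:

```c
/* Pick the worker with the fewest requests currently in flight.
 * "busyness" is a snapshot, not an accumulating counter, so there is
 * nothing to age. But when load is low, every entry is usually 0 and
 * the choice collapses to the tie-breaker (here: the first worker).
 * Hypothetical sketch, not the actual lbmethod_bybusyness code. */
int pick_by_busyness(const int *busy, int nworkers)
{
    int best = 0;
    for (int i = 1; i < nworkers; i++) {
        if (busy[i] < busy[best])
            best = i;
    }
    return best;
}
```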

> The next question is how do we invoke the ageing?
> - Via a thread that calls it after an elapsed time.
> - When there is a request and the actual time is greater than the time
> we should have called it.

Since the numbers one would like to age are global across all Apache
children, one needs to use a global mutex in the second case. Another
detail: mod_jk models aging as dividing by 2 once a minute. Of course
the factor and the interval could be varied. When aging is coupled with
request handling, it becomes division by 2^n, where n is the number of
intervals that have passed since the last request (that's not relevant
when there is load, but it is relevant when people start testing without
a stress-testing tool by issuing single clicks).
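The divide-by-2^n variant can be done lazily on the request path. A sketch under the assumptions above (the struct and names are hypothetical, not mod_proxy's, and since the real counters are shared across all children, this would have to run under a global mutex):

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical per-worker load record; not the actual mod_proxy data. */
typedef struct {
    uint64_t load_count;  /* accumulated count used for balancing */
    time_t   last_aged;   /* start of the interval we last aged in */
} worker_load;

#define AGE_INTERVAL 60   /* divide by 2 once a minute, as in mod_jk */

/* Lazy aging on the request path: divide by 2^n, where n is the number
 * of whole intervals that passed since the last aging step. */
void age_worker(worker_load *w, time_t now)
{
    time_t elapsed = now - w->last_aged;
    if (elapsed < AGE_INTERVAL)
        return;

    uint64_t n = (uint64_t)(elapsed / AGE_INTERVAL);
    if (n >= 64)
        w->load_count = 0;    /* a shift >= the width is undefined in C */
    else
        w->load_count >>= n;  /* divide by 2^n */

    /* stay aligned to the interval grid */
    w->last_aged = now - (elapsed % AGE_INTERVAL);
}
```

After three idle minutes a count of 1000 becomes 1000 >> 3 = 125, so the historic load fades quickly once traffic stops.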

Once the watchdog can be assumed to be fully accepted, I think the first
option should be considered.

Finally: the data used to make the balancing decision should be kept
separate from the statistical data used to monitor the usage of the
nodes. The latter can accumulate without aging. Additionally, making the
decision data available via monitoring (balancer manager) helps in
verifying the correctness of the balancer.


