httpd-dev mailing list archives

From Ruediger Pluem <rpl...@apache.org>
Subject Re: Behaviour of mod_proxy_ajp if CPING/CPONG fails
Date Sun, 07 Sep 2008 19:45:46 GMT


On 09/07/2008 12:43 PM, Rainer Jung wrote:
> Ruediger Pluem schrieb:
>>
>> On 09/06/2008 10:54 PM, Rainer Jung wrote:
>>> Rüdiger Plüm schrieb:

>>
>>> But in case the user of the connection knows that it's broken, and
>>> closes it, it would make more sense to put it under the stack, since it
>>> is no longer connected.
>> Why? IMHO this only defers the effort that needs to be made anyway to
>> create a new TCP connection. And keep in mind that even if I put the
>> faulty connection back *under* the stack, the next connection on top of the
>> stack has been idle even longer than the one that was faulty. So it is likely
>> to be faulty as well. It might be the case, though, that this faultiness is
>> detected earlier (in the TCP connection check) and thus our next CPING/CPONG
>> in the loop happens with a fine fresh TCP connection.
> 
> Yes, I think that's the question: if one CPING fails, what do you want
> to do with the remaining connections in the pool?
> 
> - do you assume they are broken as well and close them directly (and
> afterwards try to open a new one)
> - do you want to test them immediately (and if one of them is still OK
> maybe end the testing and use it)
> - you don't care and simply try to open a new one.
> 
> Concerning the argument that the next connection on the stack would have
> been idle even longer: yes, unless another thread returned one to the pool
> in the meantime.

Of course. This could have happened, but as this connection needs to be fixed
sooner or later anyway, I think I should do it now.
I guess this is a matter of assumption and probability:
If you assume that a healthy connection was put back in the reslist by another
thread and that the broken connection won't be needed in the near future
anyway, then the approach of getting another one from the reslist makes sense.
If you assume that you will get the broken one back anyway and that it will be
needed in the near future, the other approach (fixing it right away) makes sense.
As you may have noticed, I lean towards the second assumption :-).
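
To make the two assumptions concrete, here is a toy sketch in Python. It is
not the mod_proxy or reslist code: `Conn`, `take_next_from_pool`,
`reconnect_faulty_one` and `stale_pool` are hypothetical names, and the
`healthy` flag merely stands in for the outcome of a CPING/CPONG. It only
counts how many failed CPINGs each approach has to sit through when the
whole LIFO pool has gone stale.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    name: str
    healthy: bool  # stands in for "CPING/CPONG would succeed"


def take_next_from_pool(stack):
    """First assumption: on CPING failure, fetch another connection from the stack."""
    failed = 0
    while stack:
        conn = stack.pop()  # LIFO: most recently returned connection first
        if conn.healthy:
            return conn, failed
        failed += 1  # faulty connection is dropped in this toy model
    return Conn("new", True), failed  # pool exhausted: open a new connection


def reconnect_faulty_one(stack):
    """Second assumption: on CPING failure, repair the same connection right away."""
    conn = stack.pop()
    if conn.healthy:
        return conn, 0
    conn.healthy = True  # stands in for close + fresh TCP connect
    return conn, 1


def stale_pool():
    # Bottom of the stack first; every entry has been idle long enough to go stale.
    return [Conn("oldest", False), Conn("older", False), Conn("top", False)]
```

With a fully stale pool the first function fails once per pooled connection
before it opens a new one, while the second fails exactly once. If another
thread pushed a healthy connection in the meantime, the first function would
find it on its next pop.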

> 
> My personal opinion (based on mod_jk's connection handling code): CPING
> failure is very rare. In most cases, it indicates that a firewall has
> dropped the connection; in most remaining cases the backend is in a very

I recently had a situation on one of my systems (JBOSS 4.0.x with Tomcat 5.5,
classic connector) where this wasn't true AFAICT. Both httpd
and JBOSS are on the same box, so there was definitely no firewall / network
issue that caused the problem. CPINGs failed with a timeout, but new
connections worked fine, I couldn't find any blocked processor threads
in the thread dump, and neither load nor GC activity was very high. That was
the starting point for my patch, as I thought that a failed CPING should not
be the final verdict on the request, but should trigger one more try. I fixed
this temporarily with a somewhat tricky LB configuration over the one backend
with one retry attempt.
But to be honest, I do not know what caused this strange situation. As the
JBOSS version and thus the Tomcat version is quite aged, it is possible
that there is a bug in the classic connector that has been fixed in the
meantime. But I am straying from the subject of the thread here.

> badly broken state. In both situations nearly all remaining
> connections would be broken as well, but not necessarily all (in the
> firewall case, there might be fewer idle connections coming back from
> other threads). So a good reaction to a CPING failure would be a pool
> wide connection check and using a new connection.

In general I agree, but doing this in the scope of the request is IMHO too
time consuming and expensive.

> 
> If you are afraid that the check of all connections in the pool takes
> too long (maybe running into TCP timeouts), you could directly try a

That is what the patch does.

> fresh connection, and set an indicator in the pool, that the maintenance
> task, which is usually only looking for idle connections to close,
> should additionally do a function check for all connections. That would
> be non-critical concerning latency for requests, once maintenance runs
> decoupled from a request (like what Mladen suggested, either in a
> separate thread or using the monitor hook).

This seems like a nice idea once we have some kind of maintenance "thread".
But I am not sure how this can be done with the current reslist implementation
because of its stack character. Keep in mind that we cannot extend the API of
the reslist until APR-UTIL 1.4.0 and cannot change it until APR-UTIL 2.0.
So this could prove to be tricky.
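
Just to sketch the idea itself, leaving the reslist problem aside: the toy
Python model below uses a plain list instead of a reslist, so it sidesteps
exactly the part that is tricky for us. `Pool`, `check_all`,
`on_cping_failure` and `maintain` are hypothetical names, not APR or
mod_proxy API, and the `healthy` entry stands in for a real function check.

```python
class Pool:
    def __init__(self, conns):
        self.conns = list(conns)
        self.check_all = False  # hypothetical indicator set after a CPING failure

    def on_cping_failure(self):
        """Request path: do not scan the pool, just flag it and use a fresh connection."""
        self.check_all = True
        return {"name": "fresh", "healthy": True}

    def maintain(self):
        """Maintenance path, decoupled from requests: full check only if flagged."""
        if not self.check_all:
            return 0  # a normal run would only close idle connections here
        before = len(self.conns)
        self.conns = [c for c in self.conns if c["healthy"]]
        self.check_all = False
        return before - len(self.conns)  # number of broken connections removed
```

The request only pays for one new connection; the expensive pool-wide check
happens later in the maintenance run.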

>> One question, as you are more familiar with the AJP server code on the
>> Tomcat side:
>> If a connector closes down a connection due to its idleness, does it send
>> any kind of AJP shutdown packet via the TCP connection, or does it just
>> close the socket like in the HTTP keepalive case?
> 
> Unfortunately the AJP13 protocol is a little weak on connection
> handling. There is no message indicating the backend shuts down its
> connection. So it just closes the socket.

Thanks for pointing this out. This is all I wanted to know. If there were
unread data (such as an AJP connection close packet), the detection of whether
the remote side closed the socket wouldn't work.
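
A small Python sketch of why unread data matters. It shows the general
technique (poll for readability, then peek one byte and treat an empty read
as EOF), not the actual httpd implementation; `looks_closed_by_peer` and
`demo` are hypothetical names, and a local socket pair stands in for the
AJP connection.

```python
import select
import socket


def looks_closed_by_peer(sock):
    """Return True if a non-blocking peek sees EOF, i.e. the peer closed the socket."""
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending: connection is assumed to be alive
    data = sock.recv(1, socket.MSG_PEEK)  # peek, so nothing is consumed
    return data == b""  # empty read means EOF


def demo():
    # Case 1: the peer just closes the socket (what AJP13 backends do).
    a, b = socket.socketpair()
    b.close()
    plain_close = looks_closed_by_peer(a)
    a.close()

    # Case 2: the peer sends a (made-up) goodbye message before closing.
    a, b = socket.socketpair()
    b.sendall(b"BYE")
    b.close()
    close_after_data = looks_closed_by_peer(a)
    a.close()
    return plain_close, close_after_data
```

In the second case the peek returns the pending bytes instead of EOF, so the
check reports the connection as still open although the peer is gone.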

Regards

Rüdiger
