tomcat-users mailing list archives

From Rainer Jung <rainer.j...@kippdata.de>
Subject Re: 2 second delays in mod_jk while "maintaining workers"
Date Tue, 07 Sep 2010 12:51:23 GMT
On 07.09.2010 12:40, Rainer Jung wrote:
> On 07.09.2010 12:32, John Baker wrote:
>> Rainer,
>>
>> On Tuesday 07 September 2010 11:09:46 you wrote:
>>> I don't like "socket_timeout" ...
>>>
>>>> worker.basic.socket_timeout=90
>>>
>>> but I would like socket_connect_timeout.
>>>
>>> The next two are possibly a bit short, because if the backend e.g. does
>>> a Java Garbage Collection which might take longer than 1 second, those
>>> timeouts will fire and take the node out of load balancing.
>>>
>>>> worker.basic.connect_timeout=1000
>>>> worker.basic.prepost_timeout=1000
>>
>> I've changed socket_timeout=90 to socket_connect_timeout=9000, and
>> connect_timeout/prepost_timeout to 5000, and the problem persists:
>>
>> [Tue Sep 07 11:29:22.863 2010] [364:47921619424608] [debug]
>> jk_shutdown_socket::jk_connect.c (722): About to shutdown socket 152
>> [Tue Sep 07 11:29:24.863 2010] [364:47921619424608] [debug]
>> jk_shutdown_socket::jk_connect.c (813): Shutdown socket 152 and read 0
>> lingering bytes in 2 sec.
>> [Tue Sep 07 11:29:24.868 2010] [364:47921619424608] [debug]
>> wc_maintain::jk_worker.c (339): Maintaining worker lb-jboss51-integration
>
> OK, so it would be helpful to see a more complete log, e.g.
>
> grep 364:47921619424608 LOGFILE | egrep -e '11:29:2[012345]'
>
> If there's data in it which is not meant for public posting, you can
> also send it to me via direct mail.

Thanks for the log snippets. I can now see:

- the socket close *does* happen during maintain
- this wasn't obvious from the previous log snippet, because more 
"Maintain" lines for other workers were left out

Some relevant lines are:

[Tue Sep 07 11:29:22.863 2010] [364:47921619424608] [debug] 
wc_maintain::jk_worker.c (339): Maintaining worker lb-...
...
[Tue Sep 07 11:29:22.863 2010] [364:47921619424608] [debug] 
ajp_reset_endpoint::jk_ajp_common.c (757): (worker1) resetting endpoint 
with sd = 152 (socket shutdown)
...
[Tue Sep 07 11:29:22.863 2010] [364:47921619424608] [debug] 
jk_shutdown_socket::jk_connect.c (722): About to shutdown socket 152
...
[Tue Sep 07 11:29:24.863 2010] [364:47921619424608] [debug] 
jk_shutdown_socket::jk_connect.c (813): Shutdown socket 152 and read 0 
lingering bytes in 2 sec.
...
[Tue Sep 07 11:29:24.863 2010] [364:47921619424608] [debug] 
ajp_maintain::jk_ajp_common.c (3190): cleaning pool slot=0 elapsed 185 in 2
[Tue Sep 07 11:29:24.863 2010] [364:47921619424608] [debug] 
ajp_maintain::jk_ajp_common.c (3197): reached pool min size 1 from 1 
cache slots
[Tue Sep 07 11:29:24.863 2010] [364:47921619424608] [debug] 
ajp_maintain::jk_ajp_common.c (3237): recycled 1 sockets in 2 seconds 
from 1 pool slots
...

So now we can discuss how to fix this.

- Can we detect more reliably whether we want to drain before closing 
or not?

- As a workaround: using a JkWatchdog moves the maintain into a separate 
thread. But while the socket is being closed, a lock is held, which 
blocks other threads from accessing the same worker.

We would need to rewrite the logic in ajp_maintain() to only hold the 
lock (critical section) while pulling sockets out of the ep_cache or 
entering them back, not while actually using the sockets.

I still don't know *why* the socket close blocks, i.e. why the backend 
doesn't respond to our attempt to close the socket. Here a network 
sniff, as suggested by Mladen, would help.
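Something along these lines could capture the trace (the interface name eth0 and the AJP port 8009 are assumptions; adjust to your setup):

```shell
# Capture the AJP traffic between Apache and the backend; -s 0 grabs
# full packets, -w writes a pcap file you can inspect in wireshark.
tcpdump -i eth0 -s 0 -w mod_jk_ajp.pcap tcp port 8009
```

Looking at the FIN/ACK exchange around the 2-second shutdown should show whether the backend is the one holding the connection open.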

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

