httpd-dev mailing list archives

From Arkadiusz Miśkiewicz <ar...@maven.pl>
Subject Re: 2.4.4 graceful restart taking long time
Date Thu, 11 Apr 2013 13:28:18 GMT
On Thursday 11 of April 2013, Plüm, Rüdiger, Vodafone Group wrote:

> > On apache 2.2.22, 2.2.23 and 2.4.4 I'm able to reproduce a problem
> > where graceful restart takes very long time. Linux 3.7.10, glibc 2.17
> > here.
> > Example strace of main httpd process while doing graceful restart:
> > http://pastebin.com/QFH5TjT6
> 
> From the strace

Here is a similar strace, but without ab running; one second is enough:
http://pastebin.com/HKjxxP2p (StartServers 64 this time, ab -c 64)

> it looks like the connect takes a lot of time later on (the
> poll is waiting 1 second for the connect to complete). 

That strace was done with StartServers 128, but ab was using -c 64, so it looks
like the idle children responded fast (within that first second) and the busy
ones responded too slowly.
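
To see from the outside how long such a connect takes, the parent's dummy connection can be imitated with a small probe. This is only a minimal sketch of what the strace shows (non-blocking connect, wait up to 1 second, write the OPTIONS request); it is not httpd's actual code, and the function name and timeout parameter are made up for illustration.

```python
import errno
import selectors
import socket
import time

# Request bytes copied from the strace above (83 bytes).
REQUEST = (b"OPTIONS * HTTP/1.0\r\n"
           b"User-Agent: Apache/2.4.4 (Unix) (internal dummy connection)\r\n\r\n")


def timed_dummy_connection(host, port, timeout=1.0):
    """Hypothetical probe: return (connected, seconds spent waiting for connect)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    sel = selectors.DefaultSelector()
    start = time.monotonic()
    try:
        err = sock.connect_ex((host, port))
        if err not in (0, errno.EINPROGRESS):
            # Connect failed immediately (for example, connection refused).
            return False, time.monotonic() - start
        # Wait until the socket is writable, i.e. the connect has finished.
        sel.register(sock, selectors.EVENT_WRITE)
        ready = sel.select(timeout)
        elapsed = time.monotonic() - start
        if not ready:
            return False, elapsed  # timed out, like the 1 second poll
        if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) != 0:
            return False, elapsed  # connect completed with an error
        sock.send(REQUEST)
        return True, elapsed
    finally:
        sel.close()
        sock.close()
```

Calling `timed_dummy_connection("127.0.0.1", 80)` in a loop while ab is running would show whether the connect itself is slow, or only the children's read.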

Another strace, this time including the children:
http://ixion.pld-linux.org/~arekm/apache1.txt

The graceful restart and the first OPTIONS write happen at

12594 15:10:01.397356 write(7, "OPTIONS * HTTP/1.0\r\nUser-Agent: Apache/2.4.4 (Unix) (internal dummy connection)\r\n\r\n", 83 <unfinished ...>

but the first read comes long after that write:

13279 15:10:52.636685 <... read resumed> "OPTIONS * HTTP/1.0\r\nUser-Agent: Apache/2.4.4 (Unix) (internal dummy connection)\r\n\r\n", 8000) = 83 <0.000014>

Huh?

That's even after the "resuming normal operations" message, which was logged
slightly earlier:

12594 15:10:52.606253 write(2, "[Thu Apr 11 15:10:52.606236 2013] [mpm_prefork:notice] [pid 12594] AH00163: Apache/2.4.4 (Unix) configured -- resuming normal operations\n", 137) = 137 <0.000009>

Compare that to the case where ab wasn't running:
http://ixion.pld-linux.org/~arekm/apache-no-ab.txt
There, the OPTIONS writes in the main process and the reads in the children
interleave as expected.

So why don't they interleave in apache1.txt, where ab was running?
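
The gap between the first OPTIONS write and the first OPTIONS read can be pulled out of such a trace mechanically. A minimal sketch, assuming the trace was taken with `strace -f -tt` (pid and time-of-day prefix, as in the excerpts above), that wrapped lines have been rejoined, and that the trace does not cross midnight; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches "<pid> <HH:MM:SS.ffffff> <rest of strace line>".
LINE_RE = re.compile(r"^(\d+)\s+(\d{2}:\d{2}:\d{2}\.\d{6})\s+(.*)$")
MARKER = "OPTIONS * HTTP/1.0"


def parse(line):
    """Split one strace line into (pid, timestamp, rest), or None if no match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    pid, ts, rest = m.groups()
    return int(pid), datetime.strptime(ts, "%H:%M:%S.%f"), rest


def options_delay(lines):
    """Seconds between the first OPTIONS write and the first OPTIONS read."""
    first_write = first_read = None
    for line in lines:
        parsed = parse(line)
        if parsed is None or MARKER not in parsed[2]:
            continue
        _, ts, rest = parsed
        # Only look at the syscall part, i.e. the text before the first quote.
        syscall = rest.split('"', 1)[0]
        if first_write is None and syscall.startswith("write("):
            first_write = ts
        elif first_read is None and "read" in syscall:
            first_read = ts
    if first_write is None or first_read is None:
        return None
    return (first_read - first_write).total_seconds()


# The two lines quoted above from apache1.txt, rejoined.
sample = [
    r'12594 15:10:01.397356 write(7, "OPTIONS * HTTP/1.0\r\nUser-Agent: Apache/2.4.4 (Unix) (internal dummy connection)\r\n\r\n", 83 <unfinished ...>',
    r'13279 15:10:52.636685 <... read resumed> "OPTIONS * HTTP/1.0\r\nUser-Agent: Apache/2.4.4 (Unix) (internal dummy connection)\r\n\r\n", 8000) = 83 <0.000014>',
]
delay = options_delay(sample)
print(f"{delay:.6f}")
```

For the two lines quoted above this prints 51.239329, i.e. the first read of the dummy request shows up about 51 seconds after the first write.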

> As the accept call
> on httpd side only returns when the first data is send on the socket, IMHO
> the time the poll takes does take place in the kernel and not in the httpd
> children code.

Well, I don't think the kernel is the one to blame.

> Have you checked your messages file if the kernel reports
> something when this happens? How does your run queue and CPU load look
> like when this happens (top)?

No kernel messages, no CPU spikes; everything looks normal.

> Regards
> Rüdiger

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl
