Return-Path: X-Original-To: apmail-httpd-dev-archive@www.apache.org Delivered-To: apmail-httpd-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 0D988FC9A for ; Thu, 11 Apr 2013 13:38:39 +0000 (UTC) Received: (qmail 86176 invoked by uid 500); 11 Apr 2013 13:29:13 -0000 Delivered-To: apmail-httpd-dev-archive@httpd.apache.org Received: (qmail 85729 invoked by uid 500); 11 Apr 2013 13:28:57 -0000 Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm Precedence: bulk Reply-To: dev@httpd.apache.org list-help: list-unsubscribe: List-Post: List-Id: Delivered-To: mailing list dev@httpd.apache.org Received: (qmail 85167 invoked by uid 99); 11 Apr 2013 13:28:48 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 11 Apr 2013 13:28:48 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=5.0 tests=RCVD_IN_DNSWL_LOW,SPF_NEUTRAL X-Spam-Check-By: apache.org Received-SPF: neutral (nike.apache.org: local policy) Received: from [74.125.83.41] (HELO mail-ee0-f41.google.com) (74.125.83.41) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 11 Apr 2013 13:28:41 +0000 Received: by mail-ee0-f41.google.com with SMTP id c1so785683eek.0 for ; Thu, 11 Apr 2013 06:28:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=maven.pl; s=maven; h=x-received:from:to:subject:date:user-agent:cc:references :in-reply-to:mime-version:content-type:content-transfer-encoding :message-id; bh=HmUxFKFXInKoUsUPX7t1yuwULCYmJL6k1oaOAIpx1vU=; b=QrIagncs+Q9ApKPKVTCNlJZ9KdqQa1fvKiKe+lQv2YDwHEBQ6I0pDyXl+ARAeKzKsc WiEF96AHjtWlOakUbIXMqWMRUAxXDkoI7zrNAzSGpEoqMxPghzWbHskSP92JTAZfI8yO /ZA0DlMZcOME3ebPvVXW6d+7AJABgv2pYjceI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:from:to:subject:date:user-agent:cc:references :in-reply-to:mime-version:content-type:content-transfer-encoding :message-id:x-gm-message-state; bh=HmUxFKFXInKoUsUPX7t1yuwULCYmJL6k1oaOAIpx1vU=; b=AmtArW5gJt13TEo6iMF8dRykDj/d3vw8jLnesThwfgjZVGw5/a0CKMro2wSBPyAYj0 iHSknrUx6RnbpI/gF/ZSBJTO7hhCu3G6AlwKJs1zsROk+Vdl6HWEs4M/j2A4DXhphmTQ CXolkQXqrSJOvqXRWmOUWxLaCGPVa3F5C6u4PI0G1SgS95i+iA76uRn4g7OkHDCQnpWv YXTJwbHVgvfenc02c2GdcHc8UrQnFr5aAl25FqGs2znFJjBSWFU7SNaR27TmunBDmOhp sgij35nETS3nScSJQJeJpTP52y+WY3mFE39n862JrgqP2MzMAhHJ71CyBTZN2ezSJNjq oJpQ== X-Received: by 10.15.76.132 with SMTP id n4mr17060402eey.16.1365686900688; Thu, 11 Apr 2013 06:28:20 -0700 (PDT) Received: from t400.localnet ([91.234.176.244]) by mx.google.com with ESMTPS id f47sm5592894eep.13.2013.04.11.06.28.19 (version=TLSv1.2 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 11 Apr 2013 06:28:20 -0700 (PDT) From: Arkadiusz =?utf-8?q?Mi=C5=9Bkiewicz?= To: dev@httpd.apache.org Subject: Re: 2.4.4 graceful restart taking long time Date: Thu, 11 Apr 2013 15:28:18 +0200 User-Agent: KMail/1.13.7 (Linux/3.8.6-dirty; KDE/4.10.2; x86_64; ; ) Cc: "=?utf-8?q?Pl=C3=BCm?=, =?utf-8?q?R=C3=BCdiger?=, Vodafone Group" References: <201304111343.03985.arekm@maven.pl> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <201304111528.18891.arekm@maven.pl> X-Gm-Message-State: ALoCoQkr27MPHXHECHRsqg9fka5FMxCBsUwAeFZk8kFuyEe56EglBQlUFhmR/IEa3fy+HAZU8USf X-Virus-Checked: Checked by ClamAV on apache.org On Thursday 11 of April 2013, Pl=C3=BCm, R=C3=BCdiger, Vodafone Group wrote: > > On apache 2.2.22, 2.2.23 and 2.4.4 I'm able to reproduce a problem > > where graceful restart takes very long time. Linux 3.7.10, glibc 2.17 > > here. > > Example strace of main httpd process while doing graceful restart: > > http://pastebin.com/QFH5TjT6 >=20 > From the strace Here is similar strace but without ab running, one second is enough http://pastebin.com/HKjxxP2p (StartServers 64 this time, ab -c 64) > it looks like the connect takes a lot of time later on (the > poll is waiting 1 second for the connect to complete).=20 That strace was done with StartServers 128 but ab was using -c 64, so looks= =20 like idle children responded fast (that first second), busy - too slowly. Another strace including children http://ixion.pld-linux.org/~arekm/apache1.txt Graceful restart and first OPTIONS write at=20 12594 15:10:01.397356 write(7, "OPTIONS * HTTP/1.0\r\nUser-Agent: Apache/2.= 4.4=20 (Unix) (internal dummy connection)\r\n\r\n", 83 but first read is long after write 13279 15:10:52.636685 <... read resumed> "OPTIONS * HTTP/1.0\r\nUser-Agent:= =20 Apache/2.4.4 (Unix) (internal dummy connection)\r\n\r\n", 8000) =3D 83=20 <0.000014> Huh? That's even after resuming operations mesage which was a bit earlier: 12594 15:10:52.606253 write(2, "[Thu Apr 11 15:10:52.606236 2013]=20 [mpm_prefork:notice] [pid 12594] AH00163: Apache/2.4.4 (Unix) configured --= =20 resuming normal operations\n", 137) =3D 137 <0.000009> Compare that to case where ab wasn't running: http://ixion.pld-linux.org/~arekm/apache-no-ab.txt write OPTIONS in main process and read in childrens are interleaving as=20 expected. Now why these aren't interleaving in apache1.txt where ab was running? > As the accept call > on httpd side only returns when the first data is send on the socket, IMHO > the time the poll takes does take place in the kernel and not in the httpd > children code. Well, I think kernel is not the one to blame. > Have you checked your messages file if the kernel reports > something when this happens? How does your run queue and CPU load look > like when this happens (top)? No kernel messages, no cpu spikes - looks normal. > Regards > R=C3=BCdiger =2D-=20 Arkadiusz Mi=C5=9Bkiewicz, arekm / maven.pl