Date: Tue, 2 Mar 2004 11:47:55 -0400 (AST)
From: "Marc G. Fournier"
To: dev@httpd.apache.org
Subject: Re: FreeBSD 4.x and Apache2: worker MPM issue
In-Reply-To: <2147483647.1078180247@localhost>
Message-ID: <20040302113616.K23774@ganymede.hub.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Mon, 1 Mar 2004, Justin Erenkrantz wrote:

> What we believed was that it was related to race conditions inside the
> OS scheduler handler where our poll calls got mixed up with the
> scheduler's polls.  We had it tracked down to some gnarly stuff inside
> the libc_r scheduler and gave up...

Note that *BSD is looking at a 4.10 RSN, and I'm fighting to get this
fixed, if it's possible, which is why I'm trying to come up with some data
to fight with ... Is there anywhere a summary of this "gnarly stuff"?
Something you could point me at, that I can use/question?

> > so am I mis-configuring something?  I'm running the default
> > httpd.conf, and the worker stuff is set up as:
> >
> >   StartServers         2
> >   MaxClients         150
> >   MinSpareThreads     25
> >   MaxSpareThreads     75
> >   ThreadsPerChild     25
> >   MaxRequestsPerChild  0
> >
> > so I would have expected no more than 4 processes to be running, no?
>
> Well, I'd expect it to be no more than 6 (150 / 25).  But, yah, I'm not
> making sense of your 'ps auxl' output either.  Is it possible that
> FreeBSD is showing the threads as processes?  That'd make the count
> about right if there is only one process.  (Linux used to do that, but
> I forget *BSD's behavior.)
>
> I also know that you must have two worker processes to trigger it.
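As a back-of-envelope check of that 150 / 25 arithmetic (just the ceiling
division the directives imply, not the worker MPM's actual child-management
code; the helper name here is made up):

```python
import math

def worker_process_counts(max_clients, threads_per_child, min_spare_threads):
    """Rough worker-MPM expectations: the most children ever needed to
    serve MaxClients, and the fewest that can hold MinSpareThreads."""
    max_children = math.ceil(max_clients / threads_per_child)
    min_children_for_spares = math.ceil(min_spare_threads / threads_per_child)
    return max_children, min_children_for_spares

# Config quoted above: MaxClients 150, ThreadsPerChild 25, MinSpareThreads 25
print(worker_process_counts(150, 25, 25))  # (6, 1)
# Raising MinSpareThreads to 50 forces at least two children to stay up
print(worker_process_counts(150, 25, 50))  # (6, 2)
```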
> You may need to set 'MinSpareThreads' to 50 to ensure that you always
> have two processes up.  If you look in STATUS ("FreeBSD, threads, and
> worker MPM" entry) that is the other pre-requisite.

k, changed it to:

  StartServers         3
  MaxClients         150
  MinSpareThreads     50
  MaxSpareThreads     75
  ThreadsPerChild     25
  MaxRequestsPerChild  0

ps auxl (shows the parent id so that I can find all children) shows:

pluto# ps auxl | grep 20098
root 20098 0.0 0.1 4516 2912 ?? Ss 11:44AM 0:00.06 /usr/local/sbin/
  0 20098     1 0 2 0 4516 2912 poll Ss ?? 0:00.06 /usr/local/sbin/httpd
www 20101 0.0 0.1 6408 3056 ?? S  11:44AM 0:00.03 /usr/local/sbin/
 80 20101 20098 0 2 0 6408 3056 poll S  ?? 0:00.03 /usr/local/sbin/httpd
www 20102 0.0 0.1 6408 3056 ?? S  11:44AM 0:00.02 /usr/local/sbin/
 80 20102 20098 0 2 0 6408 3056 poll S  ?? 0:00.02 /usr/local/sbin/httpd
www 20103 0.0 0.1 6408 3056 ?? S  11:44AM 0:00.01 /usr/local/sbin/
 80 20103 20098 0 2 0 6408 3056 poll S  ?? 0:00.01 /usr/local/sbin/httpd

which is what I would expect ... now, running http_load with a rate of 2
(simple), I'm still left with those three processes ...
19 fetches, 1 max parallel, 28823 bytes, in 10.0108 seconds
1517 mean bytes/connection
1.89795 fetches/sec, 2879.19 bytes/sec
msecs/connect: 157.772 mean, 189.588 max, 152.671 min
msecs/first-response: 173.843 mean, 244.612 max, 160.396 min
HTTP response codes:
  code 200 -- 19

increase it by 10x, still three processes, and a telnet/GET afterwards is
still responsive:

192 fetches, 9 max parallel, 291264 bytes, in 10.0123 seconds
1517 mean bytes/connection
19.1764 fetches/sec, 29090.6 bytes/sec
msecs/connect: 162.432 mean, 228.586 max, 152.396 min
msecs/first-response: 174.894 mean, 221.808 max, 159.584 min
HTTP response codes:
  code 200 -- 192

increase it to 50, jumped to four processes, then went back down to three,
and a telnet/GET afterwards is still responsive:

> http_load -rate 50 -seconds 10 /tmp/urls
433 fetches, 77 max parallel, 656861 bytes, in 10.003 seconds
1517 mean bytes/connection
43.2871 fetches/sec, 65666.5 bytes/sec
msecs/connect: 443.495 mean, 3411.55 max, 251.228 min
msecs/first-response: 417.855 mean, 3750.21 max, 269.668 min
HTTP response codes:
  code 200 -- 433

Great, so either it did get fixed at some point and nobody has
acknowledged it, or I'm really doing something wrong trying to trigger it
:(  That last run, if I read the http_load notes right, would have hit it
500 times in 10 seconds, which should have simulated a good load, I would
have thought?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
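A quick sanity check on the http_load figures above (this assumes
http_load derives its summary lines as plain ratios of the raw totals,
which all three runs are consistent with; `summarize` is a made-up helper,
not part of http_load):

```python
def summarize(fetches, total_bytes, seconds):
    """Recompute http_load's derived numbers from its raw totals."""
    return {
        "mean_bytes_per_connection": total_bytes / fetches,
        "fetches_per_sec": fetches / seconds,
        "bytes_per_sec": total_bytes / seconds,
    }

# Second run above: 192 fetches, 291264 bytes, in 10.0123 seconds
s = summarize(192, 291264, 10.0123)
print(s["mean_bytes_per_connection"])   # 1517.0
print(round(s["fetches_per_sec"], 4))   # 19.1764
print(round(s["bytes_per_sec"], 1))     # 29090.6

# -rate 50 for 10 seconds is ~500 attempted fetches, of which 433
# completed within the window -- so that last run really was ~500 requests.
```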