Message-ID: <3E14B4D3.9030802@cnet.com>
Date: Thu, 02 Jan 2003 13:53:23 -0800
From: Brian Pane
To: dev@httpd.apache.org
Subject: Re: [PATCH] remove some mutex locks in the worker MPM
References: <4E88DDB5-1DD6-11D7-B5B4-000393B3C494@clove.org>

Aaron Bannert wrote:

> The patch looks good at first glance. Have you done any testing
> to see how much it improves performance (on UP and MP machines)
> and if it has any effect when APR is built with generic atomics?

Here are the performance numbers that I have. I ran httpd-2.1.0-dev
on an 8x167MHz CPU Sun with Solaris 8 (32-bit mode). The client
driver sent a fixed number of concurrent requests for a 1-byte file
(to keep the time spent in network writes from overshadowing the
results). I tested with both the SPARC V8+ native atomic ops and
APR's mutex-based default atomics:

50 clients
                               load   %CPU   req/s
standard 2.1.0-dev worker      4.59   0.53   1090
patched
  w/native atomic ops          4.57   0.53   1097
  w/mutex atomic ops           4.70   0.52   1093

100 clients
                               load   %CPU   req/s
standard 2.1.0-dev worker      4.74   0.54   1070
patched
  w/native atomic ops          4.69   0.54   1083
  w/mutex atomic ops           4.61   0.54   1067

Basically, the patch results in a slightly higher throughput with
lower CPU load. That matches what I'd expect from a reduction of
mutex contention.
With the mutex-based fallback implementation of the apr_atomic API,
performance was slightly worse than the original worker code in the
100-client case, but faster in the 50-client case. (The effect of
using my worker patch with the mutex-based atomics is to increase
the number of lock calls while reducing the amount of time spent in
each critical region. These two effects seem to counteract each
other.)

Brian
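
For readers unfamiliar with the distinction between native and
mutex-based atomics discussed above, here is a rough sketch of the two
strategies. It is not code from the patch or from APR: it assumes C11
<stdatomic.h> and POSIX threads rather than the apr_atomic API, and the
names native_add, fallback_add, and both counters are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counters; not taken from the worker MPM or APR. */

/* Native path: one hardware atomic read-modify-write, no lock taken. */
static atomic_uint_fast32_t native_counter;

static uint32_t native_add(uint32_t delta)
{
    /* Returns the value held before the addition. */
    return (uint32_t)atomic_fetch_add(&native_counter, delta);
}

/* Fallback path: emulate the atomic op with a mutex around a plain
 * variable. Every call costs a lock/unlock pair, but the critical
 * region is only the few instructions between them. */
static uint32_t fallback_counter;
static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;

static uint32_t fallback_add(uint32_t delta)
{
    uint32_t old;

    pthread_mutex_lock(&fallback_lock);
    old = fallback_counter;
    fallback_counter += delta;
    pthread_mutex_unlock(&fallback_lock);
    return old;
}
```

In the fallback form, code that used to hold one lock across several
updates would instead take a lock once per update, which is the
"more lock calls, shorter critical regions" trade-off described in the
message.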