apr-dev mailing list archives

From Brian Pane <bp...@pacbell.net>
Subject Pool allocation bottlenecks Re: Tag 2.0.21 was Re: daedalus is back on 2.0.21-dev
Date Thu, 19 Jul 2001 19:12:45 GMT
William A. Rowe, Jr. wrote:

>From: "Justin Erenkrantz" <jerenkrantz@ebuilt.com>
>Sent: Thursday, July 19, 2001 1:06 PM
>
>
>>I wouldn't recommend using the threaded code at all because we are still
>>doing a per-process allocation mutex which causes threaded to become
>>useless.  When that is changed (i.e. we enable SMS), I think that 
>>threaded MPM will deserve to be beat up and tested.  -- justin
>>
>
>Tag and roll today, and enable SMS.  This is now a bottleneck, and no doubt
>SMS will _significantly_ help us out with the threading/locking performance
>issues.
>
It's worth noting that, for non-server-parsed content, apr_palloc
(in the original, non-SMS implementation) doesn't actually have to
acquire a lock very often.  From gprof,

                0.00    0.00     994/1588875     apr_palloc [31]
                0.00    0.00   87710/1588875     apr_file_read [5]
                0.00    0.00  500048/1588875     apr_pool_destroy <cycle 5> [145]
                0.00    0.00  500049/1588875     free_blocks [91]
                0.00    0.00  500074/1588875     apr_pool_sub_make [143]
[87]     0.0    0.00    0.00 1588875         apr_lock_acquire [87]

The numbers mean that, out of 1,588,875 calls to apr_lock_acquire,
only 994 came from apr_palloc.
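
For anyone reading these caller lines for the first time: the pair of
numbers in front of each function name is "calls from this caller /
total calls to apr_lock_acquire".  A minimal sketch of the arithmetic
(caller_share_percent is a name made up here for illustration; it is
not part of gprof or APR):

```c
#include <assert.h>

/* Percentage of all calls to a function that came from one caller,
 * given the "called/total" pair from a gprof call-graph caller line.
 * Hypothetical helper for illustration only. */
static double caller_share_percent(unsigned long called, unsigned long total)
{
    assert(called <= total);
    if (total == 0) {
        return 0.0;             /* no calls at all: report 0% */
    }
    return (double)called * 100.0 / (double)total;
}
```

For the apr_palloc line above, caller_share_percent(994, 1588875)
comes out to roughly 0.06%.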

For a test using server-parsed requests, the pattern is very different:
                0.00    0.00   87710/14587902     apr_file_read [9]
                0.00    0.00 3000048/14587902     apr_pool_destroy <cycle 5> [22]
                0.00    0.00 3000074/14587902     apr_pool_sub_make [31]
                0.00    0.00 4000049/14587902     free_blocks [28]
                0.00    0.00 4500021/14587902     apr_palloc [27]
[13]    25.0    0.00    0.01 14587902         apr_lock_acquire [13]

Here, apr_palloc accounts for roughly 4.5 million of the 14.6 million
lock acquisitions, so a thread-specific, lock-free source of additional
blocks for an SMS will help a lot.
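
To make "thread-specific, lock-free source of blocks" concrete, here is
a rough sketch of the general idea: each thread keeps its own free list,
so getting or returning a block never touches a shared mutex.  This is
not the SMS code and not an APR API; block_t, tl_block_get and
tl_block_put are names invented for this example, and it assumes a C11
compiler for _Thread_local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block header; not an APR or SMS type. */
typedef struct block {
    struct block *next;
    size_t size;                /* usable bytes following the header */
} block_t;

/* One free list per thread, so no mutex is needed to touch it. */
static _Thread_local block_t *tl_free_list = NULL;

/* Return a block with at least min_size usable bytes, or NULL. */
static block_t *tl_block_get(size_t min_size)
{
    block_t **link = &tl_free_list;
    block_t *b;

    /* First fit from this thread's private list: no locking. */
    for (b = *link; b != NULL; link = &b->next, b = b->next) {
        if (b->size >= min_size) {
            *link = b->next;    /* unlink the block */
            b->next = NULL;
            return b;
        }
    }

    /* Miss: fall back to malloc, which may lock internally. */
    b = malloc(sizeof(*b) + min_size);
    if (b == NULL) {
        return NULL;
    }
    b->next = NULL;
    b->size = min_size;
    return b;
}

/* Hand a block back to the calling thread's free list. */
static void tl_block_put(block_t *b)
{
    assert(b != NULL);
    b->next = tl_free_list;
    tl_free_list = b;
}
```

The trade-off in a design like this is that blocks freed by one thread
are not visible to other threads, so memory can sit idle in a thread
that no longer needs it.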

Some thoughts based on the numbers:

  * For anybody working on tuning the SMS implementation, I highly
    recommend incorporating mod_include into your test cases.

  * Creating and destroying pools is the major bottleneck for
    non-server-parsed requests.  In order to achieve big speedups
    in the httpd, the SMS implementation needs to make sub-pool
    creation and destruction faster than the original pool design.

  * In the non-server-parsed case, apr_palloc is one of the most
    time-consuming functions in the httpd.  Keep in mind that it
    almost never (in this test case) has to acquire a lock and
    call new_block; instead, it's usually taking the fast path
    through the code that requires just a few arithmetic and pointer
    operations.  While it's probably possible to tune the code
    a bit, it's arguably close to optimal already.  What this
    means to me is that the real optimization opportunity for
    non-server-parsed content is not to make apr_palloc faster,
    but rather to stop calling apr_palloc so much.
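
To illustrate what "fast path" means in the last point, here is a rough
sketch of a bump-pointer allocator in the same spirit.  It is not the
apr_palloc source: every sketch_* name is made up for this example, and
the block size and alignment are arbitrary assumptions.  The common case
is a size round-up, a comparison, and a pointer increment; only the slow
path asks for a new block, which is where a threaded pool would have to
take a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_ALIGN      8u    /* assumed allocation alignment */
#define SKETCH_BLOCK_SIZE 8192u /* assumed default block payload */

/* Block header; the union forces suitable alignment of the payload. */
typedef union sketch_block {
    union sketch_block *next;
    max_align_t align_;
} sketch_block;

typedef struct sketch_pool {
    sketch_block *blocks;       /* all blocks owned by this pool */
    char *first_avail;          /* next free byte in current block */
    size_t avail;               /* bytes left in current block */
    unsigned long slow_path_calls;
} sketch_pool;

/* Slow path: get a fresh block.  A shared allocator would lock here. */
static int sketch_new_block(sketch_pool *p, size_t min_size)
{
    size_t payload = min_size > SKETCH_BLOCK_SIZE ? min_size
                                                  : SKETCH_BLOCK_SIZE;
    sketch_block *b = malloc(sizeof(*b) + payload);
    if (b == NULL) {
        return 0;
    }
    b->next = p->blocks;
    p->blocks = b;
    p->first_avail = (char *)(b + 1);
    p->avail = payload;
    p->slow_path_calls++;
    return 1;
}

static void *sketch_palloc(sketch_pool *p, size_t size)
{
    /* Round the request up to a multiple of SKETCH_ALIGN. */
    size_t need = (size + SKETCH_ALIGN - 1) & ~(size_t)(SKETCH_ALIGN - 1);
    void *mem;

    if (need < size) {
        return NULL;            /* size overflowed during round-up */
    }
    if (need > p->avail && !sketch_new_block(p, need)) {
        return NULL;
    }
    /* Fast path: bump the pointer. */
    mem = p->first_avail;
    p->first_avail += need;
    p->avail -= need;
    return mem;
}

/* Free every block at once, as pool destruction does. */
static void sketch_pool_destroy(sketch_pool *p)
{
    while (p->blocks != NULL) {
        sketch_block *next = p->blocks->next;
        free(p->blocks);
        p->blocks = next;
    }
    p->first_avail = NULL;
    p->avail = 0;
}
```

With a zero-initialized sketch_pool, a thousand 8-byte allocations fit
in a single block here, so the slow path runs once.  That is the kind of
ratio the gprof numbers show for non-server-parsed requests.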

--Brian



