apr-dev mailing list archives

From: Brian Pane <bp...@pacbell.net>
Subject: Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?
Date: Fri, 27 Jul 2001 07:05:02 GMT

Justin Erenkrantz wrote:

>On Thu, Jul 26, 2001 at 10:50:58PM -0700, Brian Pane wrote:
>
>>But there's a problem with the SMS lock management.
>>According to gprof, every call to apr_sms_trivial_malloc
>>acquires and releases a lock.
>
>Yup.  I'm working on this right now.  =)  (What I have in my tree
>right now isn't in commit shape.)  Email me and I can send you
>my tree diff.
>
>I'm debating whether we're getting a win by the SMS stuff or not.
>In order to do that, we need to kick out every lock we can.

Yep, I've been wondering the same thing lately.
The original pool implementation of apr_palloc is
very close to optimal.  In the last benchmark that
I did with non-mod_include requests, 99.998% of the
calls to apr_palloc didn't require locking.  So it's
going to be tough to improve upon the original design.

The biggest opportunity that I see for optimization
in the memory-management framework is in sub-pool
creation.  If an SMS-based implementation can reduce
the cost of creating sub-pools, it will speed up

But first, we may have a more fundamental problem:

>Right now, I've got it so that most of the locks are now in libc
>(aka NIMBY), but the performance still doesn't match pools (by a
>factor of 2).  I'm scratching my head as to why this is.

Hmmm... looking at the code, it makes sense that SMS is
half as fast as the original pools code.  I didn't realize
this until just now, but the polymorphism in the SMS framework
will probably make it impossible to match the performance of pools:

* apr_palloc (the original pools version) is a very lightweight
  function in the fast-path case where it doesn't need to
  acquire a lock.  It consists of a couple of integer/pointer
  arithmetic operations and two comparisons.

* The SMS-based implementation has to do essentially the same
  work, but it also does an extra function call (apr_sms_malloc
  calls apr_sms_trivial_malloc).

* If the cost of a function call is similar to the cost of
  the two comparisons and the handful of arithmetic operations
  in apr_palloc, that would explain why the SMS version
  takes twice as long.
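
To make the comparison in the list above concrete, here is a rough
sketch in C. It is not the actual APR or SMS source: every name in it
(sketch_pool, sketch_palloc, sketch_sms, and so on) is made up for
illustration, the 8-byte alignment is an assumption, and the slow path
that would take a lock and fetch a new block is reduced to returning
NULL. It only shows the shape of a lock-free fast path and the extra
indirect call that a polymorphic wrapper adds on top of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump-pointer pool; not APR's real data structure. */
typedef struct sketch_pool {
    char *first_avail;  /* next free byte in the current block */
    char *endp;         /* one past the end of the current block */
} sketch_pool;

void sketch_pool_init(sketch_pool *p, char *buf, size_t len)
{
    assert(p != NULL && buf != NULL);
    p->first_avail = buf;
    p->endp = buf + len;
}

/* Direct fast path: a little arithmetic and two comparisons, no lock.
 * A real allocator would lock and grab a new block where this
 * sketch returns NULL. */
void *sketch_palloc(sketch_pool *p, size_t size)
{
    size_t avail = (size_t)(p->endp - p->first_avail);
    size_t rounded;
    char *result;

    if (size > avail)                     /* comparison 1 */
        return NULL;
    rounded = (size + 7) & ~(size_t)7;    /* assumed 8-byte alignment */
    if (rounded > avail)                  /* comparison 2 */
        return NULL;

    result = p->first_avail;
    p->first_avail += rounded;
    return result;
}

/* Hypothetical polymorphic layer: the allocator is reached through a
 * function pointer, so every allocation pays for one extra call. */
typedef struct sketch_sms sketch_sms;
struct sketch_sms {
    void *(*malloc_fn)(sketch_sms *self, size_t size);
    sketch_pool pool;
};

/* The concrete allocator behind the function pointer. */
void *sketch_trivial_malloc(sketch_sms *self, size_t size)
{
    return sketch_palloc(&self->pool, size);
}

/* Generic entry point: same work as above plus the indirect call. */
void *sketch_sms_malloc(sketch_sms *self, size_t size)
{
    return self->malloc_fn(self, size);
}
```

When the underlying work is this small, the dispatch through
malloc_fn can plausibly cost about as much as the allocation itself,
which is the effect described above. Whether it accounts for the
whole factor of 2 would need to be measured.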

>-- justin
>P.S. You are using gprof, how?  I tried -pg and it just doesn't
>work.  I switched to Forte 6.0U1's collect program now.  It
>actually writes out info that I can use (er_print is a bit
>awkward though).

I'm using gcc on Linux to build profiled code; it's not properly
including a profiled libc for reasons that I haven't had time to
debug yet, but it does a decent job of instrumenting the apr and
httpd code.

