I just tried out a simpler approach to fixing the mutex contention
within pool cleanups. It seems to work reasonably well, so I'm
presenting it here for feedback.
This patch does three things:
* Optimize away mutexes in the case of subrequest pool
deletion. This is an important special case because requests
with a lot of subrequests, like SSI pages, are where we see the
worst mutex overhead.
* Add support for a "free list cache" within a pool. If this is
enabled (on a per-pool basis), blocks used for the pool's descendants
will be put into this private free list rather than the global one
when the descendants are destroyed. Subsequent subpool creation
for this parent pool can take advantage of the pool's free list cache
to bypass the global free list and its mutex. (This is useful for
things like mod_include.)
* Switch to the new lock API and use nonrecursive mutexes.
(Thanks to Aaron for this suggestion. According to profiling
data, the old lock API spends literally half its time in
apr_os_thread_equal() and apr_os_thread_current().)
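To make the free list cache idea concrete, here is a minimal toy
sketch in C. It is not the patch itself and does not use the APR
structures: toy_pool_t, block_get(), push_all() and churn_subpools()
are all hypothetical names made up for illustration, and the sketch
assumes that a parent pool and its subpools are only touched by one
thread at a time (as with a request pool), which is why the private
cache needs no lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy types; these are NOT the real APR pool structures. */
typedef struct block { struct block *next; } block_t;

typedef struct toy_pool {
    struct toy_pool *parent;
    block_t *blocks;       /* blocks currently owned by this pool */
    block_t *free_cache;   /* private free list for descendants' blocks */
    int cache_enabled;     /* per-pool switch for the free list cache */
} toy_pool_t;

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static block_t *global_free = NULL;
static unsigned long global_lock_count = 0;  /* mutex acquisitions so far */

/* Move every block on the list src onto the list *dst. */
static void push_all(block_t **dst, block_t *src)
{
    while (src) {
        block_t *next = src->next;
        src->next = *dst;
        *dst = src;
        src = next;
    }
}

/* Get one block. Try the parent's private cache first (no mutex;
 * assumes single-threaded use of the parent), then the global list. */
static block_t *block_get(toy_pool_t *parent)
{
    block_t *b;
    if (parent && parent->cache_enabled && parent->free_cache) {
        b = parent->free_cache;
        parent->free_cache = b->next;
        return b;
    }
    pthread_mutex_lock(&global_lock);
    global_lock_count++;
    b = global_free;
    if (b)
        global_free = b->next;
    pthread_mutex_unlock(&global_lock);
    if (!b) {
        b = malloc(sizeof *b);
        assert(b != NULL);
    }
    return b;
}

static toy_pool_t *toy_pool_create(toy_pool_t *parent, int cache_enabled)
{
    toy_pool_t *p = malloc(sizeof *p);
    assert(p != NULL);
    p->parent = parent;
    p->blocks = block_get(parent);
    p->blocks->next = NULL;
    p->free_cache = NULL;
    p->cache_enabled = cache_enabled;
    return p;
}

/* On destroy, blocks go to the parent's cache if it has one enabled,
 * otherwise back to the global free list under the mutex. */
static void toy_pool_destroy(toy_pool_t *p)
{
    toy_pool_t *parent = p->parent;
    if (parent && parent->cache_enabled) {
        push_all(&parent->free_cache, p->blocks);
        push_all(&parent->free_cache, p->free_cache);
    } else {
        pthread_mutex_lock(&global_lock);
        global_lock_count++;
        push_all(&global_free, p->blocks);
        push_all(&global_free, p->free_cache);
        pthread_mutex_unlock(&global_lock);
    }
    free(p);
}

/* Create and destroy n subpools under one parent; return how many
 * times the global mutex was taken while doing so. */
unsigned long churn_subpools(int cache_enabled, int n)
{
    toy_pool_t *parent = toy_pool_create(NULL, cache_enabled);
    unsigned long before = global_lock_count;
    unsigned long used;
    int i;
    for (i = 0; i < n; i++) {
        toy_pool_t *sub = toy_pool_create(parent, 0);
        toy_pool_destroy(sub);
    }
    used = global_lock_count - before;
    toy_pool_destroy(parent);
    return used;
}
```

In this toy model, churn_subpools(0, n) takes the global mutex twice
per subpool (once on create, once on destroy), while
churn_subpools(1, n) takes it only once, for the very first block;
after that, blocks just cycle through the parent's private list.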
This patch removes essentially all the pool-related mutex operations
in prefork, and a lot of the mutex ops in worker.
--Brian