From: Brian Pane
Date: Tue, 14 May 2002 11:44:58 -0700
To: cmpilato@collab.net
CC: dev@apr.apache.org
Subject: Thoughts on fixing the APR pool problems Re: Help.
Message-ID: <3CE15B2A.6090003@cnet.com>

cmpilato@collab.net wrote:

>Just bringing this little dialogue into the public eye.
>
>"Sander Striker" writes:
>
>>>From: cmpilato@collab.net [mailto:cmpilato@collab.net]
>>>Sent: 13 May 2002 02:08
>>>
>>>I'm trying to piece something together here regarding Issue #622. I
>>>did a checkout of a copy of the subversion repository's /trunk (at
>>>revision 1600-and-something) over ra_local, with pool debugging turned
>>>on, while watching the process in `top'. The `top' output showed the
>>>svn process crawling steadily upwards in terms of memory usage,
>>>finishing up at around 30M by the time my checkout completed.
>>>However, the pool debugging output showed that we maxed out our pool
>>>usage at 2.29M. The pool debugging output *looks* accurate to me,
>>>since the whole checkout process is a bunch of recursion and looping,
>>>all of which is very "subpool-informed", and I've gone over this
>>>process pretty thoroughly.
>>>
>>>What makes the actual footprint of the program so different in terms
>>>of memory used? Are we leaking non-pool memory somewhere? Is the
>>>pool code simply not re-using allocations?
>>>
>>The latter is indeed the case. The production pools code does very
>>little to reuse memory; it is a space-time tradeoff. There have been
>>several patches to improve memory reuse, but since there hasn't been
>>a single project using pools that could benefit from them, they've
>>been lost in the archives. Maybe now is a good time to reevaluate
>>patches that ensure better reuse.
>>
>>The reason Apache can get away with this is that Apache has either
>>short-lived pools or relatively small allocations. And of course, when
>>pools were invented, they were tuned for Apache...
>>

We really need to provide a more general-purpose memory management
solution for situations where pool semantics aren't a good match for an
application. I can think of two solutions:

* Extend the apr_pool implementation to support freeing of blocks,
  i.e., add an upper bound on the size of the allocator free lists, and
  add an apr_pfree() function to free apr_palloc()ed blocks within a
  long-lived pool. (What I'm thinking of here is something like the
  "reaps" design proposed by Emery Berger et al.,
  ftp://ftp.cs.utexas.edu/pub/emery/papers/reconsidering-custom.pdf)

* Or turn apr_pool_t into an abstract class, with concrete
  implementations of different allocator models (with the traditional
  Apache-style pool as the first of these).
In order to do this without impacting performance, we'd probably have
to do a macro-based implementation:

- Each pool object has a struct at the top that contains pointers to
  the pool's alloc function, free function (possibly null), cleanup
  function, etc.

- apr_palloc(p, size) becomes a macro:

      #define apr_palloc(p, size) (*(p->alloc_fn))(p, size)

  and similarly for apr_pfree(). (A sketch of this idea follows below.)

--Brian
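
For illustration, here is a minimal, self-contained sketch of the
macro-based dispatch idea described above. The names used here
(apr_pool_hdr, alloc_fn, free_fn, cleanup_fn, malloc_pool_create) are
hypothetical placeholders rather than the real APR pool internals, and
the malloc-backed pool merely stands in for whatever concrete allocator
models (such as the traditional Apache-style pool) would actually be
plugged in:

    /* Sketch only: hypothetical names, not the real APR pool API. */
    #include <stdlib.h>
    #include <stdio.h>
    #include <stddef.h>

    /* Every pool starts with a header of function pointers, so the
     * macros below can dispatch without knowing the concrete allocator. */
    typedef struct apr_pool_hdr {
        void *(*alloc_fn)(struct apr_pool_hdr *p, size_t size);
        void  (*free_fn)(struct apr_pool_hdr *p, void *mem); /* may be NULL */
        void  (*cleanup_fn)(struct apr_pool_hdr *p);
    } apr_pool_hdr;

    /* Macro dispatch: the only overhead is the indirect call itself. */
    #define apr_palloc(p, size)  ((*((p)->alloc_fn))((p), (size)))
    #define apr_pfree(p, mem)    ((*((p)->free_fn))((p), (mem)))
    #define apr_pool_destroy(p)  ((*((p)->cleanup_fn))(p))

    /* One concrete "allocator model": a trivial pool wrapping malloc/free. */
    typedef struct malloc_pool {
        apr_pool_hdr hdr;   /* must be first, so &hdr aliases the pool */
    } malloc_pool;

    static void *malloc_pool_alloc(apr_pool_hdr *p, size_t size)
    {
        (void)p;
        return malloc(size);
    }

    static void malloc_pool_free(apr_pool_hdr *p, void *mem)
    {
        (void)p;
        free(mem);
    }

    static void malloc_pool_cleanup(apr_pool_hdr *p)
    {
        free(p);            /* nothing else to tear down in this toy pool */
    }

    static apr_pool_hdr *malloc_pool_create(void)
    {
        malloc_pool *mp = malloc(sizeof(*mp));
        if (mp == NULL)
            return NULL;
        mp->hdr.alloc_fn   = malloc_pool_alloc;
        mp->hdr.free_fn    = malloc_pool_free;
        mp->hdr.cleanup_fn = malloc_pool_cleanup;
        return &mp->hdr;
    }

    int main(void)
    {
        apr_pool_hdr *pool = malloc_pool_create();
        if (pool == NULL)
            return 1;

        char *buf = apr_palloc(pool, 64);   /* dispatches through alloc_fn */
        snprintf(buf, 64, "hello from a pluggable pool");
        puts(buf);

        apr_pfree(pool, buf);               /* only legal if free_fn != NULL */
        apr_pool_destroy(pool);
        return 0;
    }

Putting the header struct at the top of each pool object is what lets a
concrete pool be passed wherever the abstract pool type is expected,
while callers that never use apr_pfree() pay nothing for allocators
whose free_fn is NULL.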