apr-dev mailing list archives

From Greg Stein <gst...@lyra.org>
Subject Re: Memory manager
Date Tue, 25 Jun 2002 10:09:59 GMT
On Tue, Jun 25, 2002 at 07:20:34AM +0300, Andi Gutmans wrote:
> At 03:58 PM 6/24/2002 -0700, Greg Stein wrote:
> >Um. We use pools in Subversion and free the memory all the time. The key is
> >the use of subpools. I added some notes about our experiences at the end of
> >this document:
> >
> >     http://cvs.apache.org/viewcvs/apr-serf/docs/roadmap.txt?rev=1.3
> >
> >Note that pools can also be configured to not have a per-thread lock.
> Ouch, you really worked hard there.

Hard? Not really. The pattern is not difficult to implement, and only
certain types of loops require strict subpool usage (loops which have an
input which is pretty well unbounded (based on some user input or file or

On the other side of the coin, however, none of our code is ever
concerned with freeing stuff. We don't have to litter efree() throughout
our code, yet we also know that somebody will get rid of everything that we
happen to allocate [when it is appropriate].

Basically, there is a huge burden lifted by not needing to track every
allocation in the code itself.

When we *do* free (by destroying a pool), we're also getting rid of a bunch
of other, associated stuff. We never need to zero in on a particular item
and say, "get rid of *that*." All allocations come in associated groups, so
we take advantage of that and place them all into a (sub)pool.

> That is exactly what we can't do in 
> PHP. Our code base is so big that the easiest solution for us has always 
> been to just give our users the memory allocation API they are used to (in 
> our case emalloc(), efree(), erealloc() and so on) and just make sure that 
> all of this memory gets freed at the end of each request (we also have some 
> leak detection code but that is coded on top of the actual memory manager).

Understood. Of course, the problem is that if somebody gets into the habit
of, "well, it will just be tossed at the end of the request" and *stops*
using the efree() function, then you could end up with a *huge* working set.
We saw plenty of that in Subversion :-)

Tossing (groups of) memory during unbounded iteration is always necessary,
whether using pools or an alloc/free strategy. Failing that, each item that
might ever be allocated within the loop must be individually tracked by the
code which does the alloc, and that code must then ensure it gets freed.
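
A minimal sketch of that loop pattern, for illustration only. It uses a toy
pool written for this example (toy_pool_create, toy_palloc, toy_pool_clear,
toy_pool_destroy and total_length are all hypothetical names, not APR's
API or implementation). The point it tries to show is that scratch memory
allocated inside the loop is tossed as a group at the end of every
iteration, so the working set stays flat no matter how long the input is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy pool, hypothetical and NOT APR's implementation: it only remembers
 * every allocation so that all of them can be released in one call. */
typedef union chunk {
    union chunk *next;   /* link to the previously allocated chunk */
    max_align_t align;   /* keeps the payload suitably aligned */
} chunk_t;

typedef struct toy_pool {
    chunk_t *chunks;     /* all allocations currently owned by the pool */
    size_t live;         /* how many allocations are alive */
} toy_pool_t;

static toy_pool_t *toy_pool_create(void)
{
    return calloc(1, sizeof(toy_pool_t));
}

static void *toy_palloc(toy_pool_t *pool, size_t size)
{
    chunk_t *c = malloc(sizeof *c + size);
    if (c == NULL)
        return NULL;
    c->next = pool->chunks;
    pool->chunks = c;
    pool->live++;
    return c + 1;        /* payload starts right after the header */
}

/* Release everything in the pool, but keep the pool itself for reuse. */
static void toy_pool_clear(toy_pool_t *pool)
{
    while (pool->chunks != NULL) {
        chunk_t *c = pool->chunks;
        pool->chunks = c->next;
        free(c);
    }
    pool->live = 0;
}

static void toy_pool_destroy(toy_pool_t *pool)
{
    toy_pool_clear(pool);
    free(pool);
}

/* Sum the lengths of `count` strings, copying each one into scratch
 * memory first. *peak_live receives the largest number of scratch
 * allocations that were alive at any one time. */
static size_t total_length(const char *const *items, size_t count,
                           size_t *peak_live)
{
    toy_pool_t *iterpool = toy_pool_create();
    size_t total = 0;

    assert(iterpool != NULL);
    *peak_live = 0;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(items[i]);
        char *copy = toy_palloc(iterpool, len + 1);

        assert(copy != NULL);
        memcpy(copy, items[i], len + 1);
        total += strlen(copy);
        if (iterpool->live > *peak_live)
            *peak_live = iterpool->live;

        /* Toss this iteration's scratch memory as a group. */
        toy_pool_clear(iterpool);
    }
    toy_pool_destroy(iterpool);
    return total;
}
```

Nothing inside the loop body tracks or frees `copy` individually; the
single clear at the bottom of the loop covers whatever the body allocated.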

> Also as PHP is a scripting language it can run for quite a bit and do lots 
> of allocation's and free's. You can't really do any planning like you guys 
> did in Subversion on exactly when stuff can be freed and when not. Grouping 
> memory allocations is virtually impossible. Anyway, it does seem that you 
> guys had to work a bit too hard.

I don't think so. While it takes some discipline, I'm not sure that I equate
that with a lot of difficulty. And your comment about "when stuff can be
freed and when not" simply tells me that your code is a bit too, um,
"unstructured" :-)

Subversion has very nice lines about when stuff is valid, and when it goes
away. Every object has a defined lifetime, and that is defined by the pool
it was placed into. We don't have destructors -- the object's death is
determined by the pool that the caller placed the object into.

Even an interpreter like PHP can be structured to have a well-defined
hierarchy of lifetimes. At the top is PHP itself. Then you have children for
each interpreter engine, maybe each thread, each time you run the compiler,
each script, etc. I'm sure there is hierarchy within that, but I'm not
familiar enough with the internals.

The "mess" only arrives once you start running the code :-)  But note that
the pools have already tossed all the memory associated with parsing and
compiling your script. Now you just have to worry about what gets allocated
as part of the interpretation process, and where that data might end up
getting stashed. Wherever the data goes... that determines the appropriate
lifetime. If somebody loads a new module into the interpreter, well that
probably sticks around, so it lives in the interp pool. Objects that are
instantiated are probably per-thread, while some data might be passed across
threads, so it lives in a data subpool of the interpreter.

etc.  The point is that object lifetimes *can* be well-designed, and the
pools simply mirror that structure. And also note a subtle benefit:
*because* of the pools, you think harder about lifetimes, and you organize
your code appropriately.
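
A rough sketch of such a lifetime hierarchy, again with a toy pool written
for this example rather than APR itself. The names toy_pool_create,
toy_palloc, toy_pool_destroy, live_chunks and lifetimes_demo are
hypothetical, and the interp/script/compile split is only an assumed
reading of the hierarchy described above. It shows that destroying a pool
also destroys its children, so an object's lifetime is decided purely by
which pool it was placed into.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy hierarchical pool, hypothetical and NOT APR's implementation. */
typedef union chunk {
    union chunk *next;   /* link to the previously allocated chunk */
    max_align_t align;   /* keeps the payload suitably aligned */
} chunk_t;

typedef struct toy_pool {
    struct toy_pool *parent;    /* NULL for the root pool */
    struct toy_pool *child;     /* first child pool */
    struct toy_pool *sibling;   /* next pool with the same parent */
    chunk_t *chunks;            /* allocations owned by this pool */
} toy_pool_t;

static size_t live_chunks = 0;  /* allocations alive across all pools */

static toy_pool_t *toy_pool_create(toy_pool_t *parent)
{
    toy_pool_t *p = calloc(1, sizeof *p);

    assert(p != NULL);
    p->parent = parent;
    if (parent != NULL) {
        p->sibling = parent->child;
        parent->child = p;
    }
    return p;
}

static void *toy_palloc(toy_pool_t *pool, size_t size)
{
    chunk_t *c = malloc(sizeof *c + size);

    assert(c != NULL);
    c->next = pool->chunks;
    pool->chunks = c;
    live_chunks++;
    return c + 1;
}

static void toy_pool_destroy(toy_pool_t *pool)
{
    /* Children die first; each child unlinks itself from pool->child. */
    while (pool->child != NULL)
        toy_pool_destroy(pool->child);

    while (pool->chunks != NULL) {
        chunk_t *c = pool->chunks;
        pool->chunks = c->next;
        free(c);
        live_chunks--;
    }

    if (pool->parent != NULL) {
        toy_pool_t **link = &pool->parent->child;
        while (*link != pool)
            link = &(*link)->sibling;
        *link = pool->sibling;
    }
    free(pool);
}

/* Build interp -> script -> compile, then tear it down in two steps.
 * The out-parameters report how many allocations are still alive. */
static void lifetimes_demo(size_t *after_compile, size_t *after_interp)
{
    toy_pool_t *interp = toy_pool_create(NULL);
    toy_pool_t *script = toy_pool_create(interp);
    toy_pool_t *compile = toy_pool_create(script);

    toy_palloc(interp, 64);    /* e.g. a loaded module: lives longest */
    toy_palloc(script, 32);    /* per-script data */
    toy_palloc(compile, 128);  /* parse and compile scratch */
    toy_palloc(compile, 128);

    toy_pool_destroy(compile); /* compilation is finished */
    *after_compile = live_chunks;

    toy_pool_destroy(interp);  /* takes the script pool with it */
    *after_interp = live_chunks;
}
```

Note that the script pool is never destroyed explicitly; it goes away
because its parent does.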

> > >...
> > > Do you guys have any interest in adding this kind of "smarter" memory pool
> > > into APR? I think it's extremely useful.
> >
> >Sure. Although I'm a bit unclear on how it differs from using, say,
> >apr_pool_destroy on a subpool to toss intermediate memory.
> If I understood correctly the difference is that you don't need to group 
> the memory but can allocate and toss memory when ever you need to. This 
> kind of "knowing in advance" can't be done in PHP.

I think it is really about granularity. Pools are about grouping together
related allocations. If PHP, or its subcomponents, are trying to alloc and
free individual pieces, then yes: you need to follow that pattern and
provide alloc/free mechanisms. During Apache and Subversion development, I
just haven't found anything that ever requires that kind of granularity,

That said: *legacy* code can certainly impact the kinds of facilities that
you need to provide [to subcomponents].

Sander mentioned something about "reaps". I remember that coming up a while
back, but am not super clear on it. IIRC, it was a synthesis of pools and
being able to individually free items. Probably something right along the
lines of what you're looking for.

> P.S. - I'm enthusiastically waiting for subversion. CVS just doesn't cut it 
> anymore.

hehe :-) We're fast approaching Alpha (two weeks). Our opinion is that it
will be stable enough to use, and have all the necessary features for 90% of
your work. We'll wrap up those little bits and kick out some edge cases and
bugs between Alpha and Beta. Point is: you don't really have to wait :-)


Greg Stein, http://www.lyra.org/
