apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sander Striker" <stri...@apache.org>
Subject RE: Observations on fragmentation in SMS pools
Date Sun, 08 Jul 2001 12:36:28 GMT
>>> 1. The good news is that allocations of small amounts
>>>    of memory are very efficient.  They almost always
>>>    take the fastest path through the code, in which
>>>    some available space is reserved from the
>>>    "sms->used_sentinel.prev" block with a handful of
>>>    pointer arithmetic operations.
>
> Cool.

This is what the pools code did.  I only used a different way
to store the required accounting information.

>>> 2. The bad news is that allocations for larger blocks
>>>    (in the >=8KB range) typically require a call to the
>>>    parent SMS to get data.  On my test machine, I'm seeing
>>>    elapsed times in the 30 microsecond range when this
>>>    happens, compared to less than 1 microsecond for small
>>>    allocations that don't require more memory from the
>>>    parent SMS.  And when an allocation falls through to
>>>    the parent, it often seems to fall all the way through
>>>    to the root SMS (I suspect that 30us includes a malloc).
>>>    The problem seems to be particularly bad for things that
>>>    create subrequests, like mod_include.
>
> OK, not so cool.

No, a first optimization might be doing some simple defines in
the sms_private.h file which eliminates the overhead of the
framework code when apr_sms_malloc is called from inside an
sms module. The defines would look something like this:

#define apr_sms_malloc(sms, size)  sms->malloc_fn(sms, size)
#define apr_sms_calloc(sms, size)  sms->calloc_fn(sms, size)

I'm afraid we can't do the free_fn because it might be NULL.

This would get rid of a bit of overhead, but it isn't going to
make an enormous amount of difference.

>>> 3. The worse news is that there seems to be lot of
>>>    fragmentation.  For example, I saw this pattern
>>>    during a server-parsed request:
> <snip.>

>>> For an SMS that's going to be a parent of other SMSs, we'll
>>> need something with more sophisticated policies for reassigning
>>> freed space than the current trivial-SMS.
>
> Definately.
>
>> Yup.  I've brought this up to Sander and David before, but this is how
>
> Really - first i've heard of it :(

Yes, Justin brought something like this up with me when he reviewed the
"trivial" code.  We didn't discuss this usage pattern though.
Ofcourse it is very logical that it happens...

I'll think out loud now:
The solution might be adding specific allocation functions for SMS
implementations.  These functions could look something like this:

APR_DECLARE(void *) apr_sms_child_malloc(apr_sms_t *sms,
                                         apr_sms_t *child,
                                         apr_size_t size);

APR_DECLARE(apr_status_t) apr_sms_child_free(apr_sms_t *sms,
                                             apr_sms_t *child,
                                             void *mem);

Internally the framework will use the child_malloc_fn if present.  If
not, it will fall back to malloc_fn.

This is something that overthrows the KIS policy though...

>> pools work EXCEPT for the fact that we get a little more pathological in
>> SMS's addition of memory because each parent tacks on memory (as you
>> described).  The optimization here is for the <4KB allocation.  I'm not
>> sure that is what we need to be optimizing for.  More analysis is
>> probably needed to see what our typical usage pattern is.  This is where
>> Ian's pool replay code is *very* helpful (although I don't like adding
>> more #ifdef to the pool code - yuck - I'd like to see a SMS that just
>> prints that info and passes it up to the parent to do what it needs
>> to).

I want to state here that POOLS_ARE_SMS effectively makes
each pool a "trivial" sms. To take full advantage of the SMS
code the stack of SMSs should be optimized. Right now we
have this (in httpd with POOLS_ARE_SMS):

std <> trivial <> trivial [<> trivial ...]
^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^
     APR              httpd

Justin already hinted in a private mail that he was thinking
about something like this:

  ... <> trivial <> tracking [<> tracking ...]

In the buckets code "blocks" could be a very good choice.

So, for a first pass at allowing httpd to run with pools
as sms is not optimal, but it gets things working.

> Yep, i had a evry simple debug sms that simply printed out what it was
> doing, then did it.  Maybe it's time to revive it, but it
> generates a LOT of output.

Another option is just using the debug code you put in and tag the
relevant SMSs. This way you can see the usage patterns in each part
of the application.

>> Before turning SMS on in the mainline code by default, I think we need
>> to come up with a more robust allocation algorithm.  And, the whole
>> point of SMS is that we can play with memory allocation algorithms
>> until we are blue in the face and nothing else changes.  You can't do
>> that with the current pools code.  We need more people playing with it
>> and suggesting new allocation algorithms.  =)
>
> Absolutely.  This was my motivation in doing the work in the first place,
> and it seems to be working.

Yes :)

I've been working on a power of 2 allocator.  I still haven't got that up
to speed yet, but it is worth a look.  I'll play with it this week and post
it to the list.

>> The trivial SMS tries to keep the memory allocation strategy as close as
>> possible to what we currently have.
>
> Oh, we're nowhere near ready to turn it on by default.  I never thought we
> were, but you need a test case to prove/disprove what's going on, and that
> what this whole pools as sms code gives us, a complex example that we can
> use to improve our code :)
>
>> Personally, I'd set MIN_FREE to 0 and bite the small allocations (now
>> they'd be slow, but really, how many allocations are under 4KB - data
>> structures, perhaps?).  I'd also like to see it smarter about reclaiming
>> free space, but that gets really tricky.
>
> Oh, if you look at it we do a LOT of allocations below 4K from pools.
Turn
> on the checking of every allocation and look at the sizes being requested.

:)

About the free space reclaiming. It is very hard to implement without
significant performance loss.  There is something else that is worth a try
and effectively does away with the entire traversal of the free list in
search of a big enough block.

We would need to implement both the used list and the free list as a
fibbonacci (hope I spelled this right) heap.  This way our biggest block
is always on top.  So, when we check the free list we only need to look
at our top block.  The reset function is doing a merge of the free and
used list.

>> It's a tradeoff (and is purposeful for lots of small allocations), but
>> until someone can write a better memory allocation algorithm, this is
>> what we got.  Trivial SMS has its Achilles heel here - this is just an
>> aggrevation of how the original pool code works (the pool code would add
>> 4KB, while each level of the SMS adds 4KB).
>
> Well, we have what we have.  My main concern about trivial is
> that the code gets complex, and im worried that we may be over
complicating
> things for ourselves.  KISS is a guiding principal that it never does any
> harm to follow
> :)

Although I agree with you that KISS is a good principle to follow, it is
nearly impossible to do so in "trivial".  Well, actually it is rather
simple,
but not very obvious.  This is mostly due to the fact that I wanted to keep
all codepaths as short as possible (there might even still be room to make
them shorter).  Memory management code tends to get complex.  This is mainly
because it is a performance bottleneck in a lot of cases.  So, it is a
trade off: complexity <-> speed.  Ofcourse it can never hurt to look for
the lowest complexity, highest speed pair ;)

>> Aaron pointed out that CLR (Cormen/Leiserson/Rivest for those that never
>> took algorithms classes) has many memory allocation algorithms.  I'd
>> bet we could find one that would work better.  -- justin
>
> The sms code was added on May 9th, which looks like less than 2
> months ago, so to be honest we're in pretty damn good shape.  Let's take
> some time, do the testing and define the problems then start looking at
> fixes.

I'm very pleased with the speed this code has been moving at and the support
that it has had.  Thanks everyone!

> Rushing in with a lot of quick solutions won't get us anywhere but a world
> of hurt.

Agreed.  The final step in the whole process is when sms is turned on by
default.  Then we can optimize the sms stacks and really take advantage of
the code.

> david

Sander


Mime
View raw message