apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Bloom <...@covalent.net>
Subject Re: freelists: a slightly different approach
Date Wed, 26 Sep 2001 20:43:42 GMT
On Wednesday 26 September 2001 12:21 pm, Justin Erenkrantz wrote:
> On Wed, Sep 26, 2001 at 02:42:14PM -0400, Cliff Woolley wrote:
> > bit alone might disqualify it.)  Oh well, it wouldn't kill us to have a
> > static array in each MPM that's the appropriate size and to just use
> > that, indexed by the second half of AP_CHILD_THREAD_FROM_ID.  As long as
> > they get stored somewhere in Apache and not in apr-util I'm happy.
> I'd rather see it stored in apr-util not httpd.  =-/
> You're trying to store internal information about the buckets in
> httpd and I'm not a fan of that (opaque pointer or not).  This code
> to store the freelists would have to be duplicated by anybody that
> uses buckets - I'd like to see flood use buckets and brigades.  As
> I said, this just seems like an incorrect way to go about it.
> It also breaks all sorts of abstractions.  IMHO, sending in a unique
> identifier (such as c->id in the case of httpd) sounds like a fair
> compromise though - that's iffy though.  I'd rather see Greg's
> suggestions of a linked-list of arrays indexed by the thread id.  I
> just think that all of this should all be hidden by the bucket code.

You can keep saying to use the thread ID, but it won't work.  Not all
platforms use an int to reference a thread_id.  We have a opaque
data type for threads for a reason.  At least two of our platforms today
do not refer to thread_id's as ints, BeOS and Windows.  Now, we may
be able to get an int out of the platform, but we don't do it today.  We
should not force platforms to provide an int.  This is what we did with
processes, and it was wrong there, it will be wrong with threads too.

If you want to use an int, it must be provided by the application using
the buckets.  In httpd's case, use the c->id.  Create a function that
allocates the correct size of an array, and just pass in c->id.  DONE!

> apr_bucket_get_my_freelist()
> {
>     apr_os_thread_t tid = apr_os_thread_current();

This is a pointer or structure on some platforms.  That simple thing
just broken the rest of this code.  Stop trying to use apr_os_thread_t
in portable code.  The os in the type means it isn't portable.

>     for all elements in the internal linked-list
>         search for free-list associated with tid by
>             apr_os_thread_equal(tid, l->tid)
>     if no match found in linked-list
>         create a new freelist for our tid
>         attach to the linked list
>     return match
> #else
>     only-one-free-list, so return it.
> #endif
> }
> Note that since we're basing it off our thread id, it is impossible
> to have any sort of collisions (threads aren't reentrant).  Therefore,
> you don't need to lock the walking of the lists.  (Actually with
> a linked-list implementation, you *may* actually have a small race
> condition as you attach the new freelist to the list and you get
> interrupted mid-statement and someone is at the very end of the
> list - ack.)

Nobody has talked about using a mutex around this code ever.

> I think a hashtable would be faster as you wouldn't have to do
> a linear search and removes the race condition above, but
> whatever - people seem to think hashes are slow.  FWIW, I'm
> picturing threads in the 100s or 1000s per process as the eventual
> typical case for httpd (flood already scales that high), so the
> balance shifts from a naive linked-list to a hashtable, IMHO.

There is no reason for either a linked-list or hashtable.  This should be a simple
array.  We are talking about 1 pointer per slot in the array, allocated once.
Assume a 64-bit platform, for a worst case scenerio, and 2000 threads per 
process, and you have 1600 bytes for the static allocation.

This works because of the way that MPMs assign the connection id.  They
all take child_num * MAX_THREADS_PER_CHILD + thread_num.  If the bucket
code always does a modulous to ensure that the number passed in is within the
array, we are perfectly safe, portable, and fast.

What is the problem with this solution?  The only argument I have heard against
it, is that it requires an integer to be passed to the bucket.  So what.  That int
is unique identifier for the process.  If we define the API to require a unique ID, 
and don't refer to the thread_num, or to any information from httpd, we keep all
of the abstractions, and it all just works.


Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net

View raw message