Return-Path: Delivered-To: apmail-apr-dev-archive@apr.apache.org Received: (qmail 3225 invoked by uid 500); 24 Sep 2001 00:10:25 -0000 Mailing-List: contact dev-help@apr.apache.org; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Delivered-To: mailing list dev@apr.apache.org Received: (qmail 3214 invoked from network); 24 Sep 2001 00:10:25 -0000 Date: Sun, 23 Sep 2001 20:09:36 -0400 (EDT) From: Cliff Woolley X-X-Sender: To: Ryan Bloom cc: Subject: Re: bucket free lists (was Re: [PATCH] buckets sms phase 2) In-Reply-To: <20010923215739.B998246DFC@koj.rkbloom.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N On Sun, 23 Sep 2001, Ryan Bloom wrote: > Mostly good. I did notice a few ommissions that will be necessary. > > apr_status_t apr_bucket_list_init(int num_threads) > > This function creates the static array. We need to tell them how many > threads will be creating pools. We'll also need the apr_bucket_alloc(int list) and apr_bucket_free functions, whatever they're called. I left all these out for now because I just wanted to be sure that the other part was what you had had in mind. > I assume we'll be removing the apr_bucket_*create_in functions once > the code is finished. We can. But if we're going to do that then let's forget all this #define business and just go ahead and change the API. That saves us changing the names to _create_in and then changing them back. > I am trying to decide if we should pass in a pool or an integer. I > think we need to at least associate a pool with each array slot. > That way, the buckets will always be allocated out of a pool, so that > they are guaranteed to be cleaned up properly. The reason to not pass > a pool each time, is that we want to thread pool to be the pool > associated with array slot, not the request pool. > > So, I think we will also need: > > apr_status_t apr_bucket_list_pool(int num, apr_pool_t *p) > > Feel free to change that name. I've been debating this whole allocate-out-of-the-thread-pool thing. If some request manages to generate 8192 1-byte buckets before freeing any of them, then we have forced the amount of allocated memory for that thread to be roughly 8192*80 ~= 640KB and to stay that way for as long as the thread lives. I don't think that's what we want. Perhaps if we created a subpool for each slot and allocated the buckets out of those? We could then clean up the pool at certain intervals [once a connection, once every ten connections, it doesn't matter] and start over again... of course that would require some other API function to do it (apr_bucket_list_cleanup(int num) or something like that). What do you think? Alternatively, of course (and I'm kind of leaning this way), we could just keep using malloc and basically just use the logic from the (former) blocks SMS, which after all was written expressly with the buckets in mind anyway. It behaves a lot like pools except it gives you the free-list capabilities and already has the cleanup feature built in. I know you really want everything to use the same allocator, of course... > I am not sure about using the value from apr_thread_current to do this. > That function just returns a tid from the OS, so it unlikely to be really > useful. The core should just use the thread id, which can be retrieved > cleanly from the conn_rec->id field, using AP_CHILD_THREAD_FROM_ID. Okay, you've convinced me that we don't need this _list_find() thing. Therefore we also don't need the _create_in() stuff. We can just expect the caller to pass us a number unique to that thread. We can always give them the hint that apr_os_thread_current() is one way to create a number unique to each thread in the documentation, of course, though we should also mention that they probably just want to find that unique number once and cache its value somewhere. Single-threaded apps can of course just use 0 or some other constant; it's better to let the caller decide that than for us to do it and base the decision on whether APR_HAS_THREADS or not. For instance, in prefork we could always use a constant number, but apr-util has no way to know that prefork won't ever have buckets in more than one thread since APR is still threaded. > That will be faster, because we don't need to call into the thread libs, and > it is more likely to work cleanly. Yep. My intention (even when we were talking about using sms's) was always that this 'key' in whatever form it takes would get stashed in the conn_rec. It's convenient that the id is already in there... though you can't quite use AP_CHILD_THREAD_FROM_ID, because that macro returns two numbers separated by a comma. Just the second half of it is what we want (that's c->id%HARD_THREAD_LIMIT for mpmt MPM's, c->id for spmt MPM's, and 0 for prefork), but we'd have to make a new macro to get just the second half. My only question is this: if we use a static array, then we expect the caller to pass us an index basically; a number between 0 and num_threads. Fine. But what if num_threads isn't constant? What if in some application it's hard to determine the index from what information you do have (thread ID) for some reason? Something in the back of my mind tells me we might have to use a hash table instead of a static array so that we can account for cases like this, even though a hash table would be less efficient. The extra benefit would be that the key the caller gives us can really be any value at all, as long as that value is unique to that thread. They don't have to worry about shoehorning the value into the range 0..num_threads. I fear that if we use a static array we're tying this too closely to httpd, and more than that to current MPM's for httpd. Thoughts? --Cliff PS: Why is conn_rec->id a long and not an APR type? -------------------------------------------------------------- Cliff Woolley cliffwoolley@yahoo.com Charlottesville, VA