Return-Path: Delivered-To: apmail-httpd-dev-archive@httpd.apache.org Received: (qmail 37822 invoked by uid 500); 9 Nov 2001 08:06:50 -0000 Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm Precedence: bulk Reply-To: dev@httpd.apache.org list-help: list-unsubscribe: list-post: Delivered-To: mailing list dev@httpd.apache.org Received: (qmail 37811 invoked from network); 9 Nov 2001 08:06:50 -0000 From: "Mladen Turk" To: Cc: "Brian Pane" Subject: RE: [PATCH] apr_table WAS: What to do about tables? Date: Fri, 9 Nov 2001 09:05:57 -0800 Message-ID: <000e01c16940$d09873c0$5f00000a@armada> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.2627 In-Reply-To: <3BEB6BC0.5060005@pacbell.net> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Importance: Normal X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N > -----Original Message----- > From: Brian Pane [mailto:bpane@pacbell.net] > Sent: 8. studeni 2001 21:38 > To: dev@httpd.apache.org > Subject: Re: [PATCH] apr_table WAS: What to do about tables? > > Mladen Turk wrote: > > [...] > > >I've treated the apr_table as database table, having columns and rows. > >When you think of apr_table that way, you can use the same technique > >that the databases use to quickly access particular row, using some sort > >of indexing. > > > Also, why add another hash table implementation? Apr_hash_t won't > quite work here, because it doesn't have efficient support for > case-insensitive string keys, but I'd rather add that support > into apr_hash_t than add another version of hash tables to APR. > The current apr hash table is inappropriate for such a design and too much overhead. It stores the key/value data pairs by itself. If you look deeply in my code it's not a true hash table, but rather an array of indexes that are accessed using hashing scheme. > Based on my really quick reading of your new version of apr_tables.c, > the code looks reasonable. My main worry is that the overhead of > the hash table maintenance, including the function call overhead and > the need to reindex every time the table is expanded, may offset the > performance benefits of hashing. Do you have any benchmark data > to compare the performance of this implementation with the current > apr_table_t? old apr_table my implementation (microseconds) add 10000 keys 40.057 80.113 get 10000 keys 7.220.166 60.084 The keys are in the form KEY0...KEY9999 So I agree that you have almost double time to insert the keys, but getting values from table shows 120 times performance gain. There are number of ways to further optimize my code, for example, the simple table scan would beet any indexing technique for a small number of items. Other would be to optimize reindexing, or expressively do an indexing after substantial table update (for example after reading headers or mime types). MT.