httpd-dev mailing list archives

From "William A. Rowe, Jr." <>
Subject Re: 2.0.24 tagged.
Date Thu, 16 Aug 2001 20:51:27 GMT
From: "Bill Stoddard" <>
Sent: Thursday, August 16, 2001 3:33 PM

> > > My issue is that table merges are an O(n) problem, while hash merges are an
> > > O(n^2) problem :(
> >
> > Hash merge is O(mn) in the worst case (the case when your hash function completely
> > degenerates :-).  The merge will tend to O(m log n).

That varies with the number of directory merges as well, and with how large the base table is.

> Okay, I've thought about this some.  Your veto of my patch is unreasonable.

Hmmm?  Perhaps, but I committed effectively the same code an hour ago, because I'm not going
to defend my veto, nor contribute any further to this clueless discussion.  If my discussion
points on optimizing weren't clear, then yell.  Someone clearer than I can rewrite them.
But let's get to the underlying question, which is: "When is a dir_merge( base ) not const?"

I'll let the numbers decide if we continue to use tables or hashes without this optimization.

> We should commit the simplified hash table patch I posted earlier.  The hash implementation
> will clearly outperform tables during lookups.  Hash merges are O(m log n), and one of the
> hash tables is typically small, so my conclusion is that the merge operation will probably
> not contribute significant overhead (as compared to a table merge), especially in the
> cases where merges occur often (like when we open .htaccess files, where the overhead
> of file I/O dominates the transaction).


We are merging biglonglist with nicesmalladdition...

Because we merge, biglonglist must be recreated and refilled one element at a time, expanding
(potentially several times) as it goes.  It doesn't matter how simple the nicesmalladdition is;
this is a huge performance pig.  One very _simple_ optimization in apr_hash would be an
apr_hash_dup function that would really memcpy() the existing hash table (into the same sized
allocation).  It may or may not be that simple.  Someone can try a patch.
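A toy sketch of what such a dup could look like.  This is NOT the real apr_hash
internals, and the toy_* names are made up; the point is just that duplication is one
pass over the existing chains, with no re-hashing and no resizing:

```c
/* Toy chained hash table; illustrative only, nothing to do with APR. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

typedef struct toy_entry {
    const char *key;
    void *val;
    struct toy_entry *next;
} toy_entry;

typedef struct {
    toy_entry *buckets[NBUCKETS];
    int count;
} toy_hash_t;

static unsigned toy_bucket(const char *k) {
    unsigned h = 0;
    while (*k) h = h * 33 + (unsigned char)*k++;
    return h % NBUCKETS;
}

static void toy_set(toy_hash_t *h, const char *key, void *val) {
    unsigned b = toy_bucket(key);
    toy_entry *e;
    for (e = h->buckets[b]; e; e = e->next)
        if (strcmp(e->key, key) == 0) { e->val = val; return; }
    e = malloc(sizeof(*e));
    e->key = key; e->val = val;
    e->next = h->buckets[b];
    h->buckets[b] = e;
    h->count++;
}

static void *toy_get(const toy_hash_t *h, const char *key) {
    const toy_entry *e;
    for (e = h->buckets[toy_bucket(key)]; e; e = e->next)
        if (strcmp(e->key, key) == 0) return e->val;
    return NULL;
}

/* Dup by cloning each chain in place: entries land in the same buckets,
 * so the hash function is never consulted and the table never grows. */
static toy_hash_t *toy_hash_dup(const toy_hash_t *src) {
    toy_hash_t *dst = calloc(1, sizeof(*dst));
    int b;
    dst->count = src->count;
    for (b = 0; b < NBUCKETS; b++) {
        toy_entry **slot = &dst->buckets[b];
        const toy_entry *e;
        for (e = src->buckets[b]; e; e = e->next) {
            toy_entry *c = malloc(sizeof(*c));
            *c = *e;        /* shallow copy; key/val still shared */
            *slot = c;
            slot = &c->next;
        }
        *slot = NULL;
    }
    return dst;
}
```

Because the entry nodes are cloned, later sets on the original don't leak into the
dup, which is exactly the const-ness the dir_merge argument above is about.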

Reiterating; the lhs hash isn't just memcpy'ed.  It's built again from scratch.
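To make that cost concrete, a toy illustration (plain C, not APR code) of why a
rebuild-style merge tracks the big table's size, however small the overlay:

```c
/* Count the inserts a rebuild-style merge performs: every element of the
 * big lhs table is re-inserted, plus the small overlay on top. */
#include <assert.h>
#include <stddef.h>

static size_t g_inserts;                   /* one tick per re-insert */

static void fake_insert(size_t key) { (void)key; g_inserts++; }

static size_t rebuild_merge_cost(size_t base_n, size_t overlay_m) {
    size_t i, j;
    g_inserts = 0;
    for (i = 0; i < base_n; i++)           /* refill biglonglist ... */
        fake_insert(i);
    for (j = 0; j < overlay_m; j++)        /* ... then nicesmalladdition */
        fake_insert(j);
    return g_inserts;
}
```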

> The simplified hash patch is better than tables, IMHO, so it should go in.

And since only real numbers will tell us anything, I've committed the safety patches to
mod_mime.c to duplicate not only the hash table whenever we munge it, but also to copy the
exinfo members whenever we munge one of them.

A very _simple_ optimization would test that _this_dir_merge_ created the exinfo, so we don't
copy it three times if we perform three different merge/remove operations on the same record.
Again, anyone can feel free to offer a patch.

> I am not opposed to better implementations, but we should not introduce wicked complexity
> for a bit of extra performance, especially without benchmarking.

The hash is wickedly complex.  If I had the energy to recreate my patch, it adds a small
handful of lines.  But agreed that benchmarking is the question.  mod_mime merges are now
in, and using hashes.  Benchmarking will prove it up or down.  Either way, I don't have any
more energy for this; it's sapped a good week of my life, as it stands, on an issue
(table -> hash) that I had 0 interest in :(

