httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dean Gaudet <>
Subject Re: A couple of Apache architecture questions
Date Mon, 21 Sep 1998 18:47:18 GMT
On Mon, 21 Sep 1998 wrote:

> > Dmitry Khrustalev has done a perfect hash for things like headers,
> > including "interning" the constants we use (i.e. "Content-Type",
> > "Content-Length", ...) so that we only need to compare pointers... and I
> > believe it's a win, but he didn't post results.  It requires a few more
> > API changes though. 
> Headers would be the biggest win I think.  When all these sploits got released, they
> all mention that the box goes 100% CPU and stops responding shortly.

That had nothing to do with hashing -- it had to do with header merging,
a problem which exists with or without hashing.

Furthermore hashing is still O(n), it just has a smaller constant.  If a
r00t kiddie wants to, they can exploit your hash function and you'll be no
better than you were without the hash.

> code = Hash(name)
> if (table[code]) {
> 	if (table[code]->num_hdrs < MAX_HEADERS) {
> 		table[code]->num_hdrs++;
> 		temp = table[code]->tail;
> 		temp->next = header;
> 		header->head = temp;
> 		table[code]->tail = header;
> 	}
> } else {
> 	table[code]=header;
> }

This is less efficient for the common case than what we do already -- I
would wager it's a loss if you benchmark it with the common clients. 

> There is a downside to hashing, one that I don't fully understand how to fix, and that
> is resizing the hash table.  Could someone elighten me how this is done in convention?

Now you're talking about much more complex data structures that have
properties and timings closer to balanced trees than to hash tables and
linear lists.  Look for "extensible hashing" in a cs text book for one
such data structure.

It helps to remember that most tables in apache have less than 5 elements.
Even the header tables have less than 10 or so in all the common cases.
There's not much room to optimize...  you would probably have to do
an adaptive structure which used linear lookups for small tables and
something more complex for large tables.

Also, take a look at the use of ap_overlap_tables in 1.3.2 -- it deals
with a bunch of the "slow" aspects of linear lists when performing
bulk operations.

> I will be writing a number of support functions, I'll try to convince my boss to let
> me release them to the Apache group for inclusion into the core util library.

Sounds good, thanks.


View raw message