apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Brian Pane <bp...@pacbell.net>
Subject [PATCH for review and benchmarking] new implementation of apr_table
Date Sun, 14 Jul 2002 01:59:04 GMT
Traditionally, most apr_table operations have run in O(n)
time.  All my previous optimizations were focused on reducing
the constant factor (e.g., by replacing string comparisons with
integer checksum comparisons).  But table ops are still slow,
as seen in recent httpd benchmarks, and I've run out of ways
to reduce the constant factor.  Thus I'm now studying ways
to reduce the O(n) to something sublinear.

This patch adds a simple index to apr_table_t.  The index maps
each character to the locations in the table of the first and
last elements whose keys start with that character.  (To be
completely accurate, I'm ANDing the first character of the key
with a mask that removes capital/lowercase differences and
fits the resulting value in the 0-31 range to keep the index
manageably small.)

In the worst case, this doesn't help at all.  If you have
a table consisting of ten thousand instances of key "foo"
followed by one entry with key "foobar," apr_table_get()
will have to do 10,001 strcasecmp operations.

In the common case, though, the indexing appears to work well.

Using httpd-2.0 as a benchmark, the indices reduce the run
time of apr_table_get() by 39% (from an average of 43.6
cycles per call to 26.6 on a Sparc, including the cost of
calls to strcasecmp()).

The one clear drawback to the indexed implementation is
that we have to do an O(n)-time rebuild of the index if
apr_table_unset() or apr_table_set() removes any elements.
In the httpd, at least, this doesn't appear to matter: the
query speedup provided by the indexing outweighs the cost
of the occasional index rebuilds.

Does anyone have additional benchmarks that they want to run
before I commit?


View raw message