httpd-dev mailing list archives

From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject Re: remaining CPU bottlenecks in 2.0
Date Wed, 05 Sep 2001 04:56:48 GMT
From: "Justin Erenkrantz" <jerenkrantz@ebuilt.com>
Sent: Tuesday, September 04, 2001 11:46 PM


> Based on the patches you submitted (and my quasi-errant formatting
> patch), I had to read most of the code in mod_include, so I'm more 
> familiar with mod_include now.  I do think there are some obvious 
> ways to optimize find_start_sequence.  I wonder if we could apply 
> a KMP-string matching algorithm here.  I dunno.  I'll take a look 
> at it though.  Something bugs me about the restarts.  I bet that 
> we spend even more time in find_start_sequence when a HTML file 
> has lots of comments.  =-)

You were discussing the possibility of parsing for <!--# as a skip by 5.

Consider jumping to a 4-byte alignment, truncating to char, and skipping by
dwords.  That way you only have to test for three values, not four (any
<!--# has to put a '<', a '!' or a '-' on one of the sampled bytes), and
you can use the machine's most efficient access path.  But I'd ask if
strstr() isn't optimized on the platform, why are we reinventing it?
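
A rough sketch of the skip-by-4 idea, to make the "three values" claim concrete.  This is not the mod_include code: find_ssi_start, TAG and TAG_LEN are made-up names, and it samples every fourth byte by index instead of doing real aligned dword loads, so it only shows the search logic and not the speed.

```c
#include <stddef.h>
#include <string.h>

#define TAG     "<!--#"   /* start sequence being searched for */
#define TAG_LEN 5

/* Hypothetical helper: return a pointer to the first "<!--#" in
 * buf[0..len), or NULL if there is none.  Only every 4th byte is
 * inspected until one of them looks like part of the tag. */
static const char *find_ssi_start(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i += 4) {
        size_t lo, hi, back;

        /* A tag starting at s always covers the sampled index i with
         * s <= i <= s + 3, so the sampled byte is '<', '!' or '-'.
         * '#' never needs testing: if it is sampled, so was the '<'
         * four bytes earlier. */
        switch (buf[i]) {
        case '<': lo = 0; hi = 0; break;   /* tag starts at i           */
        case '!': lo = 1; hi = 1; break;   /* tag starts at i - 1       */
        case '-': lo = 2; hi = 3; break;   /* tag starts at i-2 or i-3  */
        default:  continue;                /* not part of a tag, skip 4 */
        }

        /* Try the candidate starts, earliest first. */
        for (back = hi + 1; back-- > lo; ) {
            size_t start;

            if (back > i)
                continue;                  /* would begin before buf */
            start = i - back;
            if (len - start >= TAG_LEN &&
                memcmp(buf + start, TAG, TAG_LEN) == 0)
                return buf + start;
        }
    }
    return NULL;
}
```

For a buffer like "xx<!--#echo", the sample at index 4 is a '-', and the check at index 2 finds the tag.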

This is DSS for little-endian (that char bugger comes from the first byte,
so skipping in 0-3 is not an issue), but for big-endian architectures you
need to backspace to the dword alignment so you don't miss the tag by
skipping up to 6 (wrong by 3, and then reading the fourth byte of the dword).
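
To put numbers on the alignment point, here is a small sketch, assuming "dword" means a 32-bit uint32_t.  Both function names are invented for illustration.  It works on addresses as plain integers and never touches memory at them.

```c
#include <stdint.h>
#include <string.h>

/* Offset (0..3) inside a dword of the byte that survives (char)dword.
 * Expected to be 0 on little-endian and 3 on big-endian machines. */
static unsigned truncated_byte_offset(void)
{
    const unsigned char bytes[4] = { 0, 1, 2, 3 };
    uint32_t dword;

    memcpy(&dword, bytes, sizeof dword);   /* avoids aliasing problems */
    return (unsigned char)dword;           /* low-order byte = its offset */
}

/* Address of the first byte actually examined when a scan begins at
 * 'start' and byte_off is the value from truncated_byte_offset().
 * Little-endian (byte_off 0): round up to the next dword boundary.
 * Big-endian (byte_off 3): round DOWN ("backspace") instead.  Note the
 * dword then begins before 'start', so a real implementation must know
 * those bytes are readable. */
static uintptr_t first_sampled_addr(uintptr_t start, unsigned byte_off)
{
    uintptr_t dword = (byte_off == 0)
        ? (start + 3u) & ~(uintptr_t)3u
        : start & ~(uintptr_t)3u;

    return dword + byte_off;
}
```

Starting at 0x1001, little-endian first looks at 0x1004 (skip 3) and big-endian with the backspace looks at 0x1003 (skip 2).  Rounding up on big-endian would first look at 0x1007, a skip of 6, which is more than a 5-byte tag can tolerate.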

That has got to be your optimal search pattern.

Bill

