httpd-dev mailing list archives

From Marc Slemko
Subject Re: ack re: [PATCH] performance improvement
Date Sun, 02 Feb 1997 04:42:58 GMT
On Sat, 1 Feb 1997, Dean Gaudet wrote:

> Since GET_CHAR just does a getc() and getc() is usually a macro that reads
> from a buffer... you won't gain a lot by adding another buffer. 

That's what I figured, however there is a little more than that going on in
FreeBSD.  Worth trying.  By the same token, feof and ferror are macros on
many platforms; on FreeBSD they are just:

/usr/include/stdio.h:#define    __sfeof(p)      (((p)->_flags & __SEOF) != 0)
/usr/include/stdio.h:#define    __sferror(p)    (((p)->_flags & __SERR) != 0)
/usr/include/stdio.h:#define    feof(p)         __sfeof(p)
/usr/include/stdio.h:#define    ferror(p)       __sferror(p)

I think it is still worth getting rid of them, though.

> One thing you can do is optimize find_string().  Just eye-balling the code
> it looks like find_string( "<!--#" ) is a bottleneck -- using a better
> string matching algorithm you only have to peek at every 5th character to
> see if you're in the middle of a START_SEQUENCE.  The END_SEQUENCE doesn't
> win as much, but it's still probably worth doing it (skip of 3 possible). 
> What's that algorithm called... Morris-Knuth-Pratt? 

Trouble is we don't have this in main memory to jump through anyway.  If
it were mmap()ed, this could be a definite win; then there is no copying at
all while scanning, and we should even be able to call bwrite() directly on
ranges of the mmap()ed memory without copying them out first.  I'm not sure
this is any significant bottleneck though; i.e. it could be better, but there
are things an order of magnitude worse all over the server.
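
For illustration only: the "peek at every 5th character" idea quoted above
is the skip technique usually associated with Boyer-Moore rather than
Knuth-Morris-Pratt.  Below is a minimal sketch of the Horspool variant,
assuming the document is already in memory (for example mmap()ed).  The
name horspool_find is hypothetical and is not part of mod_include.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not mod_include code: find the first occurrence of
 * pat in buf[0..n-1] using Boyer-Moore-Horspool.
 * Returns the offset of the match, or -1 if there is none. */
static long horspool_find(const unsigned char *buf, size_t n, const char *pat)
{
    size_t m = strlen(pat);
    size_t skip[UCHAR_MAX + 1];
    size_t i;

    assert(m > 0);
    if (n < m)
        return -1;

    /* A byte that never occurs in the pattern allows a full-length skip. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = m;
    /* Bytes in the pattern (all but its last position) allow a shorter one. */
    for (i = 0; i + 1 < m; i++)
        skip[(unsigned char) pat[i]] = m - 1 - i;

    i = 0;
    while (i <= n - m) {
        if (memcmp(buf + i, pat, m) == 0)
            return (long) i;
        /* Shift by the skip value of the byte aligned with the pattern end. */
        i += skip[buf[i + m - 1]];
    }
    return -1;
}
```

With a 5-byte pattern such as "<!--#", most plain-text bytes are not in
the pattern, so the scan can advance 5 bytes at a time.  Everything before
the returned offset is plain text and could be written out as one range.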

> If you're just peeking at the characters, you can avoid copying until
> you've found the tail of a "plain text" section and then do a rwrite() for
> the entire range, buffering happens beneath rwrite(), and you'll probably
> pick up enough data in plain text sections to get most of the benefit of
> your buffering approach.

Are you suggesting seeking back and forth through the file?  You still
have to copy it into memory, may as well do the copy and compare at the
same time.

A cleaner way to do my hack would be to do essentially the same thing, but
only do it in find_string().  This means you can explicitly flush before
calling anything else, so all you need are two more macros and a couple of
lines in find_string.  Hmm.
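
A minimal sketch of that idea, under stated assumptions: the output buffer
is only used inside the search routine and is always flushed before it
returns, so nothing else can write out of order.  The names outbuf_t,
outbuf_putc, outbuf_flush and find_string_buffered are hypothetical, a
plain FILE stands in for rwrite(), and the matcher uses a
Knuth-Morris-Pratt failure table so the input never has to be re-read.
This is not the actual mod_include code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define OUTBUF_SIZE 5000
#define MAX_PAT 64

/* Hypothetical sink standing in for rwrite(); it writes to a FILE. */
typedef struct {
    FILE *out;
    char buf[OUTBUF_SIZE];
    size_t used;
} outbuf_t;

static void outbuf_flush(outbuf_t *ob)
{
    if (ob->used > 0) {
        fwrite(ob->buf, 1, ob->used, ob->out);
        ob->used = 0;
    }
}

static void outbuf_putc(outbuf_t *ob, int c)
{
    if (ob->used == OUTBUF_SIZE)
        outbuf_flush(ob);
    ob->buf[ob->used++] = (char) c;
}

/* Copy bytes from in to the sink until str is seen.  The matched str is
 * consumed but not written.  Returns 0 if found, 1 on end of input. */
static int find_string_buffered(FILE *in, const char *str, outbuf_t *ob)
{
    size_t fail[MAX_PAT];
    size_t len = strlen(str), p = 0, k = 0, i;
    int c, found = 0;

    assert(len > 0 && len <= MAX_PAT);

    /* fail[i] = length of the longest proper border of str[0..i]. */
    fail[0] = 0;
    for (i = 1; i < len; i++) {
        while (k > 0 && str[i] != str[k])
            k = fail[k - 1];
        if (str[i] == str[k])
            k++;
        fail[i] = k;
    }

    while ((c = getc(in)) != EOF) {
        /* On a mismatch, emit the bytes that fall out of the match window. */
        while (p > 0 && c != (unsigned char) str[p]) {
            size_t keep = fail[p - 1];
            for (i = 0; i < p - keep; i++)
                outbuf_putc(ob, str[i]);
            p = keep;
        }
        if (c == (unsigned char) str[p]) {
            if (++p == len) {
                found = 1;
                break;
            }
        } else {
            outbuf_putc(ob, c);     /* p is 0 here: ordinary text byte */
        }
    }

    if (!found)                     /* input ended inside a partial match */
        for (i = 0; i < p; i++)
            outbuf_putc(ob, str[i]);

    outbuf_flush(ob);               /* explicit flush before returning */
    return found ? 0 : 1;
}
```

Because the flush happens on every return path, callers can keep using
their normal output calls between searches without reordering problems.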

> (Hotwired's hacked up mod_include includes output buffering, and a totally
> incompatible conditional syntax, and find_string() for START_SEQUENCE was
> written sort of how I describe above, minus the 5 character skip.  It's
> about 1/3rd the speed of the default_handler() based on an entire day of
> cpu timing data.)
>
> Dean
>
> On Sat, 1 Feb 1997, Marc Slemko wrote:
> > On Fri, 31 Jan 1997, Ed Korthof wrote:
> > 
> > > On Jan 31,  9:58pm, Rob Hartill wrote:
> > > > > > Yes, I realized that. <blush> Well, without that, the patch
> > > > > > appears to work fine --
> > > >
> > > > did you measure any performance gain ?. Every little helps. Was it too
> > > > little to notice though ?
> > > 
> > > Not yet -- I'll try this weekend or early next week.  We'll also be switching
> > > to the most recent beta, and will see if that leads to any performance
> > > improvements (I'll have a good idea if there was a net performance gain, but
> > > it'll be hard to identify which components contributed).
> > 
> > I hacked GET_CHAR in mod_include to have its own buffer and pull in data
> > in chunks of 5000 bytes; ie. return a byte from the buffer if the buffer
> > isn't empty, get more if it is.  This resulted in a very minimal
> > performance improvement.  The removal of the feof and ferror checks had a
> > similarly trivial improvement.  A large file (2-15 megs in size) with no
> > parsing expansions in it still gets sent at ~1/3rd the speed of a
> > non-parsed file, regardless of which of the above I tried.  I don't think
> > input buffering is the biggest problem, at least on FreeBSD.
> > 
> > By adding a rputc define to mod_include that did simple output buffering
> > in blocks of 5000 bytes (well, 1k blocks did around the same thing) and
> > then calling rwrite I was able to boost the speed so it was 2/3rds
> > the speed, again with no actual directives being parsed.  This
> > requires several other modifications to work properly.  For one,
> > all output needs to use the same buffering or things (the few things
> > that aren't sent with rputc) will get put in the wrong order.  The
> > second thing is that it needs to flush the buffer before exit.
> > Neither are overly complicated.
> > 
> > My hack to mod_include to do simple output buffering:
> > 
> > #define rputc(c,r) \
> >  { \
> >   if (buffered == 5000) { rwrite(outbuf, 5000, r); buffered = 0;  } \
> >   outbuf[buffered++] = c; \
> > }
> > 
> > ...with buffered and outbuf being global variables.
> > 
> > Comments?  Ugly, but...
> > 
> > 
