httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dean Gaudet <dgau...@arctic.org>
Subject [NEWTOY] flow-00
Date Sun, 07 Jun 1998 10:47:34 GMT
This is just a prototype.  But I'm already seeing 50% speed improvements
in a limited benchmark... that's on static requests across loopback (no
keepalive).  I'm comparing against apache-nspr. 

I think this technique offers even more improvement for folks building
sites which require server-side negotiation (or just multiviews mapping to
single files so you can hide the extension), authentication (in many
forms), or proxying.  I'm fairly confident that if we use this technique
then we wouldn't be restricting the generality of apache.  In fact I think
this is a more simple way to look at server configuration. 

Anyhow, 50% isn't bad for several hours of hacking...

The source is at http://www.arctic.org/~dgaudet/apache/2.0/flow-00.tar.gz.
The README is copied below.

Feel free to suggest better terminology.  I chose the term "flow" because
of a particular method of doing high speed routing with packet filtering
in which the hardware is capable of very simple pattern matching against
packets.  When a packet doesn't match any of the patterns it is punted
to software and processed "slowly".  Software hopefully will be able to
add a pattern to the hardware cache to allow future packets in the same
"flow" of data to match at the hardware level.

Dean

HTTP FLOWS

A "flow" is a pattern that maps a request to a response.  At a minimum the
pattern includes a URL.  More complex patterns are possible, such as
patterns involving the various "Accept" headers, a Cookie header, or an
Authorization header. 

Pattern matching is very simple -- only exact matches are supported.

Patterns can be broken into three categories (possibly more):

    - object -- mapping URLs to filenames
    - negotiation -- distinguishing multiple mappings for the same URL
    - authorization -- permitting or forbidding a request based on the
	presence or absence of headers

In the interest of efficiency authorization patterns will be
shared between multiple flows.  It is easiest to think of this as an
implementation of the <Directory> containers.  If an entire tree /private
is protected by Basic auth, then there will be an authorization pattern
corresponding to /private.  Conceptually the pattern is essentially
a list of permitted Authorization headers, and request containing an
Authorization header in the list is permitted.

It is not feasible in general to pre-calculate all the flows for
a website.  In fact that's not interesting.  What is interesting is
dynamic calculation of flows.  Let's call a "flow cache" a set of flows
that describe a subset of the responses for a website.  If a flow is
in the cache then we know for certain what to respond with, and we can
immediately send the response without further processing.  If a flow
isn't in the cache then we need to perform more expensive processing,
and while doing that we may add a flow to the cache to handle future
requests matching similar patterns.

Consider this httpd.conf fragment:

    DocumentRoot /www/htdocs

    <Directory /www/htdocs/private>
	BrowserMatch Mozilla/2 go_away
	order allow,deny
	allow from all
	deny from env=go_away
    </Directory>

The directory container blocks access by User-Agent's containing
"Mozilla/2".  (Or something like that, can't be bothered to look up
the syntax.)

Initially the flow cache is empty.

Suppose we get a request for "/".  The flow processor can't find any
match, so it punts.  The full request processing is performed -- this is
pretty much the same as the current Apache API.  During this process we
calculate that the url "/" maps to the file "/www/htdocs/index.html",
and there are no alternatives to negotiate, and no authorization
to restrict access.  So we insert a flow with the pattern "/ ->
/www/htdocs/index.html" into the flow cache.

Actually we wouldn't quite insert that.  If we're running on WIN32
we'd insert a pattern "/" -> file_handle, where file_handle is an open
read-only handle for /www/htdocs/index.html.  This way we could call
TransmitFile directly.  Similarly on unix we'd use mmap().  Or maybe we'd
actually map it to a database object cached in memory... it's essentially
a mapping to a handler and an object.  Also as part of this map we insert
cached content-length, last-modified, and content-type values used to
build the response headers.

Similar when we get requests for "/logo.gif", or "/warez.txt" we can
insert them into the cache.

When a request comes for "/private/" we have to build an authentication
pattern.  An auth pattern is a list of headers and the values they must
match.  Suppose this first request for "/private/" comes from a mozilla/2
client.  We build an auth pattern for <Directory /www/htdocs/private>,
indicating that User-Agent must match one of {} ... must match an empty
list.  i.e. it will never match.  We will also build the "/private/"
-> /www/htdocs/private/index.html mapping, and it will refer to the
<Directory /www/htdocs/private> auth pattern.

If we then get a request for "/private/" from a "Lynx/2.6 libwww-FM/2.14"
client, it won't match in the flow cache.  But we will permit
it during the full request processing... and then we can insert
"Lynx/2.6 libwww-FM/2.14" into the User-Agent list for <Directory
/www/htdocs/private>.  Subsequent requests from that same user agent
to /private/ will match in the flow-cache and be served without full
processing.

We can continue this way, adding urls such as "/private/part.gif",
and "/private/memo.html".  They all share the same auth pattern, which
means they share updates to the list of permitted User-Agents.

Similar techniques can be used to build negotiation patterns.  We take
the simple approach in every case.  Rather than cache complex expressions
(such as the list of all negotiation possibilities for an object, with
q-values and whatever else), we cache the exact set of headers which are
acceptable and match them.  Anything which doesn't match goes through
the full processing (and that will insert a record to help match the
next time).  Similar to auth patterns, some effort should be put into
"coalescing" negotiation patterns.

For example, Navigator sends the same Accept/Accept-Language headers
on almost every request (the exact contents can be modified by the
user...).  We only need to recognize that set of headers, and then
any server negotiation involving those headers can refer to the
common pattern.

It's my feeling that a large percentage of requests fit into the above
mold.  There's some obvious extensions too, such as negative caching --
for example, a pattern denying an IP address could be a map to an error
handler and an error code.

WHY WHY WHY?

Why do we want to do all this?  Speed.  The prototype implementation
here is 40% to 50% faster than apache-nspr... this was with a hardwired
test across loopback, but without keep-alive.

Here's two reasons I can think of for the speed increase:

- Simplistic hand coded parser.  The parser doesn't handle all cases,
  it shouldn't have to.  It bails to the full request parser in any
  case it can't handle.  In fact it's so simple that it doesn't
  even have buffered i/o -- if the request isn't in the first packet
  sent from the client it bails and passes it down to full request
  parsing.

- Fewer threads.  The prototype has two threads.  One accepts
  connections, reads requests, and parses them.  The other handles
  sending responses.  Fewer threads means fewer context switches.
  But more importantly it means less memory required for things such
  as stacks... and that results in less L1/L2 cache impact... which
  can have a dramatic effect on performance.

CAVEATS

The prototype has a pre-seeded flow cache.  In practice there are
locking issues involved with seeding the flow cache.  Using a message
queue will allow us to batch updates and hopefully incur less locking
overhead.

The prototype and the discussion above don't deal with cache coherency.
This is an area I'm still working on.  A rough first approximation is
to flush the entire cache periodically.  There's a bunch of other
options... each pattern has different criteria for coherency.

Obviously doing this will require API changes, and configuration language
changes.  The above example would be extremely difficult to "compile"
into the form required for the flow cache.  But I think we can make a
user-friendly language that is easily compilable into flows on the fly.


Mime
View raw message