httpd-dev mailing list archives

From Dean Gaudet
Subject Re: [NEWTOY] flow-00
Date Sun, 07 Jun 1998 19:39:03 GMT

On Sun, 7 Jun 1998, Ben Laurie wrote:

> Dean Gaudet wrote:
> [snip loads of cool stuff]
> > We can continue this way, adding urls such as "/private/part.gif",
> > and "/private/memo.html".  They all share the same auth pattern, which
> > means they share updates to the list of permitted User-Agents.
> This makes me nervous. How do you know they share the same auth pattern?

Hmm.  Ignore .htaccess files for now (not too hard to pull into this
scheme).  Let U be the set of all dir/loc/file containers.  Remember that
dir/loc/file have an ordering.  During merge we construct a subset S of U. 
Since the containers are ordered, S uniquely identifies the containers
that apply to this request... and any other request that has the same S
can share the same auth pattern. 

We need an efficient representation for S. 
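
One possible representation, as a rough sketch only: number the containers
in U in their merge order and store S as an integer bitmask, so that two
requests with the same S hash to the same auth pattern. The Universe class,
the pattern_for helper, and the container names below are hypothetical and
exist only for illustration; nothing here is taken from the prototype.

```python
# Hypothetical sketch: containers in U are numbered in merge order,
# and S is stored as an integer bitmask over those numbers.

class Universe:
    def __init__(self):
        self.index = {}  # container name -> bit position

    def add(self, name):
        # Assign the next free bit position (or return the existing one).
        return self.index.setdefault(name, len(self.index))

    def subset(self, names):
        # Build the bitmask S for the containers that applied during merge.
        mask = 0
        for name in names:
            mask |= 1 << self.index[name]
        return mask


u = Universe()
for container in ["<Directory />", "<Directory /private>", "<Location /status>"]:
    u.add(container)

auth_patterns = {}  # bitmask S -> shared auth pattern (a plain dict here)


def pattern_for(applied):
    # Requests with the same S get the very same pattern object.
    s = u.subset(applied)
    return auth_patterns.setdefault(s, {"S": s})


a = pattern_for(["<Directory />", "<Directory /private>"])  # e.g. /private/part.gif
b = pattern_for(["<Directory /private>", "<Directory />"])  # e.g. /private/memo.html
c = pattern_for(["<Directory />"])                          # some other URL
```

Because the containers are ordered, the bitmask alone identifies S, and the
order in which the names are listed at lookup time does not matter.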

For .htaccess files we parse and cache them as we see them (this is
caching in the back-end webserver).  This lets us assign them to the
universe U, and lets us build auth patterns based on them.  If the
htaccess cache needs to have a record removed, we need to update the flow

> Suppose I have a module installed that does cunning auth based on the
> URL in a way that doesn't use Directory/Location et al.? Or that
> processes any headers in a non-orthogonal way.

Well the flow cache contains only URLs that have been seen before.  It
would be the module's responsibility to either negatively cache the URL
(indicating it should always bounce to full processing), or to cache the
URL along with an auth pattern that has exactly the set of allowed

If a request comes in with a header Foobar that's not listed in the auth
pattern it's bounced... so suppose we have a way for the core to ask the
question "hey, anyone interested in the Foobar header?"  If nobody speaks
up, the core can put the header into the auth pattern saying "any value
acceptable".  This is what I expect to happen by default with things such
as User-Agent.
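
A minimal sketch of that bounce-or-wildcard logic, under assumptions: the
registry dict stands in for the "anyone interested?" question, and the names
check, learn_header, ANY, and BOUNCE are hypothetical, not an existing Apache
API. A value that is listed but does not match is also bounced here, which is
my reading of "simple exact matches".

```python
ANY = object()     # wildcard marker: any value is acceptable
BOUNCE = "bounce"  # fall back to full request processing
HIT = "hit"        # request is covered by the cached auth pattern


def check(pattern, headers):
    # pattern maps lower-cased header name -> ANY or a set of exact values.
    for name, value in headers.items():
        allowed = pattern.get(name.lower())
        if allowed is None:
            return BOUNCE  # header not listed in the auth pattern
        if allowed is not ANY and value not in allowed:
            return BOUNCE  # listed, but this exact value is not permitted
    return HIT


def learn_header(pattern, header, registry):
    # registry maps header name -> modules that claimed interest in it.
    # If nobody speaks up, record the header as "any value acceptable".
    if not registry.get(header.lower()):
        pattern[header.lower()] = ANY
        return True
    return False


pattern = {"host": {"www.example.com"}}
request = {"Host": "www.example.com", "User-Agent": "Mozilla/4.0"}

first = check(pattern, request)                       # User-Agent not listed yet
learned = learn_header(pattern, "User-Agent", {})     # no module is interested
second = check(pattern, request)                      # now covered by wildcard
claimed = learn_header(pattern, "Foobar", {"foobar": ["mod_example"]})
```

In this sketch a header that some module claims is left out of the pattern,
so requests carrying it keep bouncing to full processing.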

> And - small cache. At what point does the cost of searching the cache
> exceed the cost of evaluating things in the standard way?

Right.  I'm not sure.  All strings are opaque objects in this cache --
since patterns are simple exact matches (or a wildcard "anything matches")
we can aggressively hash the strings.  The flow "engine" doesn't need to
analyse any string contents, it only needs to hash things.  So it's really
just a standard string -> integer lookup problem.

Suppose we used B trees (or extensible hashes, doesn't matter).  What I'm
thinking is to have a single B tree for each header that contains the
union of all the values in all the auth patterns (this is an optimization
to save memory, localize data, and amortize the cost of tree maintenance).
The B tree is a mapping from string to unique integer.  Then the auth
patterns are each a set of acceptable integers. 
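
To make the shape concrete, here is a small sketch of the per-header mapping.
It uses a Python dict where the text says B tree (a swapped-in structure,
chosen only to keep the example short), and the HeaderTable class, the
matches helper, and the sample User-Agent strings are hypothetical.

```python
class HeaderTable:
    """One string -> unique integer map per header (dict stands in for a B tree)."""

    def __init__(self):
        self.ids = {}

    def intern(self, value):
        # Return the existing id, or assign the next unused integer.
        return self.ids.setdefault(value, len(self.ids))

    def lookup(self, value):
        # None means the value appears in no auth pattern at all.
        return self.ids.get(value)


tables = {"user-agent": HeaderTable()}
ua = tables["user-agent"]

# Each auth pattern holds only a set of acceptable integers per header;
# the strings themselves are stored once, in the shared table.
pattern_a = {"user-agent": {ua.intern("Mozilla/4.0"), ua.intern("Lynx/2.8")}}
pattern_b = {"user-agent": {ua.intern("Mozilla/4.0")}}


def matches(pattern, header, value):
    vid = tables[header].lookup(value)
    return vid is not None and vid in pattern[header]
```

The string is hashed once per request header; after that every pattern check
is an integer set membership test, and a value shared by several patterns
costs memory only once.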

But you're right, this is where I need to improve the prototype next... 

