httpd-dev mailing list archives

From Nick Kew
Subject cache trouble (Re: [vote] 2.1.9 as beta)
Date Thu, 03 Nov 2005 10:01:23 GMT
On Wednesday 02 November 2005 20:26, William A. Rowe, Jr. wrote:
> Colm MacCarthaigh wrote:
> > I think the text "Deny from all" is a particularly dangerous thing to
> > have not work as advertised! No matter how well documented :/

Nasty.  Is it necessarily a showstopper?

> The question though, is where can Deny from all be expected to work?
> Certainly not in <Directory /foo> - the cached entity no longer lives
> there.

I disagree.  If it came from there originally, then that's where it lives.
The principle we want here is that retrieving from cache should have
the same rules from a client PoV as retrieving an original unless a
sysop explicitly says otherwise.  Breaking that principle shouldn't
hit the sysop as a standard or default behaviour.
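
A minimal sketch of that principle, in Python rather than httpd's C, and
purely illustrative: make_server, is_allowed and backend are hypothetical
names, not anything in mod_cache.  The access check runs on every request,
so a cache hit can never be served to a client the original would refuse.

```python
from typing import Callable, Dict, Tuple


def make_server(
    is_allowed: Callable[[str, str], bool],
    backend: Callable[[str], str],
) -> Callable[[str, str], Tuple[int, str]]:
    """Build a request function whose cache sits behind the access check."""
    cache: Dict[str, str] = {}

    def serve(path: str, client_ip: str) -> Tuple[int, str]:
        # Access rules are evaluated first, whether or not the entity is cached.
        if not is_allowed(path, client_ip):
            return 403, ""
        # Only an authorised request may be answered from the cache.
        if path in cache:
            return 200, cache[path]
        # Cache miss: go to the slow backend and remember the result.
        body = backend(path)
        cache[path] = body
        return 200, body

    return serve
```

With this ordering the backend is still spared on a hit; what is not spared
is the cost of the access check itself.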

> Perhaps in <Location /foo> - but running the full handlers, dealing with
> all the regex'es all over again defeats the purpose of running a fast
> cache.

I'm not convinced by that either.  In fact, I dislike the whole "run it in a
quick handler" principle - it runs a supertanker through the KISS principle,
and has consequently left us with a cache that never really worked.
Even if we fix this, it's sure to have a high bug rate for the foreseeable
future precisely because it violates KISS.

The main purpose of caching is to relieve the pressure on a big, slow backend.
In real life, most of that bigness and slowness is pretty much always going to 
live in a handler or a backend, so what we save by running quick_handler is
inherently of secondary importance.  And there are several things we can
do about that if we make cache a normal handler:
  * provide quick versions of early hooks.  For example, an authn that looks 
    up cached headers and thus bypasses any potential trip to DBD or LDAP.
    Similarly we may be able to bypass rewriterules, content negotiation,
    or any trip to htaccess based on cache lookup matching; maybe your
    <CachedLocation> proposal.
    In a sense, that's modularising the "quick handler" concept!
  * write a "caching performance" doc that discusses the issue, and makes
    clear the effect of anything complex in a hook that can't be bypassed.
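
The first bullet might look roughly like the following.  This is a sketch in
Python, not httpd code: QuickAuthn and slow_check are hypothetical names, the
slow check stands in for a DBD or LDAP lookup, and a real version would need
expiry and a decision on whether to remember failures at all.

```python
import hashlib
from typing import Callable, Dict


class QuickAuthn:
    """Remember authn verdicts so repeat requests skip the slow backend."""

    def __init__(self, slow_check: Callable[[str, str], bool]) -> None:
        # slow_check is an assumed stand-in for a DBD/LDAP round trip.
        self._slow_check = slow_check
        self._verdicts: Dict[str, bool] = {}

    @staticmethod
    def _key(user: str, password: str) -> str:
        # Store a digest rather than the credentials themselves.
        return hashlib.sha256(f"{user}\0{password}".encode("utf-8")).hexdigest()

    def check(self, user: str, password: str) -> bool:
        key = self._key(user, password)
        # Quick path: a verdict seen before is returned without the backend.
        if key in self._verdicts:
            return self._verdicts[key]
        # Slow path: ask the backend once and remember what it said.
        verdict = self._slow_check(user, password)
        self._verdicts[key] = verdict
        return verdict
```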

> Certainly in <VirtualHost> ... although authnz
> doesn't work correctly there in the first place ;-)

I take it that just refers to standard per-dir-config behaviour?  A directive
that's not even syntactically valid outside a directory-context can be
forgiven for not working there!

> And certainly globally, if I ran a large mass vhost, yet knew full well
> that a list of proxies would corrupt my content, I might
>    Deny from
> but again, authn/authz doesn't work globally.

Making it do so is a feature-enhancement, not a bugfix.  And we'd need to
have a proposal for implementation without undue complexity to consider
such an enhancement.

> We can discuss 'enabling' the map to storage for <Location > and running
> the authz stack, but we would have to ensure we bypass the filesystem
> dir/files entities.  The deepest relevant level is <Location >.

In the present architecture that makes some sense.  But having chopped out
the hooks, adding them back in piecemeal seems reminiscent of Heath Robinson.

> And maybe, have you considered a <CachedLocation > / <CachedLocationMatch >
> container for mod_cache?  This would have the benefit that very long lists
> of directives would be ignored/not merged, in favor of a much shorter and
> very specific list that benefits the cache by keeping it fast, while giving
> the user the option to tweak the behavior of content, once cached.

You mean as a tool for sysops to accept/decline serving from cache?
That could potentially have merit, and would work best in a quick-translation
hook to bypass any more complex/expensive rules.
The danger is if it grows some nightmarishly confusing relationship
to normal <Location> semantics: the existing <Location> vs <Directory>
is bad enough for non-expert users!

Nick Kew
