httpd-dev mailing list archives

From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject Re: Optimizing dir_merge()
Date Tue, 11 Sep 2001 04:00:09 GMT
From: "dean gaudet" <dgaudet-list-new-httpd@arctic.org>
Sent: Monday, September 10, 2001 10:39 PM


> On Wed, 15 Aug 2001, Brian Pane wrote:
> 
> > William A. Rowe, Jr. wrote:
> >
> > >Here's my take on the dir_merge patch I offered up.  Please review for sanity.
> >
> > There's one point at the end where I disagree, though it may be due to
> > a bad assumption on my part.  Here's the item in question:
> >
> > >Caching Considerations
> > >----------------------
> > >The key point within Apache is; a given per-dir config cannot be trusted after a
> > >subsequent merge_dir_configs callback in the same pool.
> >
> > I wouldn't have drawn this conclusion.  If your per-dir config is being
> > passed as the 'base' to a subsequent merge_dir_config callback, it should
> > be unharmed afterward because the callback must treat the base as const.
> > It's okay for the callback to use a very liberal definition of constness
> > if needed for performance reasons; e.g., it can increment a reference
> > count in the base in support of copy-on-write logic.
> 
> writing base is a bad idea in the prefork case, because most of the base
> entries (i.e. all from httpd.conf) are in CoW regions shared with all
> other child processes/parent.  when you start writing you multiply that
> cost by the number of children.  this nails your caches with lots of
> essentially duplicated data, and you'll lose any benefit is my guess.

Well - understand first that this proposal is DOA (I'll explain in a bit.)  But to
humor the discussion - a base would be modified if and only if the pool we merge
into is the same pool that base was allocated in.

This solves the problem: we would never modify a true config base (shared between
processes), we would only overwrite a previously merged 'product'.  E.g. the configs
of dir "/", "/foo" and "/foo/bar" were allocated in the process (pconf) pool, so
they can't be overwritten.  If we merge "/" and "/foo" into the r->pool (typical
during a request) we create a dir conf within r->pool.  Merging "/foo/bar" on top of
that, we -could- have overwritten the r->pool copy without CoW issues.
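
To make the pool test concrete, here is a minimal sketch.  Everything in it is a toy
stand-in, not the real APR pool or httpd per-dir config types: `pool`, `dir_conf`,
`merge_dir_conf` and the `scratch` argument (which plays the role of a fresh
allocation from the target pool) are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real apr_pool_t or httpd config structs. */
typedef struct pool { const char *name; } pool;

typedef struct dir_conf {
    pool *owner;    /* pool this config was allocated in */
    int   options;  /* toy payload: a set of option bits */
} dir_conf;

/* Merge addv over base, targeting pool p.  `scratch` stands in for a new
 * allocation from p.  base is overwritten only when it already lives in p;
 * otherwise base and addv are treated as const and scratch is filled in. */
static dir_conf *merge_dir_conf(pool *p, dir_conf *base,
                                const dir_conf *addv, dir_conf *scratch)
{
    dir_conf *out;

    assert(p && base && addv && scratch);
    out = (base->owner == p) ? base : scratch;
    out->options = base->options | addv->options;
    out->owner = p;
    return out;
}
```

With "/" and "/foo" owned by pconf, the first merge into r->pool fills the scratch
entry and leaves "/" alone; merging "/foo/bar" over that product can then reuse the
same entry.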

> however, what you can probably get away with is putting a pointer to a ref
> counter into base... and allocate an array of ref counts.  then you're
> only duplicating that one page.

Again, that wouldn't be a problem: if the pool differs, base and addv are still const.

> but actually i'm not sure why you need ref counts to begin with -- because
> i think that it is always the case that base is from an ancestor pool of
> the mergee... so base should exist at least as long as the mergee does.

Exactly.  I never proposed to modify the pconf ancestors, only siblings in the
same pool that we were targeting the merged sections into.

> > But from the perspective of the caller, passing a cached per-dir config
> > as the base to a merge_dir_configs callback shouldn't change the
> > semantics of the data in that per-dir config.
> >
> > What am I missing?
> 
> you're not missing anything...
> 
> the merge functions are essentially supposed to be idempotent.

And so they remain.

My location_walk optimization (which suffers a potential bug, per our svn friends)
takes an entirely different tack, which renders that whole idea DOA.

The location_walk optimization records -every- intermediate merge.  That way, we
can review this saved list for matching, successful, prior merges in our own or
our parent or internal referrer's request.  If they all match, fantastic.  If they
differ somewhat, we will take what we can get, and save, say, half the merges.
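
Roughly, the bookkeeping looks like the sketch below.  This is an illustration under
assumptions, not the committed code: `section`, `walk_entry`, `walk_cache`,
`reusable_prefix` and `walk` are hypothetical names, and the "merge" is just an OR
of option bits.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WALK 16

/* Hypothetical toy section; real sections carry per-module configs. */
typedef struct section { int options; } section;

/* One recorded step: which section matched, and the merged result so far. */
typedef struct walk_entry {
    const section *matched;
    int merged;
} walk_entry;

typedef struct walk_cache {
    walk_entry steps[MAX_WALK];
    size_t count;
} walk_cache;

/* How many leading sections of this walk match the prior walk, in order. */
static size_t reusable_prefix(const walk_cache *prior,
                              const section *const *secs, size_t n)
{
    size_t i = 0;
    while (i < n && i < prior->count && prior->steps[i].matched == secs[i])
        i++;
    return i;
}

/* Walk the sections, reusing whatever prefix the prior walk already merged.
 * Records every intermediate merge in cur; reports merges actually done. */
static int walk(const walk_cache *prior, walk_cache *cur,
                const section *const *secs, size_t n, size_t *merges_done)
{
    size_t i, keep;
    int merged = 0;

    assert(cur && secs && merges_done && n <= MAX_WALK);
    keep = prior ? reusable_prefix(prior, secs, n) : 0;
    for (i = 0; i < keep; i++)
        cur->steps[i] = prior->steps[i];
    if (keep > 0)
        merged = cur->steps[keep - 1].merged;

    *merges_done = 0;
    for (i = keep; i < n; i++) {
        merged |= secs[i]->options;
        cur->steps[i].matched = secs[i];
        cur->steps[i].merged = merged;
        (*merges_done)++;
    }
    cur->count = n;
    return merged;
}
```

A subrequest that shares its first two sections with the parent request then pays
for only the merges past that point.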

In the long run, I expect this will prove more effective, overall, than Brian's or
my original ideas for caching.  NOT that it wouldn't be good to maintain a merged
cache with a lifetime, say, of 60 seconds.  That's short enough that admins won't
be frustrated with their .htaccess sections not updating.  And that expiry time
could be adjustable.
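
The freshness test for such a cache is small.  A sketch only, assuming a
hypothetical `merged_cache_entry` and `cache_fresh`; nothing like this exists in the
tree today, and the max age is passed in so it can be tuned:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical cache entry; the payload is a toy stand-in. */
typedef struct merged_cache_entry {
    int    merged;     /* toy merged result */
    time_t stored_at;  /* when the merge was cached */
    int    valid;      /* nonzero once the entry has been filled */
} merged_cache_entry;

/* An entry is usable only while it is younger than max_age_secs. */
static int cache_fresh(const merged_cache_entry *e, time_t now,
                       double max_age_secs)
{
    assert(e);
    return e->valid && difftime(now, e->stored_at) < max_age_secs;
}
```

A caller would pass 60.0 for the lifetime suggested above and redo the merge
whenever this returns 0.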

But for what it's worth, I like the location_walk optimization much better than
this original 'merge into self' idea - since the subreq and internal redirects gain
a HUGE advantage by reusing much of the parent/internal referrer's earlier effort.

Once this optimization is dropped into dir_walk, we will never again go through the
entire directory_walk just because there is a 'directory' within a mod_autoindex listing ;)

Bill

