httpd-dev mailing list archives

From "Graham Leggett" <>
Subject Re: mod-cache-requestor plan
Date Mon, 11 Jul 2005 10:15:25 GMT
Parin Shah said:

> When the page expires from the cache, it is removed from cache and
> thus next request has to wait until that page is reloaded by the
> back-end server.

This is not strictly true: when a page expires from the cache, a
conditional request is sent to the backend server. If a fresher version
is available the cache is updated, otherwise the existing cache contents
are left alone. Room was left in the original cache design for serving
multiple requests for the same non-fresh URL without fetching the backend
URL many times, but this has not yet been implemented.
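
To make the revalidation step concrete, here is a rough sketch in Python
rather than httpd C. It is not mod_cache code: CacheEntry, revalidate, the
fetch callback and the fixed ttl are all hypothetical names and
simplifications chosen for illustration. It only shows the idea that a 304
keeps the cached body while a 200 replaces it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CacheEntry:
    """Hypothetical cached object; not a mod_cache structure."""
    body: bytes
    etag: Optional[str]
    last_modified: Optional[str]
    expires_at: float


# Hypothetical stand-in for the backend:
# fetch(url, request_headers) -> (status, response_headers, body)
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


def revalidate(url: str, entry: CacheEntry, fetch: Fetch,
               now: float, ttl: float) -> CacheEntry:
    """Send a conditional request for an expired entry.

    Assumption: the new lifetime is a fixed `ttl`; a real cache would
    derive it from the response headers.
    """
    headers: Dict[str, str] = {}
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified

    status, resp_headers, body = fetch(url, headers)

    if status == 304:
        # Backend says nothing changed: keep the body, extend the lifetime.
        return CacheEntry(entry.body, entry.etag, entry.last_modified,
                          now + ttl)
    if status == 200:
        # Fresher version available: replace the cached contents.
        return CacheEntry(body, resp_headers.get("ETag"),
                          resp_headers.get("Last-Modified"), now + ttl)
    # Any other status: leave the existing entry alone in this sketch.
    return entry
```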

The option to guarantee freshness of the cache is a very useful feature.

> Here is the overview of how am I planning to implement it.
> 1. when a page is requested and it exists in the cache, mod_cache
> checks the expiry time of the page.
> 2. If (expiry time – current time)  < Some_Constant_Value,
> then mod-cache notifies mod_cache_requester about this page.
> This communication between mod_cache and mod_cache_requester should
> incur least overhead as this would affect current request's response
> time.

There are two approaches to this:

- Cache freshness of a URL is checked on each hit to the URL. This runs
the risk of allowing non-popular (but possibly expensive) URLs to expire
without the chance to be refreshed.

- Cache freshness is checked in an independent thread, which monitors the
cached URLs for freshness at predetermined intervals, and updates them
automatically and independently of the frontend.

Either way, it would be useful for mod_cache_requester to operate
independently of the cache serving requests, so that "cache freshening"
doesn't slow down the frontend.

I would vote for the second option - a "cache spider" that keeps it fresh.
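
As an illustration of what one pass of such a spider might look like, here
is a small Python sketch. The names (entries_to_refresh, spider_pass,
margin) and the dictionary of expiry times are assumptions made for the
example; the real thing would query the cache backends through mod_cache's
hooks instead. The test used is the one from the proposal: an entry is due
when (expiry time - current time) is below some margin.

```python
from typing import Callable, Dict, List


def entries_to_refresh(expiry_by_url: Dict[str, float],
                       now: float, margin: float) -> List[str]:
    """Return URLs that expire within `margin` seconds of `now`.

    Already-expired entries are included too. The result is ordered with
    the soonest expiry first.
    """
    due = [(expires_at, url)
           for url, expires_at in expiry_by_url.items()
           if expires_at - now < margin]
    return [url for _, url in sorted(due)]


def spider_pass(expiry_by_url: Dict[str, float], now: float, margin: float,
                refresh: Callable[[str], None]) -> int:
    """Run one scan and call `refresh` (hypothetical) for each due URL.

    Returns the number of URLs handed to `refresh`. A spider would call
    this once per predetermined interval.
    """
    urls = entries_to_refresh(expiry_by_url, now, margin)
    for url in urls:
        refresh(url)
    return len(urls)
```

Because the scan walks every cached URL rather than waiting for a hit,
unpopular but expensive URLs get refreshed as well.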

> 3. mod_cache_requester will re-request the page which is soon-to-expire.
> Each such request is done through separate thread so that multiple
> pages could be re-requested simultaneously.

Once mod_cache_requester has decided that a URL needs to be "freshened",
all it needs to do is to make a subrequest to that URL, setting the
relevant Cache-Control headers to tell it to refresh the cache, and let
the normal caching mechanism take its course.

Putting the subrequests into separate threads isn't necessarily a good
idea, as you don't want to put a sudden simultaneous load onto the backend
server, or take up too much processing power of the frontend itself. You
also probably want to keep things simple.
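
A simple alternative to one thread per URL is a single worker that
refreshes URLs one at a time with a pause in between. The Python sketch
below shows the shape of it. It uses an ordinary HTTP request object as a
stand-in for an httpd subrequest, and the function names, the `send`
callback and the delay are all made up for the example. "max-age=0" is one
Cache-Control request directive that asks a cache to revalidate; which
directive mod_cache should be given is an assumption here.

```python
import time
import urllib.request
from typing import Callable, Iterable, List


def build_refresh_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks caches on the path to revalidate."""
    req = urllib.request.Request(url, method="GET")
    # Assumption: max-age=0 is the directive used to trigger a refresh.
    req.add_header("Cache-Control", "max-age=0")
    return req


def refresh_sequentially(urls: Iterable[str],
                         send: Callable[[urllib.request.Request], None],
                         delay_seconds: float,
                         sleep: Callable[[float], None] = time.sleep
                         ) -> List[str]:
    """Refresh URLs one after another, pausing between requests.

    `send` is a hypothetical callback that actually issues the request.
    Only one request is in flight at a time, so the backend never sees a
    burst of simultaneous refreshes.
    """
    refreshed: List[str] = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # pace the load on the backend
        send(build_refresh_request(url))
        refreshed.append(url)
    return refreshed
```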

> This request would force the server to reload the content of the page
> into the cache even if it is already there. (this would reset the
> expiry time of the page and thus it would be able to stay in the cache
> for longer duration.)

The cache code should already do this.

> Please let me know what you think about this module. Also I have some
> questions  and your help would be really useful.
> 1.what would be the best way for communication between mod_cache and
> mod_cache_requester.  I believe that keeping  mod_cache_requester in a
> separate thread would be the best way.

mod_cache_requester will need access to the backend caches so that it can
query freshness. This is done through the same hooks that are made
available for mod_cache to query the backends.

Firing off a separate thread/process for mod_cache_requester can be done
when the server starts up and the module is initialised. However, keep in
mind some of the limitations of threads and processes:

- If the platform supports threads, then you can monitor the disk cache,
the memory cache, and the shared memory cache.
- If the platform supports processes, then you can monitor the disk cache
and shared memory cache only.

> 2.How should the mod_cache_requester send the re-request to the main
> server.

You fire off a subrequest to a URL, and throw away the data that comes back.

For some example code, look at mod_include.
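
The "throw away the data" part amounts to reading the response to the end
without keeping it. A minimal Python sketch of that idea follows; it is
not httpd code, and `drain` and its chunk size are hypothetical. The point
is only that the body is consumed in small chunks, so memory use stays
flat however large the page is, while the cache in front of the backend
still sees the whole response.

```python
import io
from typing import BinaryIO


def drain(response: BinaryIO, chunk_size: int = 8192) -> int:
    """Read a response body to the end, discarding the data.

    Returns the number of bytes that were read and thrown away.
    """
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break  # end of body
        total += len(chunk)
    return total


if __name__ == "__main__":
    # An in-memory stream stands in for a real response body.
    print(drain(io.BytesIO(b"x" * 20000)))
```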

> 3.Other than these questions, any suggestion/correction is welcome.
> Any pointers to the details of related modules( mod-cache,
> communication between mod-cache and backend server) would be helpful
> too.

Keep in mind that mod_cache is a framework, into which sub-modules are
plugged to do the work of the backend caching.

mod_cache_requester would probably be a submodule of mod_cache, using
the hooks that mod_cache provides to query elements in the cache.

