httpd-dev mailing list archives

From Brian Akins
Subject mod_cache wishlist
Date Tue, 23 Aug 2005 12:42:48 GMT
With the talk of "direction" of Apache, I thought I, as an end user and 
developer, would offer my "wish list" for mod_cache.  We have been using 
squid for various things, but are now mostly using Apache plus our own 
custom cache module.  Our module has grown to support most of the "cool" 
features of squid we liked and many other features.  Unfortunately, I 
can't just post the code :(

I would like to standardize on the "stock" Apache mod_cache for various 
reasons (bug fixes, mind share, etc.)  However, we now have services 
that depend upon the features of our cache module.  I think some of the 
ideas we have implemented are useful to others as well.

I am willing to code if there is hope of some of it being committed.  We 
would really like to use stock mod_cache and are willing to submit 
patches so that it can meet our requirements.

Finally, the stock mod_cache is about 15% slower than our module for 
files around the 10k range.  With larger files, it easily fills a gig 
interface.  However, with very small files (<4k), it slows down 
considerably.  I am working to find the source of the slowdown.

Anyway, here's my list:

My wish list for mod_cache.c

Deterministic temp files to avoid "thundering herd":

Especially Colm's comments:

Content definitely should not be served from the cache after it has
expired imo. However I think an approach like;

     if((now + interval) > expired) {
         if(!stat(tmpfile)) {

ie "revalidate the cache content after N-seconds before it is due to be
expired" would have the same effect, but avoid serving stale content.



This is necessary in the case of reverse proxies, not that useful for 
normal proxies.
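
A minimal sketch of the idea, assuming POSIX open() with O_CREAT | O_EXCL 
is used to let exactly one request claim the refresh.  The names 
in_refresh_window() and try_claim_refresh() and the ".tmp" suffix are 
hypothetical and are not taken from mod_disk_cache:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Colm's test: true once we are within `interval` seconds of expiry. */
static int in_refresh_window(time_t now, time_t interval, time_t expires)
{
    return (now + interval) > expires;
}

/*
 * Hypothetical helper.  The temp file name is derived from the cache file
 * name, so every request for the same object computes the same path.
 * O_CREAT | O_EXCL makes creation atomic: only one caller gets the file.
 *
 * Returns an open fd (>= 0) if this caller should refresh the object,
 * -1 with errno == EEXIST if another request is already refreshing it,
 * and -1 with some other errno on failure.
 */
static int try_claim_refresh(const char *cache_path,
                             char *tmp_path, size_t tmp_len)
{
    int n;

    assert(cache_path != NULL && tmp_path != NULL);

    n = snprintf(tmp_path, tmp_len, "%s.tmp", cache_path);
    if (n < 0 || (size_t)n >= tmp_len) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return open(tmp_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

A request that gets EEXIST would keep serving the still-valid cached copy 
while the winner revalidates.  A real implementation would also need to 
deal with a temp file left behind by a crashed refresher, for example by 
checking its age.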


Replace current CacheEnable/Disable with a more squid-like approach:


Configurable cache statuses.  I know, for example, that I want to cache 
404's and 302's in my reverse proxy setup.
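
As a rough sketch of what the check could look like, assuming the admin's 
list of statuses ends up in a small per-server structure.  The type 
cacheable_status_conf and the function status_is_cacheable() are 
hypothetical names, not existing mod_cache code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-server setting: the statuses the admin allows. */
typedef struct {
    const int *statuses;   /* e.g. {200, 302, 404} */
    size_t count;
} cacheable_status_conf;

/* Returns 1 if a response with this status may be stored, 0 otherwise. */
static int status_is_cacheable(const cacheable_status_conf *conf, int status)
{
    size_t i;

    assert(conf != NULL);

    for (i = 0; i < conf->count; i++) {
        if (conf->statuses[i] == status) {
            return 1;
        }
    }
    return 0;
}
```

The list is short, so a linear scan is probably fine on the store path.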


Ability to query cache objects from other modules.  For example,

apr_size_t len;

ap_cache_query(r, CACHE_SIZE, &len);

would give me the size of the cached object (headers + data), and 
something like:

const char *file;
ap_cache_query(r, CACHE_HEADERS_FILE, &file);

would give me the name of the file on disk for the headers file for 
mod_disk_cache, but NULL for mod_mem_cache.c

Interface could be similar to libcurl's curl_easy_getinfo():
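
A self-contained sketch of that calling convention, with one variadic 
function whose last argument is a pointer whose type depends on the item 
requested.  To keep it standalone it uses a stand-in cache_object struct 
and size_t in place of request_rec and apr_size_t, and the name 
cache_query() is hypothetical:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Items that can be queried; the names follow the examples above. */
enum { CACHE_SIZE, CACHE_HEADERS_FILE };

/* Stand-in for whatever the cache provider knows about an object. */
typedef struct {
    size_t size;               /* headers + data */
    const char *headers_file;  /* NULL for a memory-backed provider */
} cache_object;

/*
 * The third argument must be:
 *   CACHE_SIZE          -> size_t *
 *   CACHE_HEADERS_FILE  -> const char **
 * Returns 0 on success, -1 for an unknown item.
 */
static int cache_query(const cache_object *obj, int info, ...)
{
    va_list ap;
    int rv = 0;

    assert(obj != NULL);

    va_start(ap, info);
    switch (info) {
    case CACHE_SIZE:
        *va_arg(ap, size_t *) = obj->size;
        break;
    case CACHE_HEADERS_FILE:
        *va_arg(ap, const char **) = obj->headers_file;
        break;
    default:
        rv = -1;
        break;
    }
    va_end(ap);
    return rv;
}
```

As with curl_easy_getinfo(), the compiler cannot check that the pointer 
type matches the item, so the mapping has to be documented carefully.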


Add a hook that gets called after store_body so that other modules can 
track cache usage.  The other modules could use the mythical 
ap_cache_query() I described above to update cache usage in a database, 
file, dbm, shared memory, etc.  A great use would be if this information 
was in a database (using mod_dbd, perhaps) so that an admin (or script) 
could selectively expire and purge cache entries.  This could also lead 
to a much more efficient version of htcacheclean that did not have to 
crawl the directory tree.
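
To illustrate the shape of such a hook, here is a standalone sketch using 
a plain callback table.  In httpd this would presumably be built on the 
server's own hook machinery instead.  All names here (stored_object, 
register_cache_stored(), run_cache_stored(), track_usage()) are 
hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STORED_HOOKS 8

/* What a listener is told about a freshly stored object. */
typedef struct {
    const char *key;
    size_t size;
} stored_object;

typedef void (*cache_stored_fn)(const stored_object *obj, void *ctx);

static struct {
    cache_stored_fn fn;
    void *ctx;
} stored_hooks[MAX_STORED_HOOKS];
static size_t stored_hook_count;

/* Register a listener; returns 0 on success, -1 if the table is full. */
static int register_cache_stored(cache_stored_fn fn, void *ctx)
{
    assert(fn != NULL);
    if (stored_hook_count >= MAX_STORED_HOOKS) {
        return -1;
    }
    stored_hooks[stored_hook_count].fn = fn;
    stored_hooks[stored_hook_count].ctx = ctx;
    stored_hook_count++;
    return 0;
}

/* Called by the cache once the body has been stored. */
static void run_cache_stored(const stored_object *obj)
{
    size_t i;
    for (i = 0; i < stored_hook_count; i++) {
        stored_hooks[i].fn(obj, stored_hooks[i].ctx);
    }
}

/* Example listener: keep a running total of bytes in the cache. */
static void track_usage(const stored_object *obj, void *ctx)
{
    *(size_t *)ctx += obj->size;
}
```

A listener that wrote key and size to a database instead of a counter 
would give a purge script something to query.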


Another idea from squid:

Pre-make all cache directories in mod_disk_cache (post_config or 
auxiliary script).  Eliminates some overhead and simplifies the code. 
Use directories like 0/0, 0/1, 0/2, ... z/y, z/z.
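
A small sketch of how an object could be mapped onto such a fixed layout, 
assuming a 36-character alphabet (0-9, a-z) and an already computed 
numeric hash of the cache key.  The function cache_dir_for_hash() is a 
hypothetical name:

```c
#include <assert.h>
#include <stdio.h>

/* Assumed alphabet: 36 names per level gives 36 * 36 = 1296 directories. */
static const char dir_chars[] = "0123456789abcdefghijklmnopqrstuvwxyz";
#define DIR_FANOUT 36

/*
 * Write the two-level directory ("0/0" .. "z/z") for a hash into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
static int cache_dir_for_hash(unsigned long hash, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, len, "%c/%c",
                 dir_chars[(hash / DIR_FANOUT) % DIR_FANOUT],
                 dir_chars[hash % DIR_FANOUT]);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because the set of directories is fixed, it can be created once up front 
and the store path never has to call mkdir.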

Also, have vary just regenerate a new key.  Making a .vary directory 
is ugly.
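
One way this could look, as a sketch only: the variant key is the base key 
plus the request's values for each header named in Vary.  The header 
struct, the "|name=value" encoding and the function make_vary_key() are 
all assumptions made for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

typedef struct {
    const char *name;
    const char *value;
} header;

/* Case-insensitive lookup; returns NULL if the header is absent. */
static const char *find_header(const header *hdrs, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcasecmp(hdrs[i].name, name) == 0) {
            return hdrs[i].value;
        }
    }
    return NULL;
}

/*
 * Build "<base_key>|<Name>=<value>..." for every header listed in Vary.
 * A missing request header contributes an empty value.
 * Returns 0 on success, -1 if out is too small.
 */
static int make_vary_key(const char *base_key,
                         const char *const *vary_names, size_t n_vary,
                         const header *req, size_t n_req,
                         char *out, size_t out_len)
{
    size_t used, i;
    int n;

    assert(base_key != NULL && out != NULL);

    n = snprintf(out, out_len, "%s", base_key);
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    used = (size_t)n;

    for (i = 0; i < n_vary; i++) {
        const char *v = find_header(req, n_req, vary_names[i]);
        n = snprintf(out + used, out_len - used, "|%s=%s",
                     vary_names[i], v ? v : "");
        if (n < 0 || (size_t)n >= out_len - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return 0;
}
```

The Vary list itself still has to be stored under the base key so a 
lookup knows which request headers to fold in, and the names should be 
put in a fixed order so equivalent requests produce the same key.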


Faster!  Some patches that gave us speed improvements:

only cache req_hdrs if vary:

speed up read_table:

Brian Akins
Lead Systems Engineer
CNN Internet Technologies
