Return-Path: owner-new-httpd
Received: by taz.hyperreal.com (8.6.10/8.6.5) id JAA20933; Thu, 6 Apr 1995 09:55:05 -0700
Received: from cass41 by taz.hyperreal.com (8.6.10/8.6.5) with SMTP id JAA20913; Thu, 6 Apr 1995 09:54:47 -0700
Received: from cass00.ast.cam.ac.uk by cass41 with smtp (Smail3.1.29.1 #9) id m0rwuos-00004cC; Thu, 6 Apr 95 17:54 BST
Received: by cass00.ast.cam.ac.uk (Smail3.1.29.1 #9) id m0rwuor-0006jHC; Thu, 6 Apr 95 17:54 BST
Message-Id:
Date: Thu, 6 Apr 95 17:54 BST
From: drtr@ast.cam.ac.uk (David Robinson)
To: new-httpd@hyperreal.com
Subject: Re: Harvest cache now available as an ``httpd accelerator'' (fwd)
Content-Length: 1646
Sender: owner-new-httpd@hyperreal.com
Precedence: bulk
Reply-To: new-httpd@hyperreal.com

Brian wrote:
> There are two possibilities I see here:
>
>1) Create a generalized caching system for any file that pays attention
>to everything it *needs* to - content negotiation, authentication,
>modification times, scripts, everything. This is a big undertaking, one
>that's difficult to get right and will mean the code base will get very
>big.
>
>2) Have an interface for manual selection of files to cache. I.e., the
>administrator specifies a list of files (or subdirectories, etc.) that
>they know are not susceptible to authentication or content negotiation or
>server-side includes, and when httpd fires up it loads those (few) pages
>into memory. This works on the principle that a sizeable percentage
>of web hits come from a limited number (10-20) of pages and inlined
>images. A stat() should still be run for every access to in-memory files
>since they could change or whatnot.

I would prefer 1, on the grounds of ease of administration. It need not be
as bad as you make it sound; you can always decline to cache the difficult
cases.

>The question for both is - is this really going to be faster on a machine
>with good I/O characteristics and large disk caches? All we're really
>saving by using an in-memory cache is a read from disk - is this really a
>sizeable percentage of anyone's gprof output? If not (and I'd be
>surprised if it was, actually) I would think our time would be better
>spent in other performance improvements like the NCSA1.4-style hunt group
>daemon model....

No, it's all the CPU processing and file stat'ing that you save.

Some profiling is probably needed first.

David.
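
A minimal sketch of the preload-and-stat scheme described in option 2 above,
for illustration only: the file list, structure, and function names here are
hypothetical and are not taken from NCSA httpd or any proposed patch. The idea
is simply to read the administrator's list of hot files into memory at startup
and to stat() each one on every access, reloading it if the modification time
has changed.

/* Hypothetical sketch of option 2: preload files, stat() per access. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>

struct cached_file {
    const char *path;   /* file the administrator asked to preload */
    char *data;         /* in-memory copy of the file contents */
    size_t len;         /* number of bytes in data */
    time_t mtime;       /* modification time when the copy was made */
};

/* Read the whole file into memory; returns 0 on success. */
static int load_file(struct cached_file *cf)
{
    struct stat st;
    FILE *fp;

    if (stat(cf->path, &st) != 0)
        return -1;
    fp = fopen(cf->path, "rb");
    if (fp == NULL)
        return -1;
    free(cf->data);
    cf->data = malloc(st.st_size);
    if (cf->data == NULL) {
        fclose(fp);
        return -1;
    }
    cf->len = fread(cf->data, 1, st.st_size, fp);
    cf->mtime = st.st_mtime;
    fclose(fp);
    return 0;
}

/*
 * Per-request lookup: stat() the file and serve the in-memory copy only
 * if the modification time is unchanged; otherwise reload it first.
 */
static const char *serve_cached(struct cached_file *cf, size_t *len)
{
    struct stat st;

    if (stat(cf->path, &st) != 0)
        return NULL;                 /* file vanished; fall through to 404 */
    if (st.st_mtime != cf->mtime && load_file(cf) != 0)
        return NULL;                 /* changed but could not be re-read */
    *len = cf->len;
    return cf->data;
}

int main(void)
{
    /* The administrator's list of "hot" pages, loaded once at startup. */
    struct cached_file cache[] = {
        { "/usr/local/etc/httpd/htdocs/index.html", NULL, 0, 0 },
    };
    size_t i, len;

    for (i = 0; i < sizeof(cache) / sizeof(cache[0]); i++)
        if (load_file(&cache[i]) != 0)
            fprintf(stderr, "could not preload %s\n", cache[i].path);

    if (serve_cached(&cache[0], &len) != NULL)
        printf("served %lu bytes from memory\n", (unsigned long)len);
    return 0;
}

Note that even this sketch still pays one stat() per hit, which is exactly the
cost David's reply points at; only the read() and the response assembly for
unchanged files are avoided.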