httpd-dev mailing list archives

From Ian Holsman <>
Subject Re: mod-cache-requestor plan
Date Thu, 14 Jul 2005 13:59:11 GMT
Akins, Brian wrote:
> On 7/13/05 6:41 PM, "Ian Holsman" <> wrote:
>>a pool of threads read the queue and start fetching the content, and
>>re-filling the cache with fresh responses.
> How is this better than simply having an external cron job to fetch the
> urls?  You have total control of throttling there and it doesn't muck up the
> cache code.

> A good idea may be to have a "cache store hook" that gets called after a
> cache object is stored.  In it, another module could keep track of cached
> URLs.  This list could be fed to the above cron job.  I know one big web
> site that may do it in a similar way...

that wouldn't keep track of the popularity of a given url, only of when
it was stored. I'm guessing the popularity of a news story on CNN
depends mostly on whether it is linked off one of the doors or
has just been published.

other large sites (product reviews, for example) get most of their
traffic indirectly via searches rather than directly from a link on a
door, and have traffic patterns more like a Zipf distribution. The
priority re-fetch would make sure the popular pages are always in cache,
while the others are allowed to expire.
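
A rough sketch of what "priority re-fetch" could look like, purely for
illustration. This is not mod_cache code: the RefetchQueue class and its
record_hit / next_batch methods are made-up names, and it assumes something
else hands us the list of urls that are about to expire. It only shows the
ordering idea: count hits per url, then re-fetch the most popular ones first.

```python
import heapq
from collections import Counter


class RefetchQueue:
    """Hypothetical popularity-ordered re-fetch queue (not an httpd API)."""

    def __init__(self):
        # Hit count per url; a Counter returns 0 for urls it has never seen.
        self.hits = Counter()

    def record_hit(self, url):
        """Call once per request served for this url."""
        self.hits[url] += 1

    def next_batch(self, expiring, n):
        """Return up to n of the expiring urls, most popular first."""
        return heapq.nlargest(n, expiring, key=lambda u: self.hits[u])


q = RefetchQueue()
for url in ["/a", "/a", "/a", "/b", "/c", "/c"]:
    q.record_hit(url)

# Only two re-fetch slots available: the least popular url is left to expire.
print(q.next_batch(["/a", "/b", "/c"], 2))  # ['/a', '/c']
```

The batch size n is where the throttling would live; a cron job fed only
by a store hook has no hit counts, so it could not make this choice.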

BTW, I'm not saying it's better, I'm just saying it's different, and
news sites aren't the only large sites in town that need caches.

