httpd-dev mailing list archives

From Davi Arnaut
Subject Re: mod_disk_cache summarization
Date Tue, 24 Oct 2006 15:32:29 GMT
Graham Leggett wrote:
> On Tue, October 24, 2006 3:46 pm, Joe Orton wrote:
>> That's not the point - the scary complexity of all this is that it's
>> become a multi-process synchronisation problem - what do you do when the
>> writing process SIGSEGVs or hangs?  You're left with N processes hanging
>> around indefinitely waiting for data.
> Why indefinitely? The same as if the origin server disappears or hangs -
> you time out.
>> Or similarly, how do you tell the
>> difference between a stale temp file left by a previous process and an
>> actively-being-written one?
> At worst a force reload will invalidate the entry - standard procedure for
> current caches.
> At best if a follow process suspects the writer has gone away, try and
> open the cached file with an exclusive write access - if that succeeds,
> the writer has gone away, invalidate the entry. Wire this up to the
> timeout above.

I've been thinking about this lately. Suppose:

1) we have two cache file extensions, one for fully cached entities
(.cache) and other for transient (being cached) entities (.transient)
2) we store the headers with the cache file as an extended attribute
3) we write an extended attribute identifying the master thread (the one
that is downloading the entity)
4) we have a file events notification mechanism
5) every cache entity has a 32-bit cache-id

This is somewhat overcomplicated, but it _might_ work. Cache hits will see
the .cache file and serve the content normally (sendfile, mmap, etc.)
and can update the headers if necessary ("http-headers" file attribute).

A cache miss process would try to atomically create the .transient file:

If successful, it should write an "http-master" attribute containing
its own scoreboard id and the "http-headers" attribute containing the
headers, then start downloading/writing the body.

If unsuccessful, it should open the file and install file notification
events for rename, write and delete.
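
The atomic create step could be sketched roughly as below. This assumes
POSIX O_CREAT|O_EXCL semantics; claim_transient and the path layout are
hypothetical names for illustration, not existing mod_disk_cache code.

```python
import os

def claim_transient(base_path):
    """Try to become the master for a cache entry (hypothetical helper).

    Returns (fd, True) if this process created base_path + ".transient",
    or (fd, False) if the file already existed and was opened read-only
    so the caller can act as a follower.
    """
    path = base_path + ".transient"
    try:
        # O_EXCL makes the create fail if the file already exists, so
        # only one process can win the race to become the master.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        return fd, True
    except FileExistsError:
        # Another process is the master: open read-only and follow it.
        fd = os.open(path, os.O_RDONLY)
        return fd, False
```

The follower's open can still fail if the master renames or deletes the
file in between, so a caller would have to be prepared to retry the
lookup from the start.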

File events:

rename - signal to other threads/processes that the entity is completed
(or fully cached). It is signaled by the master process renaming the
file from the .transient extension to the .cache extension.

write - signal to other threads/processes that there is new data on the body.

delete - signal to threads/processes that the content has already expired
and should not be renamed.
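
To make the three events concrete, here is a minimal sketch that uses
polling as a stand-in for a real notification mechanism (inotify, kqueue
or similar). publish and poll_event are hypothetical names, and the
mapping from file state to event is my assumption of how it could work.

```python
import os

def publish(base_path):
    """Master side: mark the entity as fully cached.

    rename() within one filesystem is atomic on POSIX, so the entry
    moves from the .transient name to the .cache name in one step.
    """
    os.rename(base_path + ".transient", base_path + ".cache")

def poll_event(base_path, last_size):
    """Follower side: return (event, size) since the last poll.

    event is "write", "rename", "delete" or None. The .transient file
    is checked first, so a rename that happens between the two checks
    is still reported as "rename" and not as "delete".
    """
    try:
        size = os.stat(base_path + ".transient").st_size
    except FileNotFoundError:
        # The transient name is gone: it was either renamed or deleted.
        if os.path.exists(base_path + ".cache"):
            return "rename", last_size
        return "delete", last_size
    if size > last_size:
        return "write", size  # new body data is available
    return None, size
```

A follower would call poll_event in a loop with a timeout, serving new
body bytes on "write", switching to the .cache file on "rename" and
giving up on "delete".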


http-master - is for checking whether a live thread or process is still
downloading the file. This would catch SIGSEGVed processes by keeping
tuples containing the process id and the cache-id on the scoreboard.
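
The liveness check could look something like the sketch below. The dict
is only a hypothetical stand-in for the scoreboard (the real one lives
in shared memory), and register_master and master_alive are made-up
names. It assumes a POSIX system, where signal 0 tests for process
existence without delivering a signal.

```python
import os

# Hypothetical stand-in for the scoreboard: cache-id -> pid of the master.
scoreboard = {}

def register_master(cache_id, pid=None):
    """Record that pid (default: this process) is downloading cache_id."""
    scoreboard[cache_id & 0xFFFFFFFF] = os.getpid() if pid is None else pid

def master_alive(cache_id):
    """Return True if the recorded master for cache_id still exists."""
    pid = scoreboard.get(cache_id & 0xFFFFFFFF)  # cache-id is 32 bits
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return False  # no such process: the master has gone away
    except PermissionError:
        return True  # process exists but is owned by another user
    return True
```

This only detects a dead process, not a hung one, so it would still
need to be combined with the timeout Graham mentioned. Pid reuse is
also possible, which is one reason to store the cache-id next to the
pid.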

http-headers - is used in order to have a single cache file even if the
headers grow or shrink.

Comments ?

Davi Arnaut
