httpd-dev mailing list archives

From Brian Pane <>
Subject Re: [PATCH] mod_cache: support caching of streamed responses
Date Mon, 02 Sep 2002 21:19:30 GMT
Graham Leggett wrote:

> Brian Pane wrote:
>> In the "shadowing" case, we'd also need a way for all the requests
>> reading from the same incomplete cache entry to block if they reach
>> the end of the incomplete cached brigade without seeing EOS.  I guess
>> we could do this by adding a condition variable to every incomplete
>> entry in the cache, so that the threads shadowing the request could
>> block if they'd sent all the data available so far.  And then the
>> thread that was actually retrieving the resource would signal the
>> condition variable each time it added some more data to the cache
>> entry.
>> But that's way too complicated. :-)
>> What do you think about the following as a low-tech solution:
>> * Keep the current model of only putting complete responses in
>>  the cache (at least for now).
> This was exactly how the old cache worked - which would mean we just 
> rewrote the cache to have exactly the same design flaws as the old 
> cache did, which is a big waste of time.

Sure, by that metric, it's a waste of time: the current code can't even
cache responses that arrive in multiple brigades, which is a prerequisite
for shadowing incomplete responses.  But so what?  The cache is very much a
work in progress; we should judge it by where the code is going, not where
it is today.

> This problem causes an annoying race condition: when a cached object 
> expires, for a short time from expiry to cache-complete all requests 
> go through to the backend. This causes load spikes, which on expensive 
> backends can be really annoying (it has been reported so in the past).

For the expiration case, there's a much easier solution than shadowing the
incomplete response.  Add a new state for cache entries: "being_updated."
When you get a request for a cached object that's past its expiration date,
set the cache entry's state to "being_updated" and start retrieving the new
content.  Meanwhile, as other threads handle requests for the same object,
they check the state of the cache entry and, because it's currently being
updated, they deliver the old copy from the cache rather than dispatching
yet another request to a backend that is already busy producing a fresh
copy of the same resource.  As long as the thread that's getting the new
content can replace the old content with the new content atomically, there's
no reason to make any other threads wait for the new content.

> A key design feature of the new cache was to allow "shadowing" to be 
> possible, ie a partially cached response would be servable by other 
> cache threads/processes, thus solving this problem.

Shadowing is an important feature to add (although the technique
described above may reduce the need for it somewhat).  But:

  * It's going to take a while to make the shadowing work
    (due to all the race conditions that have to be addressed).

  * In the meantime, I need a solution for caching responses
    that don't arrive all in a single brigade.  I'd like to
    get this added now, and not wait for the shadowing support.
    The code for supporting caching of streamed responses
    will give us a foundation for shadowing in the future.

