httpd-dev mailing list archives

From Brian Pane <bri...@apache.org>
Subject Re: [PATCH] mod_cache: support caching of streamed responses
Date Mon, 02 Sep 2002 20:24:59 GMT
Graham Leggett wrote:

> Bill Stoddard wrote:
>
>> Adding a callback/hooks sounds okay.  An additional problem to
>> consider is when a frequently requested resource does not return an
>> EOS in the first brigade and it turns out to be too large for the
>> cache.  We need a negative cache (what else would you call it?) to
>> prevent mod_cache from attempting to cache the same object over and
>> over.  If the object is too large, create a negative cache entry.
>> Consult the negative cache when the response is received to
>> determine if mod_cache should even attempt caching.
>
>
> In the original design (not sure what it is now) the memory cache
> code set aside a whole lot of brigades, and each brigade was simply
> added to the cached brigade set.
>
> In order to ensure there was enough memory, the original design
> limited memory caching to requests with a Content-Length header,
> i.e. a predetermined length.  Thinking about this some more, this
> isn't even necessary, as the cache could simply keep adding brigades
> to the cache, and should the length of the brigades cached so far
> exceed a certain value, it could simply toss away the entire
> set-aside set of brigades and remove the cache_in filter, thus
> ditching the cache attempt.
>
> The only problem with this is with requests that are "shadowing" this
> request, i.e. a subsequent request for a resource that is already
> half in the cache and still being fetched.  If the leading thread
> ditched the caching attempt, the shadowing threads might get confused
> as the data they are sending just got ripped out from under them.
> What might be possible is for some intelligence to be built in to
> only throw away buckets that are not due to be shadowed.  All in all,
> though, it is still easier to mandate that only objects with a
> predetermined length can be memory cached.


In the "shawowing" case, we'd also need a way for all the requests
reading from the same incomplete cache entry to block if they reach
the end of the incomplete cached brigade without seeing EOS.  I guess
we could do this by adding a condition variable to every incomplete
entry in the cache, so that the threads shadowing the request could
block if they'd sent all the data available so far.  And then the
thread that was actually retrieving the resource would signal the
condition variable each time it added some more data to the cache
entry.
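
Just to make that idea concrete, here's a minimal sketch using APR
condition variables.  The cache_entry struct, its fields, and the two
helpers are all invented for illustration; they're not existing
mod_cache code:

#include "apr.h"
#include "apr_thread_mutex.h"
#include "apr_thread_cond.h"

/* Hypothetical per-entry synchronization for "shadowing" readers. */
typedef struct cache_entry {
    apr_thread_mutex_t *lock;
    apr_thread_cond_t  *more_data;       /* signaled on each append */
    apr_size_t          bytes_available; /* bytes cached so far */
    int                 complete;        /* set once EOS is cached */
} cache_entry;

/* A shadowing thread that has already sent bytes_sent bytes calls
 * this to block until the leading thread adds more data (or EOS). */
static void wait_for_more(cache_entry *e, apr_size_t bytes_sent)
{
    apr_thread_mutex_lock(e->lock);
    while (e->bytes_available <= bytes_sent && !e->complete) {
        apr_thread_cond_wait(e->more_data, e->lock);
    }
    apr_thread_mutex_unlock(e->lock);
}

/* The leading thread calls this each time it appends data. */
static void append_notify(cache_entry *e, apr_size_t len, int saw_eos)
{
    apr_thread_mutex_lock(e->lock);
    e->bytes_available += len;
    if (saw_eos) {
        e->complete = 1;
    }
    apr_thread_cond_broadcast(e->more_data);
    apr_thread_mutex_unlock(e->lock);
}

Every incomplete entry would need its own mutex and condition
variable, plus cleanup for the case where the leading thread gives up
mid-stream.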

But that's way too complicated. :-)

What do you think about the following as a low-tech solution:

* Keep the current model of only putting complete responses in
  the cache (at least for now).

* Add a new config parameter to mod_cache:
    CacheMaxStreamingBuffer <bytes>
  This would set the maximum amount of content that mod_cache
  would set aside on a streaming request (one with no
  Content-Length) in anticipation of being able to cache it.
  The default would be zero.  For general use, it could be
  tuned to a smaller value than the max object size for the
  mem_cache or disk_cache.  E.g., if the server is using a
  mem_cache with a 2MB max object size, and a proxied response
  produces a 2.1MB file, buffering and then discarding the
  first 2MB on each request would be bad.  But if one could set
  the limit on buffering for streamed responses to a more
  reasonable number, like 100KB, then buffering and discarding
  that much data wouldn't be much of a problem.  (A rough
  sketch of the filter-side check follows this list.)
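
Here's a rough sketch of how the CACHE_IN filter might enforce that
limit.  The helper name, its arguments, and the way the saved-so-far
byte count is tracked are all made up for illustration; only the
APR/filter calls are real:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* Hypothetical helper: decide whether to keep buffering a streamed
 * (no Content-Length) response, or give up and get out of the way.
 * max_streaming_buffer would come from CacheMaxStreamingBuffer. */
static apr_status_t check_streaming_limit(ap_filter_t *f,
                                          apr_bucket_brigade *bb,
                                          apr_bucket_brigade *saved,
                                          apr_off_t *bytes_saved,
                                          apr_off_t max_streaming_buffer)
{
    apr_off_t len = 0;

    /* Count the bytes in the brigade we were just handed. */
    apr_brigade_length(bb, 1, &len);

    if (*bytes_saved + len > max_streaming_buffer) {
        /* Over the limit: toss everything set aside so far, stop
         * trying to cache this response, and just pass the data on. */
        apr_brigade_cleanup(saved);
        ap_remove_output_filter(f);
        return ap_pass_brigade(f->next, bb);
    }

    *bytes_saved += len;
    /* (The caller would go on to set aside a copy of bb in 'saved'
     * and then pass bb down the filter chain as usual.) */
    return APR_SUCCESS;
}

This is essentially the toss-and-bail path Graham described, gated by
the new directive's value instead of requiring a Content-Length up
front.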

I think this new parameter to limit the amount of setaside
data would be complementary to a negative cache: the negative
cache reduces the probability that mod_cache will buffer up
a streamed response that can't be cached, while the buffering
limit reduces the cost of buffering and discarding a noncacheable
streamed response that wasn't caught by the negative cache.
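
For completeness, the negative cache itself could start out as
something very simple, say a table of URIs already known to be too
large.  This is only a sketch with invented names; a real version
would need locking, expiry, and probably a smarter key than the bare
URI:

#include "apr_hash.h"
#include "apr_strings.h"

/* Hypothetical negative cache: URIs we already tried to cache and
 * found too large.  No locking or expiry here; illustration only. */
static apr_hash_t *too_big_uris;

static void negative_cache_init(apr_pool_t *p)
{
    too_big_uris = apr_hash_make(p);
}

/* Record that this URI blew past the size limit.  p must be a pool
 * that outlives the request (e.g. the one used in init). */
static void negative_cache_add(apr_pool_t *p, const char *uri)
{
    apr_hash_set(too_big_uris, apr_pstrdup(p, uri),
                 APR_HASH_KEY_STRING, "1");
}

/* Ask before setting anything aside: did we already give up once? */
static int negative_cache_contains(const char *uri)
{
    return apr_hash_get(too_big_uris, uri, APR_HASH_KEY_STRING) != NULL;
}

mod_cache would check negative_cache_contains() before deciding to
buffer a streamed response, and call negative_cache_add() from the
same code path that gives up once the CacheMaxStreamingBuffer limit
is exceeded.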

Brian


