cocoon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefano Mazzocchi <>
Subject Re: [RT] Adaptive Caching
Date Thu, 17 Jul 2003 22:03:30 GMT

On Thursday, Jul 17, 2003, at 12:48 America/Guayaquil, Berin Loritsch 

> Berin Loritsch wrote:
>> The concept translates over to the adaptive cache approach at almost 
>> 1:1.
>> The actual individual cost sample is not important.  It is the 
>> windowed average
>> that is important.  For our cache a simple mean would best suit our 
>> problem
>> space.  The size of the window for our samples will control how 
>> responsive
>> our cache control algorithm is.  This would affect your cache 
>> prediction
>> algorithm as the value k is no longer needed to control 
>> responsiveness. The
>> value k can be set at 0, 1, or 0.5 depending on how you want to 
>> affect the
>> slope.  IMO, the window size should be the determining factor at how 
>> the average
>> cost for a resource will converge toward the optimal point.  Large 
>> window sizes
>> provide slower response, while a smaller window provides faster 
>> response.
> BTW.  If we use "0" for the value of "k", then the cache prediction 
> algorithm
> as is defined by your "discriminate caching" sequence would be a purely
> random result.  The c(eff) function would solve down to the constant 
> 0.5,
> and since we discriminate based on a randomly generated number being 
> less
> than the result of the function, we have a 50/50 chance 
> (theoretically) that
> the random number would be greater than 0.5.  I say theoretcially 
> because
> not all random number generators are truly random, or even mostly 
> random.
> If we use "1" for the value of "k", then the prediction algorithm 
> follows
> the efficiency value much better.  There is a higher chance of having a
> higher number for the cache prediction, which means we will be 
> evaluating
> more entries.  In fact, as the value for "eff" reaches +/-38, we will 
> always
> be caching because the result of the equation will always be 1.
> If we use "0.5" for the value of "k", then the prediction algorithm 
> still
> follows the efficiency value, but our range is now to +/-76.

You analysis is fully correct.

> The question then is, where statistically will this efficiency 
> algorithm be all the time?

The idea behind such a design is that if a resource has been efficient 
(remember that efficiency in this context means "how much costs we 
saved so far on that resource") the chance of avoiding caching should 
be very small because it's very likely that the non-cached version will 
be much less efficient. Of course, the same is true for negative 
efficiencies (means that the cache is inefficient on those resources).

The important concept is that even if the cache is very efficient on 
that resource (thus efficiency is positive and high), we will be 
sampling the other side of the fence to understand if its behavior has 

Imagine the peak-hour case: the efficiency for a particular resource 
is, for example -50 (means that by "not caching" we saved 50 cost units 
[the dimension of which depends on the cost function]), this gives us, 
roughly a 99% probability of avoid caching that resource. But on that 
one request, we find out that caching effiency for that single resource 
was +5.

This changes the overall efficiency to -45, thus increasing the chance 
of caching to 1.5% (I'm making up the numbers but you get my point). 
The next cached request, another +5 cost units are saved, turning into 
-40 and the caching change to 2%.

If the caching trend continues, the efficiency will eventually swing 
positive, changing the caching behavior adaptively.

If the caching trend was just occasional, this will be simply a 
perturbation that will be absorbed by the adaptivity of the system.

The hard thing to sell in this algorithm is that even if you *know* 
that you are going to be inefficient on a particular resource, you are 
going to be, in order to obtain information on *whether* of not your 
prediction was right.

This tradeoff seems like a sin to the eyes of many and I believe that 
it is exactly this vision that prevented truly adaptive caches to be 

But, of course, I might be completely wrong.


View raw message