Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cocoon-dev@xml.apache.org
Message-ID: <3C1BEB15.2EDDA21A@apache.org>
Date: Sun, 16 Dec 2001 01:30:13 +0100
From: Stefano Mazzocchi <stefano@apache.org>
MIME-Version: 1.0
To: cocoon-dev@xml.apache.org
Subject: Re: Adaptive Caching [was Re: initial checkin of the Scheme code]
References: <LDEKKGCNJLEPFDOEIHFDAELOCKAA.g-froehlich@gmx.de>
 <3C1A41F0.5050509@apache.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Berin Loritsch wrote:

> However, what I see as more practical is that the specific files cached
> might change by the time of day.  For instance, web administrators usually
> have log analyzers that will easily determine which parts of the site get
> hit heaviest at which time of day.  This is very important.  For instance,
> we may want the cache to artificially lengthen the ergodic period for specific
> resources during their peak load in order to scale more gracefully as more
> users view those resources.

hmmm...

> In effect, this is similar to the approach that Slashdot uses in its server
> clustering.  They have a couple of dynamic servers with full-fledged capability,
> however the bulk of the site's use is reading the front page or possibly the
> extended story and comments (most users do not post).  This allows the
> Slashdot team to have a few servers that only have static pages--that are
> updated every 20-30 minutes.  That is quite an ergodic period for a news
> site.  As load increases to the point that the dynamic servers cannot sustain,
> additional static page servers are brought into the cluster.

Ah, ok, good example.

> This is a macro scale of what I was referring to, but it is similar to the
> concept of dropping packets to allow the server to slowly degrade as load
> increases instead of coming to a screeching halt.

Yes, good parallel.

> Such an adaptive cache would
> be _more_ concerned with slowly degrading performance by serving stale data
> rather than ensuring the data retrieved from the cache is the most up to date.

Good point.

I see two ways of achieving this transparently:

 1) tune your Cacheable behavior for those resources that need this
 2) add a site-wide Cacheable behavior that is summed with the other
    Cacheable events

Of course, we need a way to measure *machine load* from Java that is both
fast and reliable (probably a sampling thread that asks for 'uptime' or
similar and stores the result someplace?)
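
A minimal sketch of such a sampling thread, assuming a JDK that ships
java.lang.management and java.util.concurrent. LoadSampler and its method
names are hypothetical, not part of Cocoon. It reads the OS load average
through the JVM instead of spawning 'uptime'; a negative value means the
load is not known (some platforms do not report it).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.DoubleSupplier;

/** Hypothetical helper: samples machine load in the background and keeps the last value. */
final class LoadSampler implements AutoCloseable {
    private final DoubleSupplier source;
    private final ScheduledExecutorService timer;
    // Negative means "not known yet" or "not available on this platform".
    private volatile double lastLoad = -1.0;

    LoadSampler(DoubleSupplier source, long periodMillis) {
        this.source = source;
        // Daemon thread, so the sampler never keeps the JVM alive on its own.
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "load-sampler");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(this::sample, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Samples the system load average reported by the JVM. */
    static LoadSampler forSystem(long periodMillis) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return new LoadSampler(os::getSystemLoadAverage, periodMillis);
    }

    /** Takes one sample; a failing source leaves the previous value in place. */
    void sample() {
        try {
            lastLoad = source.getAsDouble();
        } catch (RuntimeException e) {
            // Swallowed on purpose: an exception would cancel the periodic task.
        }
    }

    /** Cheap read for request threads: no system call, just a volatile read. */
    double currentLoad() {
        return lastLoad;
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

Request threads only ever call currentLoad(), so the cost of measuring
load stays off the request path, which is the "fast" half of the
requirement. How reliable the number is depends on the platform.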
 
> This would be an adaptive cache that *could* possibly have pluggable or adaptable
> policies regarding stale data.  For instance, a site like Slashdot can get away
> with marginally stale data as the average user is not constantly using it
> in day to day work.  However, such a policy would be very detrimental to a
> corporate web site that managed the daily workflow of information that is
> core to the company's needs.  It is also detrimental to web sites that require
> that users cannot see each other's data, where privacy and security are
> chief concerns.
> 
> However, for sites like what D-Haven (if I get the time) is supposed to be, stale
> data would be acceptable because the content would have a naturally longer
> ergodic period to begin with.

If you already know your site is not going to change for a given period
of time, well, batch generate the site and be happy (like we do for
*.apache.org). If parts are dynamic and parts are not, use both. If
everything is dynamic (yes, there are cases where *everything* needs to
be dynamic), you have to tune your cache.

By having pluggable Cacheable logic, you can even say that, if load is
higher than a certain amount, the cache is still valid, no matter what
happened to the real data.
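
A rough sketch of what such pluggable logic could look like. The Validity
interface below is a hypothetical stand-in for whatever check decides that
a cached entry is still fresh; it is not the actual Cacheable contract.
The load source and the threshold are assumptions as well.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical stand-in for the check that decides if a cached entry is still fresh. */
interface Validity {
    boolean isValid();
}

/** Wraps another check and declares the entry valid whenever load is above a threshold. */
final class LoadAwareValidity implements Validity {
    private final Validity delegate;
    private final DoubleSupplier load;
    private final double threshold;

    LoadAwareValidity(Validity delegate, DoubleSupplier load, double threshold) {
        this.delegate = delegate;
        this.load = load;
        this.threshold = threshold;
    }

    @Override
    public boolean isValid() {
        // Under heavy load, serve what we have, no matter what happened to the real data.
        if (load.getAsDouble() > threshold) {
            return true;
        }
        // Otherwise defer to the resource's normal validity check.
        return delegate.isValid();
    }
}
```

Because it wraps an existing check, the same class can be applied to a
single resource or to every resource site-wide. An unknown (negative)
load never exceeds the threshold, so it falls through to the normal check.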

The IoC cache design allows you to do what you want, and the adaptive
caching can even turn caching off for those resources where even 'trying
to cache' is more expensive than not trying at all.
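
To make the "turn caching off" decision concrete, here is a deliberately
simplified sketch. It assumes the cost of a request can be reduced to
elapsed time and compares plain averages; CacheEfficiency and its methods
are hypothetical names, and a real cost function would weigh more than
time.

```java
/** Hypothetical per-resource bookkeeping: is going through the cache paying off? */
final class CacheEfficiency {
    private long uncachedNanos;    // total time spent producing the resource with no cache
    private long uncachedRequests;
    private long cachedNanos;      // total time spent serving through the cache (hits and misses)
    private long cachedRequests;

    synchronized void recordUncached(long nanos) {
        uncachedNanos += nanos;
        uncachedRequests++;
    }

    synchronized void recordCached(long nanos) {
        cachedNanos += nanos;
        cachedRequests++;
    }

    /** True while the cached path is cheaper on average, or while there is no data to compare. */
    synchronized boolean worthCaching() {
        if (uncachedRequests == 0 || cachedRequests == 0) {
            return true; // not enough samples on one side: keep trying to cache
        }
        double avgUncached = (double) uncachedNanos / uncachedRequests;
        double avgCached = (double) cachedNanos / cachedRequests;
        return avgCached < avgUncached;
    }
}
```

Once caching is off for a resource, the cached path stops producing
samples, so something has to route the occasional request through the
cache again, otherwise the decision can never be revisited.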

But even adaptive caching will not remove the need to optimize and
design a system in order to scale, no matter how smart it is.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------

