cocoon-dev mailing list archives

From "Carsten Ziegeler" <>
Subject AW: [C2]: Proposal for caching
Date Wed, 24 Jan 2001 07:24:49 GMT
> Sergio Carvalho wrote:
> > 
> > 1. Caching should not take place after each stage in the pipeline.
> +1. 
> Caching should take place at the end of every simple pipeline. 
> Content aggregation causes the aggregation of several simple 
> pipelines, and each of these must have a separate cache, or else 
> the invalidation of one of these invalidates all the others -- 
> Not Good(TM)
> > 2. The most static part of the pipeline should be cached.
> ?!? The end result should be cached. 
If only the end result is cached, many pipelines would never be cached,
as the pipeline is only "static" up to a specific point.
Imagine a pipeline with a generator, four transformers and one serializer
at the end. The generator reads a static file, the first three transformers
do "static" transformations and only the fourth is dynamic.
Then it would be possible to cache the result of the generator and the
three transformations.
This approach is necessary for content aggregation, as there is no serializer
at the end of each pipeline.
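
A minimal sketch of how the cache point in the example above could be located, assuming each component simply reports whether it is cacheable. The Stage and CachePoint names are hypothetical and are not part of the Cocoon 2 API:

```java
import java.util.List;

// Hypothetical stand-in for a pipeline component (generator, transformer
// or serializer). Not the real Cocoon 2 API.
final class Stage {
    final String name;
    final boolean cacheable;

    Stage(String name, boolean cacheable) {
        this.name = name;
        this.cacheable = cacheable;
    }
}

final class CachePoint {
    // Counts the leading cacheable stages, i.e. returns the index of the
    // first stage that is not cacheable. The output of the stage just
    // before that index is what would be stored in the cache.
    static int cacheablePrefixLength(List<Stage> pipeline) {
        int count = 0;
        for (Stage stage : pipeline) {
            if (!stage.cacheable) {
                break;
            }
            count++;
        }
        return count;
    }
}
```

For the pipeline described above (static generator, three static transformers, one dynamic transformer, serializer) this yields 4: the output of the third transformer is cached, and only the fourth transformer and the serializer run on every request.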
> > 3. The logic for caching is either in the ResourcePipeline or "above"
> The logic that decides if the result has changed should be on the 
> producer(s) and should be overridable.
Producer? Do you mean generator (the C2 equivalent of a producer)?
I didn't mean the logic for hasChanged, but the logic which calls all
hasChanged and all the other methods required.
> > 4. If a component is cacheable it should implement an interface Cacheable.
> > 5. Caching is done until the first component in the pipeline is not Cacheable.
> Rephrasing: Caching is done if every component in the pipeline is 
> Cacheable. Remember, no caching on each stage.
OK, I didn't mean caching on each stage; see point 2 for an example.
The caching is not done at each stage, only at one specific stage in
the pipeline. Up to this stage the content is "more" static than
in the rest of the pipeline.
But this stage is not necessarily the last stage after the serializer.
> > 6. Identifying if the cache contains the response (or the most static part
> >    of it) can be done by a unique key. The Cacheable interface implements
> >    a getKey() method. The keys of all Cacheable components are chained together
> >    and build the unique key for the cache.
> -1. If Stuart is right, and transformers' output depends only on 
> their input, then you just need to check the inputs, and you just 
> need to cache the final output.
OK, in some way a transformer's output depends only on its input.
But what is the input of a transformer? On the one hand there is the XML
stream, which comes from the previous component in the pipeline.
But the SQL Transformer, for example, also fetches data from a database.
The XML stream specifying which table and rows to fetch is static and
never changes. But the result of the fetch changes over time as the data
in the database changes.
So this "input" can be highly dynamic.
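
For reference, the key chaining from point 6 might look roughly like the sketch below. KeyedComponent and PipelineKey are hypothetical names, and the length-prefixed encoding is only an assumption to keep chained keys unambiguous; the proposal does not specify a format:

```java
import java.util.List;

// Hypothetical slice of the proposed Cacheable interface: only getKey().
interface KeyedComponent {
    String getKey();
}

final class PipelineKey {
    // Chains the keys of all components, in pipeline order, into one
    // cache key. Each key is prefixed with its length so that the keys
    // ("ab", "c") and ("a", "bc") cannot produce the same chained key.
    static String chain(List<? extends KeyedComponent> components) {
        StringBuilder chained = new StringBuilder();
        for (KeyedComponent component : components) {
            String key = component.getKey();
            chained.append(key.length()).append(':').append(key).append(';');
        }
        return chained.toString();
    }
}
```

Note that such a key only identifies which components and parameters produced the content. It says nothing about whether a database behind the SQL Transformer has changed, which is why the validators of points 7 and 8 are needed in addition.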
> > 7. Cacheable also delivers a Validator object (getValidator()) which contains all
> >    information required for this component to test if the content has changed
> >    since the last caching.
> >    The FileGenerator e.g. puts the last modification date into the Validator.
> >    The Cache stores all Validators together with the unique key.
> +1. I'd like to see a setValidator also, so that the Validator 
> can be overridden.
> > 8. All Cacheable components are now asked hasChanged(Validator). If all
> >    respond with false, the cache value is valid and is used.
> >    If the first responds with true, a new response is generated, the validators
> >    are fetched and the result is put in the cache.
> -1. See 6. Only the inputs (generators) must be checked.
I disagree, see 6. and also the RT for caching:
and some of the mails from the XSP Generator task:

> > 
> > This approach is very flexible. Each component knows if it is cacheable and when
> > it is invalid. Using the Validator object it is possible to specify something
> > like: This component is valid for 6 hours whether it changes or not.
> Please, allow the Validator to be overridden. I'm imagining 
> situations like an HTTPGenerator where the Generator doesn't know 
> if it is valid and can only safely assume it is not. In these 
> cases, there are situations where I'd like to override this 
> default behaviour.
OK, can you please explain who exactly will override this behaviour?
I could imagine a configuration for the HTTPGenerator which
configures the way the Validator is built. E.g. a simple approach would
be that the HTTPGenerator only reads the input once and assumes that
it is static. A configuration (or better a parameter in the pipeline)
could tell the HTTPGenerator that a specific HTTP request is not
static, but changes every 3 hours, so it could look like:
    <map:generate src="http://myserver/resource.xml" type="http">
        <parameter name="content_validation_period" value="3 hours"/>
    </map:generate>

The validator object would contain the last time the HTTP request was
made, check whether the 3 hours have expired since then, and answer
hasChanged() accordingly.
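
A minimal sketch of such a time-based validator, assuming the validity period has already been parsed from the pipeline parameter into milliseconds. ExpiryValidator is a hypothetical name, and passing the current time as an argument (instead of reading the system clock) is an assumption made to keep the example testable:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical validator, not the Cocoon 2 API: remembers when the
// content was fetched and reports a change once the period has elapsed.
final class ExpiryValidator {
    private final long fetchedAtMillis;
    private final long validityMillis;

    ExpiryValidator(long fetchedAtMillis, long validityMillis) {
        this.fetchedAtMillis = fetchedAtMillis;
        this.validityMillis = validityMillis;
    }

    // Convenience factory for a period given in hours.
    static ExpiryValidator validForHours(long fetchedAtMillis, long hours) {
        return new ExpiryValidator(fetchedAtMillis, TimeUnit.HOURS.toMillis(hours));
    }

    // True once the validity period has fully elapsed since the fetch,
    // regardless of whether the remote content actually changed.
    boolean hasChanged(long nowMillis) {
        return nowMillis - fetchedAtMillis >= validityMillis;
    }
}
```

In real use the caller would pass System.currentTimeMillis() both when creating the validator and when asking hasChanged().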

The same configuration is usable for transformers, e.g. the
SQL Transformer. Through its configuration, the SQL Transformer
would know when to refetch data from the database.
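
Putting points 7 and 8 together, the cache lookup could be sketched as follows. All names here (CheckedComponent, TimestampedSource, ResponseCache) are hypothetical, and using plain Object as the validator type is an assumption; the proposal only fixes the method names getValidator() and hasChanged(Validator):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical slice of the proposed Cacheable interface.
interface CheckedComponent {
    Object getValidator();                 // snapshot taken at caching time
    boolean hasChanged(Object validator);  // compare snapshot to current state
}

// Example component in the spirit of the FileGenerator: its validator is
// the last modification time.
final class TimestampedSource implements CheckedComponent {
    long lastModified;

    TimestampedSource(long lastModified) {
        this.lastModified = lastModified;
    }

    public Object getValidator() {
        return Long.valueOf(lastModified);
    }

    public boolean hasChanged(Object validator) {
        return !Long.valueOf(lastModified).equals(validator);
    }
}

final class ResponseCache {
    private static final class Entry {
        final String content;
        final List<Object> validators;

        Entry(String content, List<Object> validators) {
            this.content = content;
            this.validators = validators;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Stores the content together with one validator per component.
    void put(String key, String content, List<? extends CheckedComponent> components) {
        List<Object> validators = new ArrayList<>();
        for (CheckedComponent component : components) {
            validators.add(component.getValidator());
        }
        entries.put(key, new Entry(content, validators));
    }

    // Returns the cached content, or null if there is no entry or any
    // component reports a change (the stale entry is then dropped).
    String lookup(String key, List<? extends CheckedComponent> components) {
        Entry entry = entries.get(key);
        if (entry == null || entry.validators.size() != components.size()) {
            return null;
        }
        for (int i = 0; i < components.size(); i++) {
            if (components.get(i).hasChanged(entry.validators.get(i))) {
                entries.remove(key);
                return null;
            }
        }
        return entry.content;
    }
}
```

Every component is asked here, not only the generator, so a transformer with an external data source such as the SQL Transformer can invalidate the entry on its own.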

