cocoon-dev mailing list archives

From "Tagunov Anthony" <>
Subject Re: AW: [C2]: Proposal for caching: Smarter Monitor Placement
Date Fri, 26 Jan 2001 19:14:12 GMT
Hello, gentlemen!

To contribute to this discussion of caching issues,
please let me point out something that I consider to have been
a bottleneck in C1:

The monitors were embedded into processors and it caused the
following difficulties:

1) When a page was actually removed from the cache, its dependencies
    still remained in the processor monitors, so
1.1) suppose the request "../a.xml?b=18" was found to depend on "b.xsl";
       then a.xml was rewritten in such a way that this dependency was removed.
      Still, "b.xsl" remained in the XSLTProcessor's monitor table, denoting
      that "../a.xml?b=18" depended on "b.xsl", and when someone changed
      "b.xsl" the cache for "../a.xml?b=18" (provided that "../a.xml?b=18"
      was still processed by the XSLTProcessor) was considered invalid.

1.2) suppose "../a.xml?b=.." gets called with a VERY WIDE VARIETY
     of parameter values (I considered using it for a web resource directory,
     which currently has thousands (and many thousands) of sections,
     and the appearance of a page may be user dependent). Then
     the monitors may potentially come to contain a PLETHORA of
     useless dependencies for pages that have already been
     cleaned from the cache, and thus eat up memory.

So what I think appropriate is to

     keep the information needed by a
     processor (translator, transformer in terms of C2?) to determine
     whether its output has changed TOGETHER with the cached output.
     Then, when the page (the cached output) gets cleaned,
     the dependency information would vanish together with it.
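
A minimal sketch of this idea in Java, under stated assumptions: CachedEntry
and PageCache are hypothetical names (not Cocoon classes), and the
least-recently-used eviction is only one possible cleaning policy. The point
it shows is that the dependency information lives inside the cached entry,
so evicting the page discards its dependencies at the same moment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; this is not Cocoon's API.
class CachedEntry {
    final byte[] output;        // the cached page
    final Object dependencies;  // whatever the processor needs to revalidate it

    CachedEntry(byte[] output, Object dependencies) {
        this.output = output;
        this.dependencies = dependencies;
    }
}

class PageCache {
    private final Map<String, CachedEntry> entries;

    PageCache(final int maxEntries) {
        // Access-ordered map that drops the least recently used entry
        // once the size limit is exceeded (an assumed cleaning policy).
        this.entries = new LinkedHashMap<String, CachedEntry>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedEntry> eldest) {
                return size() > maxEntries;
            }
        };
    }

    void put(String key, byte[] output, Object dependencies) {
        entries.put(key, new CachedEntry(output, dependencies));
    }

    CachedEntry get(String key) {
        return entries.get(key);
    }

    int size() {
        return entries.size();
    }
}
```

Because there is no separate monitor table, nothing is left behind for a
page that is no longer in the cache.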

And, what is no less important, if one and the same processor
(translator, transformer?) gets run over the same request
multiple (2, 3, ...) times, then the dependency information
for each stage of the transform surely should be kept
separately.  I have no doubt that this is what is being
implemented now, but it looks so important that I dared
to stress it explicitly: the caching information (the
dependencies) should be stored PER PASS PER PROCESSOR.
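
One way to picture "per pass per processor", as a sketch only: the
dependencies of a pipeline are kept as an ordered list with one record per
pass, so a processor that runs twice contributes two independent records.
PassRecord and PipelineDependencies are hypothetical names, and the
dependency itself is left as a plain Object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical types, for illustration only.
class PassRecord {
    final String processorName;  // which processor ran in this pass
    final Object dependency;     // what this particular pass depended on

    PassRecord(String processorName, Object dependency) {
        this.processorName = processorName;
        this.dependency = dependency;
    }
}

class PipelineDependencies {
    private final List<PassRecord> passes = new ArrayList<PassRecord>();

    // Called once per pass, in pipeline order, even when a processor repeats.
    void record(String processorName, Object dependency) {
        passes.add(new PassRecord(processorName, dependency));
    }

    // Read-only view of the recorded passes, in the order they ran.
    List<PassRecord> passes() {
        return Collections.unmodifiableList(passes);
    }
}
```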

It looks like this Dependency Information
(what a processor needs to know to calculate
whether a given piece of cached information is up-to-date or stale)
is a very intimate thing
of a processor (transformer, translator).

It depends heavily
on its type: for

FileGenerator this will be
   -- the filename and the timestamp with which the file was used by the FileGenerator
      for the given piece of cached info
   -- the URL from which the doc was retrieved

XSLTTranslator (sorry if misspelled)
  -- the filename/URL of the resource from which the stylesheet was obtained

XYZTranslator (that uses external data sources)
  -- some information about these external data sources
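
For the file case, the dependency information could look roughly like the
sketch below. FileDependency is a hypothetical class, not part of Cocoon; it
only uses java.io.File.lastModified(), which returns 0 when the file does not
exist. It remembers the file and the timestamp seen when the output was
produced, and later compares that timestamp with the current one.

```java
import java.io.File;

// Hypothetical class: what a file-based generator saw when it produced output.
class FileDependency {
    final File file;
    final long lastModifiedWhenUsed;

    FileDependency(File file, long lastModifiedWhenUsed) {
        this.file = file;
        this.lastModifiedWhenUsed = lastModifiedWhenUsed;
    }

    // Records the file's current timestamp at the moment the output is generated.
    static FileDependency snapshot(File file) {
        return new FileDependency(file, file.lastModified());
    }

    // True if the file's timestamp differs from the one recorded earlier.
    boolean hasChanged() {
        return file.lastModified() != lastModifiedWhenUsed;
    }
}
```

A stylesheet-based or data-source-based processor would keep a different
kind of object, which is why the caching framework should not need to know
the concrete type.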

Proposal #2)
    Why not lend a processor a placeholder to put its OWN dependency information,
    like HttpServletRequest.setAttribute/getAttribute? What I mean is to make the contract of
    a transformer something like the following (after Stuart's mail):
	give me your output (IN: input, OUT: result, IN: oldDependencyObject (may be null), OUT:
	give me a unique cache key (IN: input)
	have you changed (IN: input, IN: dependencyObject)

---------------(when I write IN: input I mean either the stream of SAX calls, or a URI with
                     ?a=18-style parameters, or both, or even a URI with parameters of some
                     other style, or maybe a DOM tree. Why not? It can also be converted to
                     a string and included in the caching key just like the ?a=18 parameters.)
   I mean:
     when a PROCESSOR makes a PASS to produce some potentially cacheable information,
     it stores all the dependency information that it will need in the future to determine whether
     this information (provided that it got cached) is still up-to-date into a SPECIAL DEPENDENCY
     OBJECT. This object is either
     -- discarded right away (if the information is not cached, due to any reasons
                                          including but not limited to errors)
     -- stored together with the cached info (if the info gets cached, this may be caching

                 -- the output of this processor directly
                 -- the output of some subsequent processors
               (see the "isDynamic();" initiative by Sergio for thoughts on how to decide whether
                the output is cached directly or after some additional steps)
     -- it is the processor that knows what type this dependency object is (for the caching
        mechanism it's just an Object, or a GeneralDependecyObject from which the processors
        derive their individual ones)

        2.1.a.1)  the dependency objects are individual for processors,
        2.1.a.2)  the caching framework knows nothing about them,
        2.2)        the caching framework manages storing them, e.g.
                       keeps one object per pass per processor while generating the cached
                       makes them vanish exactly when the data cached vanishes

     An alternative approach
        2.1.b ) the Dependency objects are actually Validators and implement some Validator
                   interface (or maybe let's call it a Dependency interface?).
                    For any cached piece of data we keep an ordered list of Validator (Dependency)
                    objects and call on them, not on the processors, to find out
                    whether the cache is stale.
        2.2)      --//--//--
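
The Validator alternative (2.1.b) might be sketched as follows. The
interface name comes from the text above, but the method isStillValid() and
the class ValidatedEntry are assumptions made for illustration. The cached
data carries an ordered list of validators, and the cache asks them, not the
processors, whether the data is stale.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; only its name is taken from the message.
interface Validator {
    boolean isStillValid();
}

class ValidatedEntry {
    final String output;  // the cached data
    private final List<Validator> validators = new ArrayList<Validator>();

    ValidatedEntry(String output) {
        this.output = output;
    }

    // Validators are added in pipeline order, one per pass.
    void addValidator(Validator validator) {
        validators.add(validator);
    }

    // The entry is stale as soon as any one validator reports a change.
    boolean isStale() {
        for (Validator validator : validators) {
            if (!validator.isStillValid()) {
                return true;
            }
        }
        return false;
    }
}
```

Since the validators are stored inside the entry, they vanish together with
the cached data, which keeps point 2.2 intact.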

Best regards,
Tagunov Anthony

P.S. I thought that maybe instead of proposing v1 (see above in my letter) I should have proposed
v2; here it is:
	give me your output (IN: input, OUT: result, OUT: dependencyObject)
	give me a unique cache key (IN: input)
	have you changed (IN: dependencyObject)

(oldDependencyObject was in v1 to let the processor possibly reuse it)
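
As a rough illustration, the three v2 operations could be written as a Java
interface like the one below. All names here (CacheableProcessor,
PassResult, UppercaseProcessor) are hypothetical, the input is simplified to
a String, and the toy processor uses a version counter as a stand-in for a
stylesheet timestamp.

```java
// Hypothetical sketch of the v2 contract; none of these names are Cocoon's.
class PassResult {
    final String output;      // OUT: result
    final Object dependency;  // OUT: dependencyObject, opaque to the cache

    PassResult(String output, Object dependency) {
        this.output = output;
        this.dependency = dependency;
    }
}

interface CacheableProcessor {
    PassResult process(String input);       // give me your output
    String cacheKey(String input);          // give me a unique cache key
    boolean hasChanged(Object dependency);  // have you changed
}

// Toy processor: its output depends on a version counter.
class UppercaseProcessor implements CacheableProcessor {
    long stylesheetVersion = 1;

    public PassResult process(String input) {
        // Remember the version that was in effect for this pass.
        return new PassResult(input.toUpperCase(), Long.valueOf(stylesheetVersion));
    }

    public String cacheKey(String input) {
        return "upper:" + input;
    }

    public boolean hasChanged(Object dependency) {
        // Only the processor knows that its dependency object is a Long.
        return ((Long) dependency).longValue() != stylesheetVersion;
    }
}
```

The caching framework would store the returned dependency object next to
the cached output and hand it back to hasChanged() before reusing that
output.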
