jackrabbit-dev mailing list archives

From Daniel Dekany <ddek...@freemail.hu>
Subject Re: refresh method or version stamp
Date Sun, 17 Apr 2005 23:14:20 GMT
Sunday, April 17, 2005, 11:09:24 PM, David Nuescheler wrote:

>> >> >> - jcr:modificationCounter which is automatically created and
>> > just to understand your architectural approach:
>> > if i understand you correctly you would then in your
>> > "cache-system" check that property "jcr:modificationCounter"
>> > before serving anything from your cache. is that correct?
>> Yes.
>> > and that means that the jcr client would on every access to
>> > your cache ask the "repository server" about the value of
>> > that "jcr:modificationCounter" over network? is that correct?
>> Exactly. And, the jcr:uuid, otherwise it worth nothing.
> ??? i don't understand. ???

You don't understand what? Why the uuid is needed? Because I want to
know whether the node that is accessible at the "foo/bar/baaz" path has
changed since I cached the "custom object". Is that clear so far?
To know that, I have to compare the jcr:modificationCounter with the
value stored in the cache, and also the jcr:uuid with the value stored
in the cache (the cache entry encapsulates a modificationCounter, a
uuid and the "custom object" itself). Comparing the uuids is
required because it can easily happen that you replace
"foo/bar/baaz" with another node whose jcr:modificationCounter happens
to be the same.
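
To make the comparison concrete, here is a minimal sketch of such a
cache in plain Java. Everything in it is an assumption made for
illustration: NodeStamp, StampSource and StampedCache are hypothetical
names, not JCR or Jackrabbit API, and a real client would read the two
property values from the repository inside currentStamp().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical pair of values polled from the repository for one node. */
final class NodeStamp {
    final String uuid;
    final long modificationCounter;

    NodeStamp(String uuid, long modificationCounter) {
        this.uuid = uuid;
        this.modificationCounter = modificationCounter;
    }

    /** Both values must match; an equal counter on a different node is not enough. */
    boolean matches(NodeStamp other) {
        return uuid.equals(other.uuid)
                && modificationCounter == other.modificationCounter;
    }
}

/** Hypothetical repository facade: returns the current uuid and counter for a path. */
interface StampSource {
    NodeStamp currentStamp(String path);
}

/** Cache whose entries hold a stamp plus the expensive "custom object". */
final class StampedCache<T> {
    private static final class Entry<T> {
        final NodeStamp stamp;
        final T value;

        Entry(NodeStamp stamp, T value) {
            this.stamp = stamp;
            this.value = value;
        }
    }

    private final Map<String, Entry<T>> entries = new HashMap<>();
    private final StampSource source;
    private final Function<String, T> loader;

    StampedCache(StampSource source, Function<String, T> loader) {
        this.source = source;
        this.loader = loader;
    }

    /** Polls the stamp on every read and rebuilds the object only on a mismatch. */
    T get(String path) {
        NodeStamp current = source.currentStamp(path);
        Entry<T> cached = entries.get(path);
        if (cached != null && cached.stamp.matches(current)) {
            return cached.value;
        }
        // Assumption: stamp and content are read from the same consistent view.
        T fresh = loader.apply(path);
        entries.put(path, new Entry<>(current, fresh));
        return fresh;
    }
}
```

The sketch is single-threaded and ignores the question of reading the
stamp and the content atomically; it only shows why the entry has to
carry the uuid next to the counter.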

>> > on every read operation to your cache?
>> You see, it is still far better than getting jcr:content on every read
>> operation (which supposedly a long string or binary), then create a
>> mycms.Template or org.w3c.dom.Document or etc from it, for each occasion
>> when you read it.
> i think you are missing a whole lot. 
> what does jcr:content have to do with your mycms.template?
> were you going to extend your mycms.template from nt:file?
> and if yes, why?

It's completely irrelevant here how *exactly* I store that template.
If you don't want me to use jcr:content for storing the "body" of the
template, then I won't; I don't care right now. But OK, just for
the sake of example, I define a node type mycms:template that has
2 mandatory properties: language and content. Content stores the
template itself, and language stores the template language it uses.

> i would have seen that as a properly structured mycms:template
> nodetype.

Hey, then we agree.

> btw. you realize that a modification of a property or 
> property of a child node does not automatically 
> modify parent node.

Yes I do. I can guess why you ask this. My quick idea was that the node
should look like this:

 +- indexPageTemplate [mycms:template]
 |   |
...  +- jcr:uuid [String]
     +- jcr:modificationCounter [Long]
     +- language [String]
     +- content [String]

This node corresponds to a mycms.Template Java object. So, whenever the
"language" or the "content" property is changed, I want the
"jcr:modificationCounter" to be incremented. Not because the node itself
was changed, but because the mycms.Template Java object that corresponds
to the node has now become outdated, since it was created based on the
values of the property children of the node.
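
As a rough sketch of that behaviour, assuming a purely in-memory model:
TemplateNode below is a hypothetical class written for this example, not
a JCR node, and it only shows the counter being bumped whenever one of
the node's own properties actually changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical in-memory stand-in for a mycms:template node. */
final class TemplateNode {
    private final Map<String, String> properties = new LinkedHashMap<>();
    private long modificationCounter = 0;

    /** Sets a property and increments the counter if the value really changed. */
    void setProperty(String name, String value) {
        String old = properties.put(name, value);
        if (!Objects.equals(old, value)) {
            modificationCounter++;
        }
    }

    String getProperty(String name) {
        return properties.get(name);
    }

    long getModificationCounter() {
        return modificationCounter;
    }
}
```

A cache entry built from "language" and "content" can then be checked
against getModificationCounter() without reading either property again.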

Anyway, again, it was a quick idea, it was never elaborated in detail...
maybe it is not how the problem of that cache should be solved. I don't
know, I'm not the JCR expert here, I just have some ideas.

>> > is that really what you suggest?
>> Yes, but it was not really a proposal or something like that. Simply,
>> there is a task that can't be avoided (IMO), i.e. this sync-ed cache
>> must be implemented *somehow*, because of the use-cases explained during
>> this thread. If anybody has a better idea, tell it. 
> for the use case that you describe i think an asynchronous 
> observation listener is just fine. 
> i don't see why it is of utmost importance that the
> cache of your mycms.template is updated in a transactional 
> fashion with the back end. practically, i can't even come up 
> with a drawback of async cache invalidation for a template cache. 

I guess you mean something different by sync cache invalidation than I
do. I don't need the cache entry to be invalidated right before the
commit of the transaction finishes. I just don't want the cache to
*return* objects that are outdated (in other words, out-of-sync, hence
my easily misunderstood term "sync. cache"). But maybe it is better
if I show again, with concrete examples, what I meant by a "synchronous
cache" and what my expectations of it are:

A user modifies template X and template Y in the repository, and then
commits the transaction. So, looking at it from outside that
transaction, the modification of the two templates has happened in the
same atomic moment, right? Thus, after the transaction was committed, I
never want visitors to see X in its old state and Y in its new
state. Because either both are in their new state, or both are in their
old state. Right? Now, if I have a cache between JCR and the CMS that is
updated in some unpredictable way, then it can happen that the
cache will serve the object made for the old X while it serves the
object made for the new Y. And this is a phenomenon that I really want
to avoid.

Another example... I have updated template X (committed the
transaction). Then I go and visit the page that uses that template. I
don't want to see the old template just because the cache updates itself
with 10 seconds of delay or the like. This is another phenomenon that I
want to avoid.
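
The first phenomenon (old X served together with new Y) can be
reproduced with a small deterministic simulation. This is only a sketch
under my own assumptions: AsyncInvalidationDemo is a hypothetical class,
the "repository" is a plain map, and the queued events stand in for
observation events that reach the cache one by one, some time after the
commit.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical simulation of a cache invalidated by delayed events. */
final class AsyncInvalidationDemo {
    final Map<String, String> repository = new HashMap<>();
    final Map<String, String> cache = new HashMap<>();
    final Queue<String> pendingEvents = new ArrayDeque<>();

    /** Applies all changes at once and queues one invalidation event per path. */
    void commit(Map<String, String> changes) {
        repository.putAll(changes);
        pendingEvents.addAll(changes.keySet());
    }

    /** Delivers a single queued event, as a listener running after the commit would. */
    void deliverOneEvent() {
        String path = pendingEvents.poll();
        if (path != null) {
            cache.remove(path);
        }
    }

    /** Serves from the cache and loads from the repository only on a miss. */
    String read(String path) {
        return cache.computeIfAbsent(path, repository::get);
    }

    public static void main(String[] args) {
        AsyncInvalidationDemo demo = new AsyncInvalidationDemo();
        Map<String, String> first = new LinkedHashMap<>();
        first.put("X", "X-old");
        first.put("Y", "Y-old");
        demo.commit(first);
        demo.deliverOneEvent();
        demo.deliverOneEvent();
        demo.read("X"); // warms the cache with the old objects
        demo.read("Y");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("Y", "Y-new");
        second.put("X", "X-new");
        demo.commit(second);    // both change in one atomic commit
        demo.deliverOneEvent(); // only the event for Y has arrived so far

        // A visitor reading now gets a mix of old and new state.
        System.out.println(demo.read("X") + " with " + demo.read("Y"));
    }
}
```

Between the two event deliveries a reader gets the old X together with
the new Y, although the repository never contained that combination.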

> as a matter of fact we implement in our own commercial 
> cms (which powers some of the worlds most high-profile 
> websites) the template cache invalidation though asynch 
> observation. works perfectly fine.

Is it free from the two phenomena I have described above?

If yes, I'm all ears to hear how you implemented it, since basically
that's what I have been asking here from the beginning.

If not, and you state that it is OK, then I find it pathetic that while
it is widely accepted that transactions and such "ACID things" are
important in trillions of use cases, they are suddenly somehow not
important at all if we are talking about content like templates,
scripts, XML documents. Yes, I admit it is seldom a hard problem with
HTML templates, but I'm not just talking about HTML templates. Consider
what happens if you modify both the script that prepares the context for
the HTML template and the HTML template itself. The new template may not
work with the old script, and the old template may not work with the new
script. And both templates and scripts are things that you definitely
want to cache, and not re-parse from the strings stored in the
repository on each page visit.

> i would agree, that in other cases, it may be desirable to have 
> synchronous triggering of events to solve the caching issues in a 
> transactional fashion, which is exactly what we use in jackrabbit 
> wherever we require it.

Just to clarify: I don't necessarily want synchronous observation
events (and when I say "sync. cache" it is not to be mixed up with
synchronous event notifications). I don't necessarily want
jcr:modificationCounter either. What I want is just to implement that
"synchronous cache", and it seems it simply can't be done on top of JCR.

> the network overhead that you encounter with your "polling"-solution
> is ridiculous compared to (synch or asynch) observation considering 
> that you stated earlier that the cached information changes very 
> rarely and is read very frequently. 

All the solutions I have mentioned that let you implement that cache
clearly decrease the network load compared to the situation where you
can't use that cache (because you can't implement it). And I didn't say
that some kind of synchronous observation wouldn't possibly solve the
problem better; I'm just saying that the observation feature as it
exists now can't be used for that, or at least it is rather
problematic (Jackrabbit only and such).

>> In fact I have asked
>> for a solution originally, with the optimistic assumption 
>> that the Expert Group who has developed JCR has considered 
>> this obvious use-case.
> we used those exact use-cases to try to argue to 
> keep synchronous observation it in the spec.
> maybe i was not clear before, the synchronous observation
> has been removed from the spec (it was completely speced 
> out in the beginning) because of variety of unresolved
> technical questions for existing non-java repository 
> vendors.

OK for me. But what did you (the expert group) say about the whole
higher level use-case, that is: the repository client wants to cache
those objects that are too expensive to be re-created every time a
repository "query" is executed. (Do you understand the use-case I'm
referring to?)

> nevertheless every repository implementation is free 
> to support synchronous observation, and you seem to 
> be in luck since jackrabbit does ;)
> regards,
> david

Best regards,
 Daniel Dekany
