jackrabbit-dev mailing list archives

From Daniel Dekany <ddek...@freemail.hu>
Subject Re: refresh method or version stamp
Date Mon, 18 Apr 2005 03:14:32 GMT
Monday, April 18, 2005, 1:59:05 AM, David Nuescheler wrote:

>> User modifies template X and template Y in the repository, and then he
>> commits the transaction. So, looking at it from outside that
>> transaction, the modification of the two templates has happened in the
>> same atomic moment, right? Thus, after the transaction was committed, I
>> never want visitors to see X in its old state and Y in its new
>> state, because either both are in their new state or both are in
>> their old state. Right? Now, if I have a cache between JCR and the CMS
>> that is updated in some unpredictable way, then it can happen that the
>> cache serves the object made for the old X while it serves the
>> object made for the new Y. And this is a phenomenon that I really want
>> to avoid.
>
> sure, no problem. since all the observation events 
> of a transaction are batched together and are transported 
> in a bundle of observation events, they arrive precisely at the
> same time at the observation listener.

So, does it mean that an EventListener will get exactly 1 or 0
onEvent(EventIterator) calls for a transaction commit? Is this stated
in the specification? If so, that's a good thing. But if you read
something bypassing the cache, and some other thing through the
cache (or through another cache that uses another listener), then you
have the same phenomenon again... (This is central shared storage that
can be accessed from many parts of the application in many ways, not
just through that cache. That cache is just a performance improvement;
in principle it's not even there.)
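
To make the "one batch per commit" idea concrete, here is a minimal sketch of a cache that applies all changes of one commit in a single step. It does not use the JCR API: SnapshotCache, applyCommit and snapshot are hypothetical names, and the sketch assumes the listener really does receive every change of a transaction in one call. It only protects readers that go through this one cache; a read that bypasses it can still see a mixed state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical cache that publishes all changes of one commit at once. */
final class SnapshotCache {
    // The currently published, immutable view of the cached objects.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Map.of());

    /** A reader takes one snapshot and does all its lookups against it. */
    Map<String, String> snapshot() {
        return current.get();
    }

    /** Assumed to be called once per commit with every changed path. */
    void applyCommit(Map<String, String> changes) {
        current.updateAndGet(old -> {
            Map<String, String> next = new HashMap<>(old);
            next.putAll(changes);
            // Publish an immutable copy so readers never see a half-applied batch.
            return Map.copyOf(next);
        });
    }
}
```

A reader that calls snapshot() once and then looks up both X and Y in the returned map gets either both old or both new objects, whatever happens in between.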

> ... but now you started to make me really curious, how do 
> you get your in-memory cache to be transactional in the 
> first place? are all the getter-method on your mycms.template 
> (java-)synchronized on your overall template cache class?

I'm not sure I understand what you are asking... Are you talking about
the problem that I would have to cache multiple versions of the same
template (warning: not versions a la JCR versioning, just the old
template and the new one, you see) if there are multiple transactions
living at the same time that should see different versions of the same
template? My idea was that cache entries are not simply associated with
the path of the template, but with the path + a version number, so you
can cache multiple versions of the same template. But, perhaps needless
to say, this is a goal and not something that I can put on the desk
right now... this whole cache thing is a problem I'm just trying to
solve.
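
A minimal sketch of the "path + version number" idea, in plain Java. Everything here is hypothetical (VersionedTemplateCache, Key, the loader callback), and it assumes the repository can tell a transaction which version number of a path it sees, which is exactly the thing being asked for in this thread.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical cache keyed by (path, version) instead of by path alone. */
final class VersionedTemplateCache<T> {
    /** Cache key: repository path plus the version the reader sees. */
    static final class Key {
        final String path;
        final long version;
        Key(String path, long version) { this.path = path; this.version = version; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).path.equals(path)
                    && ((Key) o).version == version;
        }
        @Override public int hashCode() { return Objects.hash(path, version); }
    }

    private final Map<Key, T> entries = new ConcurrentHashMap<>();
    // Builds the expensive object for a given path and version (assumed callback).
    private final BiFunction<String, Long, T> loader;

    VersionedTemplateCache(BiFunction<String, Long, T> loader) {
        this.loader = loader;
    }

    /** Returns the cached object for this exact version, building it on a miss. */
    T get(String path, long version) {
        return entries.computeIfAbsent(new Key(path, version),
                k -> loader.apply(k.path, k.version));
    }

    /** Drops versions that no open transaction can still see. */
    void evictOlderThan(long oldestLiveVersion) {
        entries.keySet().removeIf(k -> k.version < oldestLiveVersion);
    }

    int size() {
        return entries.size();
    }
}
```

Two transactions that see different versions of /templates/x get different entries, so neither can be served the other's object. The cost is that old versions stay in memory until something like evictOlderThan removes them.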

The point is that everything is nice and transactional as long as you
are using the raw JDBC API or JCR API. And the moment you add a layer
that maps those raw values to objects (and often you must, right?),
suddenly you lose transactionality. I just can't swallow it (yet? :)).
Transactions work as long as you map to simple objects, because I don't
need to cache them. Then I add the cache for the more heavyweight
objects (purely for better performance, nothing new added to the basic
logic of things), and whoops, transactions are ruined. It's absurd. It
can't be a necessity; I think there is nothing fundamental that forces
it to be like this. Either the storage architectures have design
problems (like, they are simply archaic, designed for other purposes),
or I'm too stupid to find out how to solve this thing on top of them.
(And actually, look at ZODB: it seems it has an efficient cache that
stores heavyweight "custom objects", and still it remains
transactional. So... maybe the Zope guys are cheating somewhere?)

> anyhow let's say while responding to the http-request you 
> access the template X from your template cache twice with 
> commit in between, how should the application behave? 
> once old and once new, within the same http-response??

The idea is that you have a repository that stores the templates and
other objects, and whenever you access the repository you must be in
a transaction. So for example, if you want to work on a snapshot of the
repository within an http-response, then you start a transaction and you
don't close it until the response is done. (Of course it only works if
the transaction has a sufficient isolation level.) Now, whether it can
be solved in practice currently, I don't know, but this is what I
ultimately try to achieve. And BTW, as far as I know Zope (so what I
will say is only 90% sure), it does this; each HTTP request
processing is automatically enclosed in an individual transaction.
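
The transaction-per-request pattern can be sketched like this. RepositoryTransaction and TransactionPerRequest are hypothetical names, not JCR, JTA or Zope types, and the sketch says nothing about what isolation level the underlying repository actually provides.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical transaction handle; stands in for whatever the repository offers. */
interface RepositoryTransaction {
    void commit();
    void rollback();
}

/** Runs each request handler inside exactly one transaction. */
final class TransactionPerRequest {
    // Opens a new transaction; assumed to be supplied by the repository layer.
    private final Supplier<RepositoryTransaction> begin;

    TransactionPerRequest(Supplier<RepositoryTransaction> begin) {
        this.begin = begin;
    }

    /** Begins a transaction, runs the handler, commits on success, rolls back on failure. */
    <R> R handle(Function<RepositoryTransaction, R> handler) {
        RepositoryTransaction tx = begin.get();
        try {
            R response = handler.apply(tx);
            tx.commit();
            return response;
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        }
    }
}
```

Every repository read made by the handler goes through the same tx, so with a sufficient isolation level the whole response is rendered from one snapshot.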

> or let's say you have a web-page with two pictures, 
> both rendered by template Y...

That's hitting below the belt, because of the way HTML works (3
individual HTTP requests are needed, and you certainly don't want a
transaction to span multiple HTTP requests)...

> you see where i am going... i think this whole discussion is
> absolutely pointless.

See above what the point is. OK, you can say that, at least in the case
of that CMS, users will hardly run into a problem because of these
"transients" in practice. You may say that the chance of an important
error happening because of them is lower than the chance that the server
will catch fire, so who cares... anyway you have more important things
to fix, so better to invest your energies there. In that sense it
surely can be called pointless.

>> Another example... I have updated template X (committed the
>> transaction). Then I go and visit the page that uses that template. And
>> I don't want to see the old template because the cache updates itself
>> with a 10-second delay or the like. This is another phenomenon that I
>> want to avoid.
> 10 seconds? our cache is observation based not time based.
> in real-life we are talking milliseconds.

So, you still have that delay phenomenon. Something that doesn't exist
on the JCR API level. There, if I commit a transaction, and then later
start another transaction, I will surely see the new template, even if I
read the repository 0.1 millisecond after the transaction was committed.
Sure, who cares if it's about templates, but still...

>> Is it free from the two phenomena I have described above?
> sure, easily.
>
>> If yes, I'm all ears to hear how you implemented it, since basically
>> that's what I'm asking here from the beginning.
> see above.

I say this solution isn't correct in general (because it doesn't
completely get rid of either of the two phenomena I have mentioned), but
I don't dispute that it's good enough for your concrete use case. Most
certainly it's good enough for mine as well. But I'm a terrible
creature, and telling people that "There is no 100% correct solution,
but for your templates and XDocBook articles and MVC controller scripts
it's more than good enough... better not run a nuclear reactor with this
object mapping layer though.", while you feel so sure there is No Real
Fundamental Reason why it can't be solved 100% correctly, is something
that I can hardly get past. Yes, I know, I'm such a cube head... :)

>> >> In fact I have asked
>> >> for a solution originally, with the optimistic assumption
>> >> that the Expert Group who has developed JCR has considered
>> >> this obvious use-case.
>> > we used those exact use-cases to try to argue to
>> > keep synchronous observation in the spec.
>> > maybe i was not clear before, the synchronous observation
>> > has been removed from the spec (it was completely speced
>> > out in the beginning) because of a variety of unresolved
>> > technical questions for existing non-java repository
>> > vendors.
>> OK for me. But what did you (the expert group) say about the whole
>> higher level use-case, that is: the repository client wants to cache
>> those objects that are too expensive to be re-created every time a
>> repository "query" is executed. (Do you understand the use-case I'm
>> referring to?)
>
> i don't know what you mean by "query". the term query has a very
> specific meaning in jsr-170, and you seem to be using it for something
> different.

Sorry, I have used an RDBMS term. So I meant, when you get a node and
its properties. getNode(path) and some getProperty(name) on the JCR
level (but of course finally I should get a mycms.Template, for
example).

> if you talk about cache invalidation, personally i think in almost any
> case the async observation provided in jcr is absolutely sufficient.
> keep in mind, we talk milliseconds and all the events of a transaction
> are batched up.

Well, I just wanted to access a content repository as a tree of
application/framework specific objects, while transactions and such
still work 100% correctly, like they did until I wanted to store
big, complex objects in the repository (and there are such things in a
content repository). Are my ideas too futuristic? And in the end I
should swallow an incorrect solution because it is (supposedly) good
enough... But after all, I can be happy if the JVM doesn't dump core
twice a month and things like that... :) So, that's it. Point for now on
my side.

> regards,
> david


-- 
Best regards,
 Daniel Dekany

