jackrabbit-dev mailing list archives

From "Cris Daniluk" <cris.dani...@gmail.com>
Subject Re: Questions about TX in Jackrabbit, JTA and Spec compliance
Date Thu, 02 Aug 2007 17:42:53 GMT
> Cris, thanks a lot for your comments: they helped me understand what
> Marcel's concerns are about the way Jackrabbit implements XA.
>

I think this is a good discussion to have in general. Marcel and I
both have some concerns, definitely - but I'm not yet sure they're
valid, as my points below show. Basically it just comes down to the
level of sophistication in Jackrabbit's journaling.

> > Marcel's point here is that the JTA implementation doesn't allow the
> > RDBMS transaction to participate in the XA. I can see a good argument
> > for this - after all, Jackrabbit maintains an effective journal and
> > not all RDBMS can participate in XA.
> >
> > That said, at the truest definition of a transaction, does just
> > writing to the changelog truly constitute a guaranteed transaction?
> > What if the RDBMS cannot be written to due to an integrity violation?
> > I don't think the cohesion between the RDBMS and the Jackrabbit
> > implementation is so tight that it is fair to argue any inconsistency
> > would be similar to datafile corruption.
>
> Where would that integrity violation come from? If you think of some
> clustered environment, and some other node in a clustered environment
> has made some modifications that changed the same items this node is
> trying to update, it will get informed about the staleness of its copy
> and throw. IMHO, looking at the very basic data model Jackrabbit uses
> and if we rule out other programs that tamper around with the data, I
> don't think this should happen.
>

In Oracle, a committed transaction means that it is in the redo log,
but not necessarily written to the tablespace. However, if you combine
the tablespace and the log, you are guaranteed to get a consistent
point-in-time view of that transaction. Oracle could, and often does,
have trouble writing from the log out to the tablespace (corruption,
insufficient space, whatever), but there is no loss of data. You can
further back up to that transaction and, regardless of the location
(tablespace or log), you are covered. I realize that this is a pretty
crappy, simplified description of journaling, but it might help frame
our discussion.
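
A rough Java sketch of the journaling model described above, purely for
illustration. The class and method names (JournaledStore, commit, read,
flush, recover) are hypothetical and are not Jackrabbit or Oracle APIs;
in-memory collections stand in for what would be durable storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a journaled store; not Jackrabbit or Oracle code.
class JournaledStore {
    // One committed change: key -> new value.
    static final class Entry {
        final String key;
        final String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    // Stands in for the durable redo log / changelog.
    private final List<Entry> log = new ArrayList<>();
    // Stands in for the tablespace / RDBMS tables.
    private final Map<String, String> tablespace = new HashMap<>();
    // Number of log entries already applied to the tablespace.
    private int applied = 0;

    // A commit is complete as soon as the entry is in the log.
    void commit(String key, String value) {
        log.add(new Entry(key, value));
    }

    // Logical view = tablespace overlaid with the not-yet-applied log entries.
    String read(String key) {
        String result = tablespace.get(key);
        for (int i = applied; i < log.size(); i++) {
            if (log.get(i).key.equals(key)) {
                result = log.get(i).value;
            }
        }
        return result;
    }

    // Flush at most maxEntries from the log to the tablespace. Stopping
    // partway (say, out of space) loses nothing: the rest stays in the log.
    void flush(int maxEntries) {
        int end = Math.min(log.size(), applied + maxEntries);
        for (; applied < end; applied++) {
            Entry e = log.get(applied);
            tablespace.put(e.key, e.value);
        }
    }

    // Recovery after a crash: replay every log entry not yet applied.
    void recover() {
        flush(log.size());
    }

    // What the tablespace alone holds, ignoring the log.
    String rawTablespaceValue(String key) {
        return tablespace.get(key);
    }
}
```

In this sketch a read issued after commit("a", "2") but before any flush
still returns "2", and recover() brings the tablespace up to date from
the log. A "mere queue" would differ in that read() would consult only
the tablespace.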

My concern with Jackrabbit is whether the changelog is a true journal
or a mere queue for the database. For example, once a transaction is
committed and written to the changelog, but before it is written to
the RDBMS, is it part of the "logical view"? In other words, if I
query JR before the flush to the DB, will I see my newly committed
data? If I crash before the RDBMS write happens and start up, am I
safe?

>
> I'm not sure if I get you there: do you suggest that Jackrabbit, when
> used with XA, uses some DB connection that is itself part of the same
> XA transaction and managed by the transaction manager?

This is what I'm suggesting for discussion, though I'm not necessarily
at a point where I'm suggesting a change be made :)

> I could then
> imagine that some change made inside Jackrabbit to the database will
> later be revoked because another part of the XA transaction has
> failed, without Jackrabbit noticing it, which would lead to
> inconsistencies.
>

If the XA transaction includes Jackrabbit AND the RDBMS AND any other
outside participants that may be relevant, it could not be rolled back
without Jackrabbit knowing. I'm not sure I understand where Jackrabbit
could be "left out of the loop" on a rollback.
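
The point above can be sketched as a toy two-phase commit in Java. This
is only an illustration under stated assumptions: Participant,
RecordingParticipant and Coordinator are hypothetical names, loosely
shaped after the prepare/commit/rollback calls of
javax.transaction.xa.XAResource, and they say nothing about how
Jackrabbit or any real transaction manager is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant in a two-phase commit (not a real JTA type).
interface Participant {
    boolean prepare();   // vote: true means "ready to commit"
    void commit();
    void rollback();
}

// Test double that records every call it receives from the coordinator.
class RecordingParticipant implements Participant {
    final String name;
    final boolean voteYes;
    final List<String> events = new ArrayList<>();

    RecordingParticipant(String name, boolean voteYes) {
        this.name = name;
        this.voteYes = voteYes;
    }

    public boolean prepare() { events.add("prepare"); return voteYes; }
    public void commit()     { events.add("commit"); }
    public void rollback()   { events.add("rollback"); }
}

// Toy coordinator: phase 1 collects votes, phase 2 delivers the outcome
// to every enlisted participant, so none can miss a rollback.
class Coordinator {
    // Returns true if the transaction committed, false if it rolled back.
    static boolean run(List<? extends Participant> participants) {
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allPrepared = false;
            }
        }
        for (Participant p : participants) {
            if (allPrepared) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allPrepared;
    }
}
```

If a third participant votes no, the participants standing in for
Jackrabbit and the RDBMS both receive rollback() from the coordinator,
which is the sense in which an enlisted Jackrabbit could not be bypassed.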

Thanks for your responses thus far!

- Cris
