jackrabbit-oak-dev mailing list archives

From Marcel Reutegger <mreut...@adobe.com>
Subject Consistency aka Isolation Level (was: OAK-638 Avoid branch/merge for small commits)
Date Tue, 05 Mar 2013 09:27:33 GMT

I think it's better to discuss on the mailing list instead of an already
closed issue...

Michael wrote:
> It should be possible to combine this with the "branch, rebase,
> fast forward merge" approach I described above : we just need to
> make the fast forward merge a bit more clever such that it could
> detect and merge changes in distinct areas of the tree instead of
> just giving up when there was concurrent change.

Right, but then why do we branch and rebase in the first place, when
we need the merge (and commit!) to do the same thing (the more clever
bit you mentioned) in a concurrent write scenario?

The only reason I see right now is the consistency we gain with the
conflict handling Michael proposed [0] for the MicroKernel API
and Jukka implemented in the SegmentMK with the NodeStore

Jukka's implementation of the conflict handling provides
Serializable Snapshot Isolation because it re-runs the hooks on
a rebased branch whenever a concurrent change is detected
before it merges. If we allow it to be more clever, we lose
the Serializable part of the isolation level. Though, I think that's
quite OK. See below.

With the MicroKernel the story is a bit different. E.g. the conflict
handling model allows the commit to rebase internally, but
without re-running the commit hooks. Validation and changes
performed by the commit hook are already part of the JSOP.
This means the MicroKernel will exhibit 'write skew' and not
provide Serializable Snapshot Isolation. This was already described
a while ago by Michael [1].
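
To make the difference concrete, here is a minimal sketch of write skew.
It is a toy model only: Store, Branch, merge() and the rerun_hooks flag
are hypothetical names for illustration, not the MicroKernel or NodeStore
API. Two branches start from the same revision, each passes a validator
on its own snapshot, and they write to distinct keys. Without re-running
the validator on the rebased state both merges succeed and the invariant
is broken. With the re-run, the second merge is rejected.

```python
class Conflict(Exception):
    """Raised when a branch cannot be merged."""


class Branch:
    """Private view of the store, taken at base_revision."""

    def __init__(self, base_revision, snapshot):
        self.base_revision = base_revision
        self.snapshot = snapshot
        self.writes = {}

    def write(self, key, value):
        self.writes[key] = value

    def state(self):
        merged = dict(self.snapshot)
        merged.update(self.writes)
        return merged


class Store:
    """Toy store with a linear history (hypothetical, not an Oak class)."""

    def __init__(self, initial):
        self.head = dict(initial)
        self.revision = 0
        self.log = []  # (revision, keys written by that revision)

    def branch(self):
        return Branch(self.revision, dict(self.head))

    def merge(self, branch, validator, rerun_hooks):
        # Keys written by commits that landed after the branch was created.
        concurrent = set()
        for rev, keys in self.log:
            if rev > branch.base_revision:
                concurrent |= keys
        # Snapshot isolation: only overlapping write sets conflict.
        if concurrent & set(branch.writes):
            raise Conflict("write-write conflict")
        rebased = dict(self.head)
        rebased.update(branch.writes)
        # Serializable variant: re-validate when something changed concurrently.
        if rerun_hooks and concurrent and not validator(rebased):
            raise Conflict("validator failed on rebased state")
        self.head = rebased
        self.revision += 1
        self.log.append((self.revision, set(branch.writes)))


def at_least_one_enabled(state):
    # Invariant the validator is meant to protect.
    return state["a"] or state["b"]


def run(rerun_hooks):
    store = Store({"a": True, "b": True})
    t1, t2 = store.branch(), store.branch()
    t1.write("a", False)
    t2.write("b", False)
    # Each branch is valid when checked against its own snapshot.
    assert at_least_one_enabled(t1.state())
    assert at_least_one_enabled(t2.state())
    store.merge(t1, at_least_one_enabled, rerun_hooks)
    try:
        store.merge(t2, at_least_one_enabled, rerun_hooks)
        second_merged = True
    except Conflict:
        second_merged = False
    return store.head, second_merged
```

run(False) models the plain snapshot isolation case, run(True) the case
where hooks are re-run after a rebase. The re-run only happens when a
concurrent change was detected, which is the extra work a distributed
setup would have to pay for on every contended commit.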

I think this is OK and we shouldn't require either the MicroKernel or
the NodeStore to provide stricter consistency guarantees
than Snapshot Isolation. I don't see how we can add the Serializable
part without severely impacting throughput in a distributed write

If we still need Serializable Snapshot Isolation for some of our
Validators, we can use other techniques [2] to ensure consistency.
Materializing the conflict is one option and sometimes just happens
automatically. E.g. with a unique index on jcr:uuid we can ensure
consistency (every referenceable node has a unique UUID) even
with an implementation that only provides snapshot isolation.
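
A rough sketch of how the unique index materializes the conflict, again
as a toy model: commit(), add_referenceable() and the /index/uuid/ path
layout are assumptions made up for this example and do not reflect the
actual index structure. The point is only that both commits write the
same index entry, so their write sets overlap and plain snapshot
isolation is enough to reject the second one.

```python
class Conflict(Exception):
    """Raised when two concurrent commits wrote the same key."""


def commit(head, log, base_revision, writes):
    """Apply writes under snapshot isolation: only overlapping writes conflict."""
    concurrent = set()
    for rev, keys in log:
        if rev > base_revision:
            concurrent |= keys
    overlap = concurrent & set(writes)
    if overlap:
        raise Conflict(sorted(overlap))
    head.update(writes)
    log.append((len(log) + 1, set(writes)))


def add_referenceable(path, uuid, use_index):
    """Build the write set for adding a referenceable node at path."""
    writes = {path: {"jcr:uuid": uuid}}
    if use_index:
        # Hypothetical index layout: one entry per UUID value.
        writes["/index/uuid/" + uuid] = path
    return writes


def run(use_index):
    head, log = {}, []
    base = 0  # both sessions start from the same revision
    commit(head, log, base, add_referenceable("/a", "u-1", use_index))
    try:
        commit(head, log, base, add_referenceable("/b", "u-1", use_index))
        return True   # duplicate UUID slipped through
    except Conflict:
        return False  # second commit rejected
```

Without the index entry the two commits touch distinct paths and both
go through, leaving two nodes with the same UUID. With it, the second
commit fails on the shared index key.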

What I propose is to make this explicit in the JavaDoc of the MicroKernel
and the NodeStore API. The NodeStore API in particular lacks quite a few
details. Maybe we can just reference the relevant parts of the MicroKernel.


[0] http://wiki.apache.org/jackrabbit/Conflict%20handling%20through%20rebasing%20branches
[1] http://wiki.apache.org/jackrabbit/Transactional%20model%20of%20the%20Microkernel%20based%20Jackrabbit%20prototype
[2] http://en.wikipedia.org/wiki/Snapshot_isolation#Serializable_Snapshot_Isolation
