jackrabbit-dev mailing list archives

From Michael Dürig <mdue...@apache.org>
Subject Re: [jr3] clustering
Date Thu, 01 Mar 2012 17:05:27 GMT

>>> how does MVCC fit into this? multiple revisions of the same
>>> JCR/MK node could be stored on a B-tree node. whenever
>>> an update happens the garbage collection could kick in and
>>> purge outdated revisions. providing a consistent journal across
>>> all servers is not clear to me right now.
>> I think MVCC is not a problem as such. To the contrary, since it is
>> append only it should even be less problematic. IMO garbage collection
>> is an entirely different story and we shouldn't worry too much about it
>> until we have a good working model for clustering itself.
>> Wrt. the journal: isn't that just the list of versions of the root node?
>> This should be for free then. But I think I'm missing something here...
> the model I have in mind doesn't have root node versions that
> correspond to MK revisions. Is this mandated somehow by the MK
> API design?
> in my model only the nodes that changed get new revisions.
> and reading from the tree at a given revision means it
> will pick the latest revision of each node that is less than or
> equal to the given revision.
> e.g. if you have a node /a/b/c which was changed three times,
> in revisions 2, 7, and 12, and a client reads at revision 9, the
> implementation will return revision 7.
> I don't see why the parent node needs to be updated
> when a child node is added, removed or updated.

Hmm, I see. I came up with a similar approach a long time ago, even
before the Microkernel. Anyway, I think the Microkernel API does not
mandate root node versions corresponding to revisions. In fact I think 
the approach you are proposing will scale better wrt. write contention 
on the root node since there is no need for writing a new root node on 
every write operation. However, getting a consistent journal across 
cluster nodes seems more difficult here as you said.
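
To make the lookup rule from the quoted text concrete, here is a minimal sketch. It is only an illustration under assumptions: VersionedNode, write and visibleRevision are hypothetical names, not part of the Microkernel API, and the node state is reduced to a plain string.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a node that keeps one entry per revision in which it changed. */
public class VersionedNode {
    // revision number -> node state written at that revision (illustrative only)
    private final TreeMap<Long, String> states = new TreeMap<>();

    /** Record the state of this node as written in the given revision. */
    public void write(long revision, String state) {
        states.put(revision, state);
    }

    /**
     * Return the revision of this node that a reader at readRevision sees:
     * the greatest stored revision less than or equal to readRevision,
     * or -1 if the node did not exist yet at that revision.
     */
    public long visibleRevision(long readRevision) {
        Long key = states.floorKey(readRevision);
        return key == null ? -1 : key;
    }

    /** Return the state visible at readRevision, or null if there is none. */
    public String read(long readRevision) {
        Map.Entry<Long, String> entry = states.floorEntry(readRevision);
        return entry == null ? null : entry.getValue();
    }
}
```

With writes at revisions 2, 7 and 12, visibleRevision(9) yields 7, matching the example above. Nothing on the path to the root is touched by a write, which is why there is no contention on the root node. It is also why there is no single place to read a global journal from.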

>>> How does backup work? this is quite tricky because it is
>>> difficult to get a consistent snapshot of the distributed
>>> tree.
>> MVCC should make that easy: just make a backup of the head revision at
>> that time.
> hmm, I'm not sure that will scale. consider a large repository
> where traversing all nodes takes a long time.
> I think backup should be supported at a lower level to be
> efficient.

Hmm right, that makes sense.


> e.g. something like what is proposed in [0], section 4.9.
> regards
>   marcel
> [0] http://cs.ucla.edu/~kohler/class/08w-dsi/aguilera07sinfonia.pdf
