jackrabbit-oak-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Dürig <mdue...@apache.org>
Subject Re: Handling copies and moves with tree diffs
Date Tue, 27 Nov 2012 21:30:45 GMT

Hi Jukka,

Thanks for sharing this. In a nutshell, what you are proposing to do is

* to keep no more than one move/copy operation pending and purge changes 
to the Microkernel as soon as another such operation arrives,

* in the generalised version, to only keep "unconflicting" move/copy 
operations pending and purge as soon as a conflicting one arrives.

Unconflicting means, that given any two move operations none of the four 
involved paths overlap. Where two paths overlap if one is an ancestor of 
the other.

This algorithm - though less general than the one I proposed - seems 
like a good choice for the time being. It is much clearer how to 
implement it in the existing code base. If it turns out that we need a 
more rigorous move/copy handling, we have still the other approach as a 

However, both our approached share a common problem: they don't work 
across a RootImpl.rebase since after rebasing the tracking information, 
which was initially available through the node builders, is lost. 
Pushing the rebase logic down to the Microkernel would solve this 
immediate problem but would lose us the ConflictHandler, which is 
currently used to annotate conflicting commits and which allows users to 
edit these before retrying the commit.


On 27.11.12 14:43, Jukka Zitting wrote:
> Hi,
> On Tue, Nov 27, 2012 at 12:56 PM, Jukka Zitting <jukka.zitting@gmail.com> wrote:
>> I did some work along these lines already earlier with the
>> CopyAndMoveAwareJsopDiff class you can find inside KernelRootBuilder,
>> but couldn't make it work properly. Based on discussions with Michael
>> last week I think we should be able to come up with an algorithm that
>> works pretty well for this purpose. More on that in follow-ups.
> Here's one simple algorithm that should be able to correctly handle up
> to a single copy or move per commit. First the handling of changes in
> KernelNodeBuilder:
> A1) In KernelRootBuilder we keep track of the path of the *target* of
> a copy or move operation. Until such an operation occurs, that path
> remains null.
> A2) In KernelNodeBuilder.setNode() we check if the new child NodeState
> is based on the same MK revision as the parent but has a different
> path than the identified new child (and is not one of its
> descendants).
> A21) If not, we use normal tree diff to find out which basic
> add/set/remove operations to update the builder subtree to match that
> NodeState.
> A22) If yes, we update the target path in KernelRootBuilder to point
> to the new child node and use the given NodeState as the base of that
> builder subtree.
> If A22 is  reached when the target path in KernelRootBuilder is
> already set, all changes are automatically purged to a branch and the
> KernelRootBuilder reset to the new purged state before continuing the
> A22 step. Changes are also purged whenever a predefined purge limit is
> reached or commit is requested from a higher level. The JSOP diff for
> that purge commit is constructed as follows:
> B1) If the target path in KernelRootBuilder is set (and the
> copied/moved node at that path still exists), we first construct the
> relevant copy or move operation:
> B11) If the target bath already existed in the previous revision,
> we're dealing with a replacement. Emit a JSOP delete instruction for
> that path.
> B12) If the original path of that subtree still exists, we're dealing
> with a copy operation. Emit  the JSOP copy instruction.
> B13) If the original path no longer exists, we're dealing with a move
> operation. Emit the JSOP move operation and keep track of the original
> removed path.
> B2) Process the rest of the changes using the normal JsopDiff
> mechanism with the following modifications:
> B21) When encountering the target of the copy/move operation, instead
> of processing it normally as an added or modified subtree do a
> separate JsopDiff against the base state from the original source path
> to capture any post-copy/move changes to that subtree.
> B22) When encountering the removed path from B13, suppress the normal
> JSOP delete instruction that would have been generated by the standard
> JsopDiff class.
> It should be straightforward to generalize this algorithm to cover any
> number of *unconflicting* copies/moves in a single commit (conflicting
> changes would still need to be handled as a sequence of purged branch
> commits). Most notably this approach should be able to accurately map
> all JCR-level direct-to-workspace copy/move operations (including more
> complex things like checkin) to single MK commits.
> BR,
> Jukka Zitting

View raw message